Show HN: Quickly copy a file between computers (github.com/jwilberding)
110 points by diginux on Feb 16, 2013 | 92 comments



$ nc -l 8888 > file

$ cat file | nc host.example.com 8888

I don't think a dedicated utility is required.

Edit: Sorry about my comment coming off as a bit hostile, I did not intend it to be.


There are a million ways to copy files, and a lot of them are simpler to use than your example.

But that's not the point here. Sometimes you've just got to make something because you want to, because it fits a specific need that maybe not a lot of other people have, or because it's a learning experience, or just for the heck of it. So, no, it's not required, but that doesn't mean it's useless.

It's cool that the author did something productive that works for him/her, and it's even cooler they shared it with everyone.

I can say that's already infinitely more than what I made today. How about you?


This solution is worse than the shell hack, by a lot. From the source:

    char filename[MAXNAMELEN];
and then a few lines down...

    filename_size = 0;
    memcpy(&filename_size, buf, sizeof(int));
    memcpy(filename, &buf[sizeof(int)], filename_size);
Oops! Looks like both the client and the server have to be trusted, otherwise we've got at least one probably-exploitable vulnerability. And there's no mechanism for authentication, so it's really only safe to use on a locally secured network. (That took about 40 seconds to find, by the way - I would not be particularly surprised if there were more subtle issues lurking.)
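
For what it's worth, the missing check is cheap to add. A minimal sketch of the kind of validation needed (MAXNAMELEN and the length-prefixed layout come from the snippet above; the function name, the assumed constant value and the extra length checks are mine, not the project's):

    #include <string.h>

    #define MAXNAMELEN 256   /* assumed value; the real constant is in the project's headers */

    /* Sketch only: parse a length-prefixed filename out of a received buffer,
     * refusing lengths that would overflow the fixed-size destination. */
    static int parse_filename(const char *buf, size_t buflen, char filename[MAXNAMELEN])
    {
        int filename_size = 0;

        if (buflen < sizeof(int))
            return -1;                               /* not enough bytes for the length field */
        memcpy(&filename_size, buf, sizeof(int));
        if (filename_size < 0 || (size_t)filename_size >= MAXNAMELEN)
            return -1;                               /* would overflow filename[] */
        if ((size_t)filename_size > buflen - sizeof(int))
            return -1;                               /* claims more bytes than were actually received */

        memcpy(filename, buf + sizeof(int), (size_t)filename_size);
        filename[filename_size] = '\0';
        return 0;
    }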

I sympathize with the sentiment of "people should go out and try to create things themselves, even at the risk of failing" (or "especially" at the risk of failing), but from any objective standpoint, bcp isn't a good program. 400 lines of C to badly accomplish what 2 lines of shell can do? Someone else commented about it being very much in the unix spirit - no, I don't really think so. netcat + openssl would be in the unix spirit.

Not meant to be a criticism of the author - it's a cool project, if you don't care about certain "real-world" concerns (which isn't as unreasonable as it sounds).


The comments here remind me of the negative feedback Heather Arthur got when she dared to open-source some code: http://harthur.wordpress.com/2013/01/24/771/. The code is on Github. If you find a bug then why not fix it and send a pull request instead of the negative public criticism?


There's a difference between tossing some code on github and having random passersby poke fun at it, and submitting it yourself to HN.


It's a matter of opinion. For me, projects (including this one) are the kind of news I prefer on HN.


How do you do a pull request for "Use the pre-existing shell commands instead"?


Yeah, this is also on my list of todos.

I should have noted somewhere, this isn't ready for production by any means, just a first iteration of an idea I had. It is currently intended to be used on a trusted network.

Thanks for pointing this out though.


When the title says "Quickly copy a file..." it is reasonable to assume that we're talking about something quicker/easier than at least the most obvious ways to copy. But if we're talking about something you made just because you wanted to, it is more appropriate to present it as something like "Yet another way to copy a file..." - or, better yet, clearly state how your way is different from a million existing ways.


Don't forget also the "yet another HN karma or attaboy builder".

(Having spent years doing minor little things that nobody ever would care to hear about (back in the day) I can of course fully understand the positive aspects of the mental process. You learn something and perhaps people leave comments that make you feel good which spurs you on to do better things.)


Haha, that certainly is not the case. I made a tool, found it useful, a few of my other friends find it useful, so I posted it.

Some people voted it up, I can't control that. I really couldn't care less about karma; in fact, the critical comments (except yours) have been very useful and worth the post.

I am just being part of a community.


I made it because the nc solution still required knowing the hostname.

I now see that ncp (http://www.fefe.de/ncp/) would have probably sufficed, though it takes a slightly different approach.

Sorry if you feel misled.


If you read the `nc` manpage, you'd notice that the -n option on osx and debian disables host lookup and that hostname is defined as

     hostname can be a numerical IP address or a symbolic hostname (unless the
     -n option is given).  In general, a hostname must be specified, unless
     the -l option is given (in which case the local host is used).


Those hours of time spent writing the code could have been saved by spending minutes learning the standard way of doing it (and given the ubiquity of nc, its utility would stretch far beyond this use case).

I don't begrudge people reinventing the wheel, but I fully suspect this wouldn't have been written if the author knew about nc.


It wasn't many hours, and I know nc and have used it in exactly this way. However, nc requires you to know the hostname, an assumption I didn't want to make.

I appreciate the criticism though, it is always good to question the investment of time. In this case, I feel comfortable with it. Can't discount the joy of programming little things either.


If you read the `nc` manpage, you'd notice that the -n option on osx and debian disables host lookup and that hostname is defined as

     hostname can be a numerical IP address or a symbolic hostname (unless the
     -n option is given).  In general, a hostname must be specified, unless
     the -l option is given (in which case the local host is used).


You still need to know the IP. I use nc this exact same way, and it's definitely a hassle -- not a big one, but still -- to have to check ifconfig on one of the machines and then re-input the IP on the other. On internal networks where IPs are assigned by DHCP on connection and can and do change frequently, this essentially needs to be done every time you send a file.

I'm not saying that it's such a big hassle that we need complicated ways of avoiding it, but this is definitely a cool project that does away with a (minor) pain point, and I for one applaud the author for scratching his itches and sharing with the world, even if his hack isn't perfect.


nc does this with the broadcast addresses (read https://en.wikipedia.org/wiki/Broadcast_address for more info):

Recipient listens on 0.0.0.0:

    recipient$ nc -l 0.0.0.0 6969 > file
Sender broadcasts to broadcast address:

    sender$ <file | nc 192.168.0.255 6969
or

    sender$ <file | nc 255.255.255.255 6969


1. In general, I am on the computer I want to send from, then I go to the computer I want to receive on. This script seems to require I be at the receiving computer first. Is there a trick to do it my way?

2. A very minor point, but you are still required to know (or at least specify) the filename on the receiving end. My solution does not.


To solve both complaints, just tar -c on the sender and tar -x on the recipient:

    sender$ tar -c file | nc -l 12345
    recipient$ nc addr 12345 | tar -x
This will create a file on the recipient side with the specified name and properties (and you run the sender command first)


Doesn't seem to work for me:

    sender$ tar -c awesome.jpg | nc -l 0.0.0.0 12345

(waiting forever)

    recipient$ nc 255.255.255.255 12345 | tar -x

    tar: This does not look like a tar archive
    tar: Exiting with failure status due to previous errors


I'm not at a terminal now so can't verify this, but it looks like you swapped the send and receive addresses from what was suggested. Try sending on broadcast and receiving on 0.0.0.0.



Better to do

    $ <file nc host.example.com 8888
(performance benefits more apparent with larger files)


Care to explain why?


(you can find more info in the manpage for your shell)

    $ cat foo | bar
ends up creating two processes and a pipe. The foo file is first read by cat and then written onto the pipe (which bar then reads).

    $ <foo bar
is an input redirection: foo is opened for reading and bar's standard input fd is set to that open file (so the file's data is only read once)


For reading files and transferring over the network, the limit is always going to be the I/O devices, not an extra context switch and a few extra copies...


s/always/usually/. Depending on the disks involved and your network hardware (either a super-fast link, or even a system with a slow CPU and cheap network hardware that expects the drivers to do the heavy lifting in software), it actually could end up CPU-constrained.

But for the usual case where it is I/O-constrained, you can often get files across more quickly by throwing CPU at reducing the total amount of bandwidth required, e.g., by using something like bzip2 or pbzip2:

  pbzip2 < file | nc $host $port
And on the other side:

  nc -l $port | pbzip2 -d > file


I'd pipe it through gzip, but yes that's exactly what I was about to reply.

Edit: Just to be clear, that'd be:

    $ nc -l $port | gzip -d > $filename
    $ cat $filename | gzip | nc $host $port


This requires you to know the IP/domain of the host. With DHCP on my laptop, this is not always the case. Yes, I know how to look up my IP, but doing it every time becomes a pain.


Still need to get host.example.com from somewhere.

If you massage the discovery/rendezvous part in, it won't be nearly as simple as a single 'nc' command.


Bleh. I want to copy then paste, not create a pasting port then point a copy at it. And it's twice as much typing, too.


+1 I've been doing this for years for nefarious purposes :)


Avoid retyping the file name:

    $ nc -l 8888 | tar xf -
    $ tar cf - file | nc host.example.com 8888

Also, the cat spawns an unnecessary process:

    $ nc host.example.com 8888 < file


It's neat. I noticed a couple of subtle bugs.

1) The code assumes both the client and the server have the same endianness. This can be an issue since a uint32 is used for both the synchronization code (BCP_CODE) and the port. This can easily be fixed with htonl/ntohl conversions.

2) It assumes both the client and the server define int to be the same size (which may or may not be true depending on the compiler/processor/OS combination). If you use stdint definitions, this typically won't be an issue.
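
For both points, the usual fix is to put fixed-width fields on the wire in network byte order. A rough sketch (the helper names are mine, not from the project):

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>   /* htonl / ntohl */

    /* Sketch only: encode/decode a 32-bit field (e.g. BCP_CODE or the port)
     * so client and server agree regardless of endianness or sizeof(int). */
    static void put_u32(unsigned char *buf, uint32_t value)
    {
        uint32_t wire = htonl(value);                /* host order -> network order */
        memcpy(buf, &wire, sizeof(wire));
    }

    static uint32_t get_u32(const unsigned char *buf)
    {
        uint32_t wire;
        memcpy(&wire, buf, sizeof(wire));
        return ntohl(wire);                          /* network order -> host order */
    }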


Both great points. Will add these on my todo list. I appreciate you taking the time to look at the code.


And another, noted here: http://news.ycombinator.com/item?id=5232416

(Again, I don't mean to bash you or this project, just to say that it shouldn't be used in the "real-world" without being looked at rather more carefully. On a locally-secure network, or in an environment where security is unimportant, it's pretty cool.)


Completely agreed. I assumed the project would give the impression that it isn't a fully robust tool yet, but it is always worth making that completely clear.


Udpcast:

http://www.udpcast.linux.lu/

Can do this and other related useful things, including multicast of a file to N machines in parallel. Don't let the UDP bother you: it implements retry/checksum/etc on top of datagrams.

It's been around a few years, and is probably already in your distro.

It can pipe as well as copy files; I wrote about it some time ago:

http://kylecordes.com/2008/multicast-your-db-backups-with-ud...


This is a great tool and a nice article. The only slight additional requirement for this tool is that you know the name of the file being sent. I actually was thinking of making this optional (for the same reason they require it: to allow many different files).

I've added a link for udpcast to my readme as an alternative. Thanks!


There's also UFTP, which doesn't require the receiver to know the file name. (But it is different from your implementation in a number of ways, most importantly in that it transfers all data via UDP.)

http://www.tcnj.edu/~bush/uftp.html


And it's multicast, which saves a lot of bandwidth and makes it faster :-)


Nice. Very Unix-ish in its spirit.

The client-server arrangement is backwards though :) The typical arrangement is for the clients to do the "anyone there?" broadcast, for the servers to reply, and then the client would select the server, connect to it and they would go about their business. In your case, the server connects to the client. If you re-arrange this to the natural client-server order, you should be able to get rid of the fork() call, and this will help with portability (not that you probably care at this point).

Also,

  size = fread(&buf, 1, 100, ft)
No harm in using chunks larger than 100, especially when dispensing larger files.
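
Something along these lines (a sketch, not the project's actual loop; MAXBUFLEN is assumed to be whatever buffer size the project already defines):

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    #define MAXBUFLEN 65536   /* assumed value */

    /* Sketch only: read the file in MAXBUFLEN-sized chunks and push each
     * chunk onto the socket, handling short writes along the way. */
    static int send_file(FILE *ft, int sock)
    {
        char buf[MAXBUFLEN];
        size_t n;

        while ((n = fread(buf, 1, sizeof(buf), ft)) > 0) {
            size_t off = 0;
            while (off < n) {
                ssize_t sent = send(sock, buf + off, n - off, 0);
                if (sent < 0)
                    return -1;
                off += (size_t)sent;
            }
        }
        return ferror(ft) ? -1 : 0;
    }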

Also, consider switching to multicast for discovery.


I agree with you on it being backwards, and for some reason I chose to do it this way, though I cannot now remember why... it may have just been a flawed thought, as I can see no reason not to do it your way at the moment.

Good find on the fread; that is actually a "bug": it should be MAXBUFLEN instead of 100.

Agreed on multicast, I will add that to the todo.

Thanks for your feedback!


I like how clever this is, but is there an option (or a planned option) to name a bcp transfer, so that I can send files on a big network with lots of people using this command concurrently?


Yes, I plan to do this as well as have the ability for the listener to stay active indefinitely instead of for just one transmission.


Rad, especially the persistent mode.


It reminds me of ncp (http://www.fefe.de/ncp/). It's always interesting when people come up with the same thing independently.


I couldn't seem to get this compiled on OSX :( Do you (or anyone) know if there's an OSX version available? What I like about ncp is that it first requires an action at the sender (push) and then at the receiver afterwards (poll).


Ah yes, I take a slightly different approach: I don't do polling, I only have a listener and a single request sent.

Though, this still looks more robust, thanks for the link.


Try and add support for detecting piped input on the server side:

  bcp < filename
because this would allow for things like

  cat filename | gzip | bcp
with the receiving end doing

  bcp | gunzip > filename
Keep it as simple as it is now, but make it play nicely with other tools.


Your examples are all expected to sort out whether to accept an input stream or start an output stream, but without an explicit command, which poses a problem -- the app would have to determine that the input stream is empty before switching modes to streaming output. That's not so easy -- suppose the operator unintentionally pipes an empty input file? Will the app know what to do?

Most (I won't say all) command-line apps must be explicitly told which mode to use. There's a good reason.


isatty() to the rescue, but that's not to say that there shouldn't be a -- option or defaulting to a different mode depending on argv[0], like gzip does.

[0] http://pubs.opengroup.org/onlinepubs/007904875/functions/isa...
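
A toy sketch of what that detection could look like (the behavior here is made up for illustration; a real tool would still want explicit flags as an override):

    #include <stdio.h>
    #include <unistd.h>

    /* Sketch only: guess send vs. receive mode from whether stdin is a pipe. */
    int main(void)
    {
        if (!isatty(STDIN_FILENO))
            fprintf(stderr, "stdin is piped: act as the sender and stream stdin out\n");
        else
            fprintf(stderr, "stdin is a terminal: act as the receiver and write to stdout\n");
        return 0;
    }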


This would be a great addition, I will add it as an issue, thanks!


This reminds me of how you used to be able to disable the cipher on SSH and now you can't. I hate that.

http://serverfault.com/questions/116875/how-can-i-disable-en...


You can use High Performance patches for ssh: http://www.psc.edu/index.php/hpn-ssh

Cipher - none.


Um... why? rsh still builds and works fine if that's what you want.


ssh with no cipher still requires authentication, and can still use pubkey auth.

So "ssh with no cipher" means "people can snoop, but I can use proper auth". But rsh means "plaintext pw - or worse things if allowed".


Hehe, the first thing I wrote when I learned programming (around 1990) was a tool called ipxcopy, which did the same thing over the IPX protocol that we always used at our LAN parties.


I use (and highly recommend) IP Messenger for this style of auto-discovery of other systems on your LAN and message/file transfer:

http://ipmsg.org/index.html.en (For Windows)

http://ishwt.net/en/software/ipmsg/ (For Mac)


Definitely an interesting take on sharing data over a LAN, but I would be worried about the repercussions of using broadcasts to move a large quantity of data on a larger network. Cool for smaller/personal networks though.


Looking at the code, UDP/broadcast is used only for negotiation. After negotiation, a TCP socket is used for the actual transfer.
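
For anyone curious, the negotiation step has roughly this shape (a sketch of a generic broadcast probe, not bcp's actual code; the port and payload here are made up):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Sketch only: send a one-shot "anyone there?" datagram to the broadcast
     * address; the peer that answers is then used for a normal TCP transfer. */
    static int broadcast_probe(void)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        int yes = 1;
        struct sockaddr_in dst;
        const char msg[] = "bcp-discovery";          /* made-up payload */

        if (sock < 0)
            return -1;
        /* broadcasting must be enabled explicitly on the socket */
        setsockopt(sock, SOL_SOCKET, SO_BROADCAST, &yes, sizeof(yes));

        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(8888);                  /* assumed port */
        dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);

        sendto(sock, msg, sizeof(msg), 0, (struct sockaddr *)&dst, sizeof(dst));
        close(sock);
        return 0;
    }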


One of the ideas I've had since I was a kid was a graphical way to copy files between computers.

Something like: you just drag to the extreme left or right border, and suddenly the file is being transferred to the other computer.


If that doesn't already exist for synergy (http://synergy-foss.org/), that would be an awesome addition.


I think that was a planned feature for synergy. I don't think it was ever implemented.


This has some usability issues, e.g. you drag a file accidentally to another box, then change your mind and drag it back while it is still being copied or moved in the original direction. This sort of thing. It looks like a simple and natural extension to what Synergy has, but in reality it's just a can of worms.


I wonder if there is a way they could utilize Dropbox or Dropbox-like functionality?


The example in the README could be better if the command prompt showed that you were on different hosts. As it stands, it seems that you sent it from host 'heisenberg' to host 'heisenberg'.


Different ports? Might as well be different hosts.


Actually, I didn't notice the port. I did notice that the IP and hostname were the same though.


Fair enough, I will change it :)



Great find! I added this link, as well as ncp to the README.


python -m SimpleHTTPServer

is another nice way to do this.


I still have to know the IP of my machine, and make a script wrapping python and wget or curl to achieve the bcp functionality.

I actually use SimpleHTTPServer a lot for other things though, it is a great tip.


Zmodem (rz/sz) works if your terminal supports it. Though putty doesn't support it (wish it did.)


Title should be "between computers on a lan".


scp?)


$ scp file host.example.com:


SCP is significantly slower than FTP (and performance was an explicit goal of the project)


This method isn't as fast since it uses encryption.


Yeah, a few milliseconds are going to make a difference here.


For big files it would actually make a very noticeable difference; even just trying scp with different encryption algorithms will show you how much it matters.

With that said, by 'quickly' I was focusing more on ease and simplicity.


There's a constant-size overhead, and file copying is network limited, not CPU limited (isn't it?). So why would large files be significantly slower?


I am not sure it is fair to say scp is CPU limited per se, but it is slow.

A quick article that may help you out: http://intermediatesql.com/linux/scrap-the-scp-how-to-copy-d...


If you read the article, it's because the faster alternatives use multiple streams and/or compression; it's got nothing to do with encryption. And you can get scp to compress your data first (-C).


Compression also slows it down: http://www.spikelab.org/transfer-largedata-scp-tarssh-tarnc-...

This has been my experience in practice, too.


This assumes ssh is installed and running.


`bcp` assumes `make` and `gcc` along with a handful of devel headers are installed.


Your intended use is a computer that doesn't have SSH running?


I never run an ssh server on my laptops, no reason to really. Secondly, ssh requires you to know the IP or hostname, both of which I don't want to worry about.


Ah! I had not thought of copying between client computers.

Now it makes sense. And I can see its use, at home.

Thanks.



