
You can also do well relying on the OS scheduler and networking stack by forking off many rsync processes using GNU parallel or xargs.



Which is how I've done it in the past - I'm sure these days there are utilities that will do it for you, but I had a bunch of Perl code that would fork off N workers from a queue and, as each one exited successfully, kick off another.

The issue with xargs back in the day was that you might need to run several hundred rsync processes, and suddenly launching 500+ processes in parallel made your server very, very sad. So you needed some basic job queueing system.
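For anyone who wants the shape of that kind of queue without the Perl, here is a minimal sketch in Python. It is not the original code, just one way to cap concurrency: `run_bounded` and `max_workers` are names made up for illustration, and the demo jobs are harmless stand-ins where the rsync command lines would go.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_bounded(commands, max_workers=4):
    """Run each command (an argv list) with at most max_workers at once.

    Returns the exit codes in the same order as the input commands.
    """
    def run_one(argv):
        # Each worker thread blocks on one child process; when it exits,
        # the pool hands that thread the next queued command.
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Hypothetical stand-in jobs; in practice each argv would be an
    # rsync invocation for one directory or file list.
    jobs = [[sys.executable, "-c", f"print('job {i}')"] for i in range(8)]
    print(run_bounded(jobs, max_workers=3))
```

Unlike the setup described above, this sketch starts the next job whether or not the previous one succeeded; the exit codes come back so the caller can decide what to retry.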


$ man xargs

-n max-args, --max-args=max-args
    Use at most max-args arguments per command line. Fewer than max-args arguments will be used if the size (see the -s option) is exceeded, unless the -x option is given, in which case xargs will exit.

...

-P max-procs, --max-procs=max-procs
    Run up to max-procs processes at a time; the default is 1. If max-procs is 0, xargs will run as many processes as possible at a time. Use the -n option with -P; otherwise chances are that only one exec will be done.
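The last sentence of that excerpt is the part that trips people up, so here is a rough Python model of the batching, as a sketch only. `batch_args` is a hypothetical helper, and it ignores the -s size limit entirely: without -n, all the arguments end up on one command line, so there is only one process for -P to run.

```python
from itertools import islice


def batch_args(args, max_args=None):
    """Group args into command lines of at most max_args items.

    Approximates what -n does. With max_args=None (no -n) everything
    lands in a single batch; the -s size limit is not modelled.
    """
    args = list(args)
    if max_args is None:
        return [args] if args else []

    it = iter(args)
    batches = []
    while True:
        batch = list(islice(it, max_args))
        if not batch:
            break
        batches.append(batch)
    return batches


if __name__ == "__main__":
    dirs = ["a", "b", "c", "d", "e"]
    print(len(batch_args(dirs)))     # number of batches without -n
    print(len(batch_args(dirs, 1)))  # number of batches with -n 1
```

Each batch corresponds to one exec, so the number of batches is the ceiling on how many processes -P can ever have running at once.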


This was also 14-ish years ago, sometimes on Solaris 2.6 or 8 boxes. I am pretty sure xargs didn't have that flag back then (which is why I said "back in the day").



