If you're already using Git for version control, why not also deploy from it? I personally use it to simultaneously push to GitHub and production and it works very well:
I deploy everything with git. With branches and tags, that also lets me deploy different states of a given project to different servers (development, staging, production). Rolling back or switching between tags is easy (migrations aside).
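For anyone who wants to see the tag idea in action, here is a minimal local sketch. It uses `git archive` to export a tag, which is just one option and not necessarily what the commenter does. The repository, the tag names (v1.0, v1.1) and the "staging" and "production" directories are all made up for the demo and live under a temp directory.

```shell
#!/bin/sh
# Sketch: export different tags of one repository into different target directories.
# Everything lives under a throwaway temp directory; names are hypothetical.
set -eu
work=$(mktemp -d)
repo="$work/project"
mkdir -p "$repo" "$work/staging" "$work/production"

if command -v git >/dev/null 2>&1; then
    git init -q "$repo"
    # Identity is passed inline so the sketch does not depend on global git config.
    commit() { git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid commit -q -m "$1"; }

    echo "release 1" > "$repo/index.html"
    git -C "$repo" add index.html
    commit "first release"
    git -C "$repo" tag v1.0

    echo "release 2" > "$repo/index.html"
    git -C "$repo" add index.html
    commit "second release"
    git -C "$repo" tag v1.1

    # export_tag TAG DIR: unpack the tree at TAG into DIR, without a .git directory.
    export_tag() { git -C "$repo" archive "$1" | tar -xf - -C "$2"; }

    export_tag v1.1 "$work/staging"      # newer state goes to staging
    export_tag v1.0 "$work/production"   # production stays on the older tag
else
    echo "git not installed; skipping the demo" >&2
fi
```

Rolling back is then a matter of exporting an older tag again. Note that unpacking over an existing directory does not remove files that a later tag deleted, so a real setup would export into a clean directory.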
I usually have a few files that I don't want on the server, or vice versa. The option --exclude-from=ignore-files is useful to maintain a list of files not to be synced.
The file ignore-files would contain the keywords or files to ignore, one per line.
-a is already recursive, so -r is unnecessary. For most sites, it would also be wise to include --delete so renamed or locally deleted files would be removed on the destination.
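A small local sketch of the two points above (an exclude list plus --delete), assuming rsync is installed. The file names (local.conf, stale.html) and the temp directories are invented for the demo; only the ignore-files name comes from the comment.

```shell
#!/bin/sh
# Sketch: sync a local tree with an exclude list and --delete.
# All paths are throwaway directories created with mktemp.
set -eu
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
echo "page"   > "$work/src/index.html"
echo "secret" > "$work/src/local.conf"    # should never reach the destination
echo "old"    > "$work/dest/stale.html"   # no longer exists in src

# One pattern per line.
printf '%s\n' 'local.conf' '*.bak' > "$work/ignore-files"

if command -v rsync >/dev/null 2>&1; then
    # -a: archive mode (already recursive); --delete: remove files missing from src.
    # The trailing slash on src/ copies its contents rather than the directory itself.
    rsync -a --delete --exclude-from="$work/ignore-files" "$work/src/" "$work/dest/"
    ls "$work/dest"
else
    echo "rsync not installed; skipping the demo" >&2
fi
```

One thing to keep in mind: with plain --delete, files on the destination that match an exclude pattern are left alone. They are only removed if you also pass --delete-excluded.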
* "-C" to exclude all common revision-control working directory files. Lowest filter priority, consider the rest of your args.
* Look up backup rsync guides and add snapshot or "--backup-dir" support to your deploys for a cheap alternative to system snapshots and other deployment reversion techniques.
* "-E" and "-X" for when you just want to apply execute settings and not blow away whatever custom permissions you may have on your deployed files ("--chmod" is also helpful here)
* "--delete-after" and "--delete-excluded" to clean up files you've deleted from your source files.
* "--timeout=120 --contimeout=60" Because nobody needs to wait 5 minutes to find out their deploy failed.
* "--compress-level=9"
* If you have a huge deploy tree, it may be faster (though less reliable) to run
This will make a list of only the past 7 days' worth of modified files to attempt to rsync, which may cut down on the total time to deploy considerably.
* "--log-file=/big/partition/deploy.log" To get a better idea of how the deploy is going or how it went.
* Consider if you need --delay-updates to make the deploy more atomic. If files get deployed a few at a time, will this cause the user experience to suffer? Load as well as long file lists can cause long periods between updates.
* If you have more than one developer that can deploy to the site at the same time, consider that rsync should only be your file-transfer tool. You need a whole other layer to account for transaction locking, who is deploying or reverting files, and basic logging is very handy to track down problems. But that's a bit out of scope for this link :)
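To make a few of the flags above concrete, here is a hedged local sketch that combines --delete-after, --delay-updates, --backup-dir, --timeout and --log-file. It assumes rsync is installed and syncs between two temp directories rather than to a real server; the labels "first" and "second" and all file names are made up. --contimeout is left out because it only applies when connecting to an rsync daemon.

```shell
#!/bin/sh
# Sketch: a deploy that keeps replaced and deleted files in a per-deploy backup directory.
set -eu
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest" "$work/backups"
echo "version one"   > "$work/src/index.html"
echo "to be removed" > "$work/src/old.html"

# deploy_once LABEL: sync src/ to dest/, parking whatever gets replaced or
# deleted on the destination under backups/LABEL.
deploy_once() {
    rsync -a \
        --delete-after --delay-updates \
        --backup --backup-dir="$work/backups/$1" \
        --timeout=120 \
        --log-file="$work/deploy.log" \
        "$work/src/" "$work/dest/"
}

if command -v rsync >/dev/null 2>&1; then
    deploy_once first
    echo "version two, longer" > "$work/src/index.html"
    rm "$work/src/old.html"
    deploy_once second
    ls "$work/backups/second"    # files the second deploy replaced or deleted
else
    echo "rsync not installed; skipping the demo" >&2
fi
```

Reverting the second deploy would then mean copying the contents of backups/second back over the destination. That is a cheap safety net, not a full rollback mechanism.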
I cringe every time I hear something like this called a "deployment". This is a lazy hack, not a deployment.
You can't deploy with ftp/rsync/put_your_own_tool_here sync.
Well, you can, kind of, but you better not.
A "proper" deployment must be (at least):
* Completely automated. Should be just a simple command line.
* Atomic. As in all the files are changed at once.
* Easily revertible.
If you are deploying a db-backed application, the deployment process must also manage db versioning (including the possibility of rollbacks).
All of the above is trivially provided by Vlad or Capistrano. And since they are super easy to use even for non-Ruby projects, I fail to see the reason to keep using ftp or rsync other than laziness.
I've seen some people using git for deployment, which is better than ftp/rsync, but it still lacks atomicity, and rollbacks can be quite tricky.
We just do subversion checkouts. Of course, we have the proper .htaccess rules set up to prevent access to the .svn folders. Subversion also allows us to commit database exports straight back into the version control system from the server.
We're talking about deployment to production machines - what's the difference between 3 seconds and 5 minutes? Your machine is sitting on a fat pipe in a datacenter, where you can push a 300MB deploy (which is a LOT) trivially. I highly doubt you're rolling out code changes every 10 minutes... not to production anyway.
Sending diffs removes your ability to roll back at all, or if you're somewhat devious about it, it still wouldn't allow you to roll back more than a single version.
I'm surprised you're so concerned about time-to-deploy for an action that happens, what, once a day at most? Twice? The amount of safeguard you gain for the trivial non-human-time (it's not as if some guy is sitting there copying files) increase is pretty massive.
As other people mentioned, if you send the diff by, say, pulling from your git repository, then you can roll back to any tag or revision.
What happens when your 300MB deploy needs to be pushed from your development machine out to 15 different production servers? I wouldn't want to send ~4.3GB for every deployment. Furthermore, a quick deployment time is valuable when a crucial fix needs to be pushed to production.
This is not to mention the space that your 300MB deploy is going to take up on the hard drives of your production machines if you deploy twice a day. That's over 213GB a year per machine, sounds like you would need to start deleting old revisions. Or start just sending the diffs.
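For what it's worth, the figures in this comment can be checked with a few lines of shell. This sketch assumes sizes in MB and 1 GB = 1024 MB, which appears to be what the numbers above are based on.

```shell
#!/bin/sh
# Back-of-the-envelope check of the deploy sizes discussed above.
LC_ALL=C
export LC_ALL
deploy_mb=300
servers=15
deploys_per_day=2

per_rollout_mb=$((deploy_mb * servers))
per_year_mb=$((deploy_mb * deploys_per_day * 365))

# awk handles the fractional division that POSIX shell arithmetic lacks.
per_rollout_gb=$(awk -v mb="$per_rollout_mb" 'BEGIN { printf "%.2f", mb / 1024 }')
per_year_gb=$(awk -v mb="$per_year_mb" 'BEGIN { printf "%.2f", mb / 1024 }')

echo "one rollout to $servers servers: $per_rollout_gb GB"
echo "one year on a single machine: $per_year_gb GB"
```

That gives 4.39 GB per rollout and 213.87 GB per machine per year if nothing is ever pruned.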
We've been putting our code in rpms and it works well. Every box knows exactly what's deployed right now (along with dependencies and whether anything is tweaked), local edits to config files can be preserved, and "ssh yum install" gets a box from clean to production ready in under a minute.
It is a handy tool for dealing with versioning and rollback. But I'd hate to be the guy who deploys a hot fix and suddenly rpm is out of locker entries.
Renaming local edits out of the way or preserving them is up to you, depending on whether you put %config or %config(noreplace) in the spec. It does need to be handled carefully, but it can be handy if you want one machine giving special treatment to a representative sample of your requests or something.
I use this little bash script (as a daily cron job) to selectively mirror directories on two hard drives in case I lose one (very poor man's backup solution). I imagine it could be used for deploying code (modified for remote machine of course):
#!/bin/bash
RSYNC="/usr/bin/rsync" # Verify with 'which rsync'
DIRS="/home/xich /etc" # Directories to be backed up.
TARGET="/mnt/secondhd/backup" # Directory into which all backups are placed.
# For instance, if TARGET is /mnt/secondhd/backup and DIRS is "/home/user1 /home/user2"
# then after running, there will exist a /mnt/secondhd/backup/user1 and
# /mnt/secondhd/backup/user2. DO NOT use trailing slashes for any of these paths,
# as that will change the behavior of rsync.
LOGFILE="/var/log/mirror_hds.log" # rsync output, including errors.
: > "$LOGFILE" # Start each run with an empty log.
for dir in $DIRS; do
    INEX=""
    if [[ -e "$dir/.mirror_include" ]]; then
        INEX="--include-from $dir/.mirror_include"
    fi
    if [[ -e "$dir/.mirror_exclude" ]]; then
        INEX="$INEX --exclude-from $dir/.mirror_exclude"
    fi
    # Append, so output for earlier directories is not overwritten by later ones.
    $RSYNC -av --delete $INEX "$dir" "$TARGET" >> "$LOGFILE" 2>&1
done
The .mirror_include and .mirror_exclude files are just newline-delimited lists of file masks (they do what you would expect). I did it like this so each user can modify his own exclusion/inclusion lists (the file belongs to the user), and doesn't have to mess with the cron script (which belongs to root). Inclusion takes precedence. As an example, my exclude file:
.* # any file that starts with a period
tv # my tv shows directory
movies
And my include file:
.mirror_include # since this would be filtered out above
.mirror_exclude
.conkyrc
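Here is a small local sketch of why inclusion wins in the script above: rsync applies the first filter rule that matches, and the script passes the include file before the exclude file. It assumes rsync is installed; the user1 directory and the files inside it are invented for the demo.

```shell
#!/bin/sh
# Sketch: include rules listed before exclude rules take precedence.
set -eu
work=$(mktemp -d)
home="$work/home/user1"
mkdir -p "$home/movies" "$work/backup"
echo "conky" > "$home/.conkyrc"
echo "shell" > "$home/.bashrc"
echo "notes" > "$home/notes.txt"
echo "film"  > "$home/movies/film.avi"

printf '%s\n' '.mirror_include' '.mirror_exclude' '.conkyrc' > "$home/.mirror_include"
printf '%s\n' '.*' 'movies' > "$home/.mirror_exclude"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source, so the result lands in backup/user1.
    rsync -a --delete \
        --include-from="$home/.mirror_include" \
        --exclude-from="$home/.mirror_exclude" \
        "$home" "$work/backup"
    ls -A "$work/backup/user1"
else
    echo "rsync not installed; skipping the demo" >&2
fi
```

The backup ends up with .conkyrc, the two .mirror_* files and notes.txt, but not .bashrc or movies. Also note that rsync only treats a line as a comment if it starts with # or ; so the trailing annotations in the listings above are for the reader and should not go into the real files.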
A good point, but I believe that most recent versions (since around 2004, I think) of rsync now use ssh by default, so the -e flag is not needed to use ssh.
I do this too, and it's very convenient. Especially if you have multiple websites.
I actually put the rsync command in a Makefile so I can update by typing `make`.
I have a folder in my ~ directory which contains a folder for each of the remote machines I work with. I can update any remote machine by going into its folder and typing `make`.
We recently changed our deployment model from an all-Capistrano model to a Capistrano plus rsync model. Our old model would take 20 to 45 minutes to update every server in our farm. Our new model updates one staging / deployment server via Capistrano, then rsyncs to each of our production nodes. This process normally takes less than a minute and spikes to a few minutes for really large changes. We still have the ability to roll back code, and this model is actually quicker at fixing bad deployments than our old model was.
Before, the majority of our time was spent updating our codebase and then checking external dependencies on each server. Rsync made this process much quicker, since these checks are now completed once and the result is pushed to each server.
rsync works really well. It's what we use at our company. One trick we have considered implementing is pointing the HTML root to a symlink that points to the current release. When you rsync, create a new directory named after the version. After you have verified the transfer, flip over the symlink to point to the new code base. Doing it like this will make quick rollbacks easy and protect you from interrupted transfers or users hitting your site mid-upload (it has happened... we spotted an anomaly in the error logs and our "wtf"s per minute shot through the roof).
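A rough sketch of the symlink flip described above, run entirely in temp directories. It assumes GNU coreutils (for `mv -T`); the releases/ and current names and the v1/v2 versions are hypothetical, and a plain `cp` stands in for the rsync to the server.

```shell
#!/bin/sh
# Sketch: copy each version into its own directory, then repoint a "current" symlink.
set -eu
root=$(mktemp -d)    # stand-in for the server's deploy root
src=$(mktemp -d)     # stand-in for the freshly built code
mkdir -p "$root/releases"

# release VERSION: copy src into releases/VERSION, check it, then flip the symlink.
release() {
    mkdir -p "$root/releases/$1"
    cp -R "$src/." "$root/releases/$1/"        # an rsync to the server in real use
    [ -f "$root/releases/$1/index.html" ]      # placeholder for "verify the transfer"
    ln -s "releases/$1" "$root/current.tmp"
    mv -T "$root/current.tmp" "$root/current"  # a rename, so the link is swapped in one step
}

echo "v1" > "$src/index.html"
release v1
echo "v2" > "$src/index.html"
release v2
cat "$root/current/index.html"
```

The web server's document root would point at the current symlink, so visitors only ever see a complete release directory, and the old one stays around for rollback.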
I love rsync, use it for deployment all the time. However, it is quite a pain to set it up for Windows. All the Windows rsync ports I know of run on top of Cygwin and Cygwin changes the permissions of the files to the Windows equivalent of 000 all the time. It is possible to get around this but not entirely straightforward. I'm seriously considering writing a step-by-step quick guide for setting up an rsync daemon on Windows as it is not trivial at all and it might save some time for other people.
#!/bin/bash
# run in the project root
git add -A # track added files
git commit # local commit
git push # push to remote bare repository
ssh REMOTE_HOST 'cd PROJECT_FOLDER; git pull' # expand a working folder from the remote bare repository; usually public www folder is contained there
Extra benefit: two complete copies of the history, one local and one remote, for backup.
I have a public www folder too if I only want to open up part of the whole repo. But since there is usually no sensitive data in a public-facing repo anyway, I don't care if people can access .git or not. They may clone it if they want! :D
I use this for simple, single server, small codebase sites.
But for various reasons (mainly rollback, history, and consistency: rsync takes time, during which your site has files from different versions), I much prefer the "schlep the entire codebase into a new directory and then switch symlinks when you're ready to go live" approach, which is what Fabric, Capistrano, or your custom deploy scripts do.
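With that layout, rollback can be as simple as pointing the symlink at the previous directory. A minimal sketch, assuming GNU coreutils for `mv -T` and release directories whose names sort chronologically; the date-style names and the releases/ and current layout are made up for the demo.

```shell
#!/bin/sh
# Sketch: roll back by repointing "current" at the release just before the live one.
set -eu
root=$(mktemp -d)    # stand-in for the server's deploy root
mkdir -p "$root/releases/20240101" "$root/releases/20240102" "$root/releases/20240103"
ln -s "releases/20240103" "$root/current"

# point_current NAME: repoint "current" at releases/NAME via a rename.
point_current() {
    ln -s "releases/$1" "$root/current.tmp"
    mv -T "$root/current.tmp" "$root/current"
}

# previous_release: print the release that sorts immediately before the live one.
previous_release() {
    live=$(basename "$(readlink "$root/current")")
    ls "$root/releases" | sort | awk -v live="$live" '$0 == live { print prev; exit } { prev = $0 }'
}

prev=$(previous_release)
if [ -z "$prev" ]; then
    echo "no earlier release to roll back to" >&2
    exit 1
fi
point_current "$prev"
echo "now serving $(readlink "$root/current")"
```

This only covers the files. Database migrations still need their own rollback story.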
"schlep" is the technical term for scp/rsync/checkout/etc.
Springloops (http://www.springloops.com) hosts my Subversion repositories and can also be set up to deploy each repository to a set of servers, either manually or automatically upon each commit.
We use rsync to deploy all our front-end code/files.
It is pretty robust, but syncing multiple files is not an atomic operation across the batch as a whole. If one file fails, you do get an error report, but the other files will already have been synced. Once you fix whatever the problem is and re-run your script, that one file will then be synced.
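Since a re-run only transfers what is still missing, it is fairly safe to wrap the sync in a retry loop keyed on the exit status (rsync exits non-zero on a partial transfer, for example 23). The sketch below is a generic wrapper; the `retry` and `flaky` names are made up, and `flaky` is a stand-in command that fails twice so the demo runs without a server.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND...: rerun COMMAND until it exits 0 or attempts run out.
retry() {
    max=$1
    shift
    attempt=1
    status=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        else
            status=$?
        fi
        echo "attempt $attempt failed with exit status $status" >&2
        attempt=$((attempt + 1))
    done
    return "$status"
}

# Stand-in for a sync that fails twice and then succeeds.
calls=0
flaky() {
    calls=$((calls + 1))
    [ "$calls" -ge 3 ]
}

retry 5 flaky
echo "succeeded after $calls calls"
```

In a real deploy script the call would look something like `retry 3 rsync -a src/ REMOTE_HOST:dest/`, with your own host and paths filled in.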
In practice, rsync is really great for deploying.
It is also neat for backups - in fact, before I got my Mac and started using Time Machine, I used rsync for backups (and in fact used rsync.net for offsite backups).
The man page and docs in the source detail how it handles these different circumstances. The short answer is "very robust" and there are options to customize how it behaves during partial or total success/failure.
http://stackoverflow.com/questions/279169/deploy-php-using-g...