
We run hundreds of GlusterFS clusters in production on EC2. We're currently on 3.0 and in the process of fully migrating to 3.4 (and maybe 3.5 one day).

Our primary use case for Gluster is in serving persistent filesystems for Drupal. Our customers store potentially millions of files on their GlusterFS clusters.

We've built a number of tools/processes to help protect Gluster against failures in EC2 (for instance, fencing network traffic at the iptables layer so that GlusterFS clients don't hang while talking to down nodes), as well as to help our team perform common tasks (resizing clusters, moving customers from cluster to cluster, etc.). We haven't necessarily hit blocker issues recovering from underlying hardware failures, but our team is definitely very experienced with many possible failure modes.
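To make the fencing idea concrete, here is a minimal sketch of what such a rule could look like. It is an illustration only, not the actual tooling described above: the function name `fence_rule`, the choice of the OUTPUT chain, and the example address are all assumptions. It rejects outbound TCP to a down node with a TCP reset so that a client fails fast instead of waiting on a timeout.

```python
import ipaddress
from typing import List


def fence_rule(node_ip: str, insert: bool = True) -> List[str]:
    """Build (but do not run) an iptables command that fences a down node.

    Hypothetical helper: rejects outbound TCP to node_ip with a TCP reset,
    so connections fail immediately rather than hanging.
    """
    # Validate the address; raises ValueError on malformed input.
    ipaddress.ip_address(node_ip)
    # -I inserts the rule at the top of the chain, -D deletes it again.
    action = "-I" if insert else "-D"
    return [
        "iptables", action, "OUTPUT",
        "-d", node_ip,
        "-p", "tcp",
        "-j", "REJECT", "--reject-with", "tcp-reset",
    ]


if __name__ == "__main__":
    # 10.0.0.12 is a made-up address for a node considered down.
    print(" ".join(fence_rule("10.0.0.12")))
    print(" ".join(fence_rule("10.0.0.12", insert=False)))
```

The function only builds the argument list; actually applying it would need root and something like `subprocess.run(cmd, check=True)`, plus a matching delete once the node is healthy again.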

Overall GlusterFS has been very reliable over the years and our research has shown it is the best option out there for when our customers can't use something such as S3 directly.

If you want more details or would love to hack on an 8000+ node EC2 cluster running things such as GlusterFS, feel free to ping me.



At 8k+ nodes, bare metal would likely be much cheaper. Why would you possibly want to host this on EC2?!

Disclaimer: I've been at companies with thousands of servers for my past 3 jobs. I've never once used any "cloud" service other than Linode for my personal VPS and Ganeti for the Oregon open source lab VM donated to the GNOME Foundation (I'm a sysadmin alum for gnome.org).


Just out of interest, why don't you use S3 for this? Amazon provides a few options for scalable storage. Is it cheaper to roll your own on top of EC2?


Many of our customers do use S3 and we make use of S3 extensively ourselves. However, Drupal often expects to operate on a POSIX-compatible filesystem. Drupal 7 does support PHP file streams, which can be configured to use S3, but not every Drupal module follows best practices. Plus, we support every flavor of Drupal under the sun (including custom code).

All of our enterprise customers receive a highly available setup running on multiple nodes--thus, we have the need for a persistent filesystem attached to multiple EC2 instances. We utilize GlusterFS to ensure all of our clients have the filesystem capabilities their apps may need.


Basically, they need EBS without the single-instance (writable) mount limit, if they're using it to sync Drupal installations (PHP files, temp folders, etc.) across multiple web servers. S3 isn't really intended for that kind of workload. It would be possible to use something like s3fs, but I don't think that will give you a POSIX-compatible filesystem, which is required for Drupal, and it's probably too slow as well (lots of small files, etc.).



