Here's my verdict in MongoDB after using it for a few years in a very high traffic application:
I personally love most concepts behind Mongo. Schemaless design, simple replication and sharding, a JavaScript console, just to name a few. MongoDB is pretty much my dream database.
However, the ugly truth is that it doesn't work as advertised (this was 2-3 years ago). Replication would often crash or stall, leaving us with only one option: DELETE OUR ENTIRE DATA SET and start fresh. 10gen acknowledged the problem at the time. This happened 3 times.
Mongo was fast until we hit a wall. For us that was at around 200GB, which is not that much when you put things in perspective. It was ridiculous: running a given query took 1 ms, and running it again right after could take up to 15 seconds. You'd think the second time would be faster.
The global write lock is just insane. This simply cannot work in a high-traffic application. Our app was write-heavy and this bit us in the ass quite a bit. We didn't feel it until much later, after we were fully invested in Mongo.
Lastly, 10gen was the worst tech company I ever dealt with. They were plain disrespectful to me, calling me cheap for not wanting to shell out $100k/year on a support contract. I was willing to pay them their hourly support rate (which is very high, but that's fine) to have them fix their replication bugs with our dataset.
For us, MongoDB was a terrible mistake that I'll never make again. I'm glad the truth has been coming out over the last couple of years.
Replication is indeed not as advertised, and neither is sharding. MongoDB hands off a lot of important details of replica failover to the drivers, which implement them to varying degrees of quality. We had several instances where a node was still up but had degraded network connectivity or some other real-life failure, and these cases were handled very badly, causing the DB to become unavailable. The driver would attempt to contact the node on every connection, because it was still in the replica set, causing timeouts. It handled a node becoming completely unreachable OK, but that did not happen very often.
I think they just added too many features too fast in order to get adopted widely. It also appears they are now less focused on improving the actual DB and more interested in getting into the enterprise services area. They market it as a general-purpose DB, but it certainly is not one. I think many developers were drawn in by the initial ease of use and never really considered whether they should be using a document store at all. We used it 100% for the replication, which often does not work.
They refused to give us per-incident support because they figured we could afford the yearly support plan (this was post-acquisition, at a public company).
Although I don't think I've hit 200GB (well, at least without blobs), I really like using CouchDB as an object store. To get around limitations in CouchDB views, I usually just have ElasticSearch index the data in near real-time.
I now think it actually makes a fair bit of sense to have storage and indexing happen in different applications, because it allows you to tune their performance and add/remove servers to the clusters separately.
MySQL or PostgreSQL. Exactly what we did at my new startup.
We actually did something that's quite unpopular here but it was the right choice for us.
We started with Postgres and eventually ran into all sorts of performance problems after hitting a certain scale. We realized that nobody in the company knew anything about PG and that good consultants were extremely hard to find.
On the flip side, everybody had good to excellent knowledge of MySQL and we happen to have a friend who's one of the best MySQL guys in the world. Pretty handy.
We switched from PostgreSQL to MySQL and couldn't be happier. I think the lesson for us was: go with what you know and master, and with what's easy to fix, rather than what people on hackernews tell you to use.
Curious - what made you decide on MongoDB over PG/MySQL in the first place? Was it merely gut (as indicated below), or was it more of a "people say this is good, it fits our use case, so let's try it out"?
> We switched from PostgreSQL to MySQL and couldn't be happier. I think the lesson for us was: go with what you know and master, and with what's easy to fix, rather than what people on hackernews tell you to use.
We had a giant table that constantly needed schema changes. With older versions of MySQL, adding a column meant many hours of downtime. And our business required 100% uptime. It wasn't a good fit. The promise of no schema sounded awesome.
MySQL 5.6 no longer has this limitation. Actually, pretty much everything we didn't like about MySQL has been fixed in 5.6. We kind of decided to give it another chance and we've been happy.
To be fair, this is an article that takes the opinions of employees of Rackspace, specifically, to deliver its "verdict", which isn't a verdict so much as a collection of pluses and minuses. More importantly, Rackspace offers a MongoDB service (which they plug at the end of the article), so take their opinion with a grain of salt.
Kind of like a Chevrolet dealer writing an article offering a "verdict" on the Chevy Volt that consists of opinions of Chevy engineers and mechanics and ending the article with a link to their showroom selection. Not invalid but not objective and not really a verdict.
Schemaless databases are such an awful concept for any data you actually care about. After 12 years in the industry, I've realized that the more natural controls (e.g. a schema) you can place on maintaining the integrity of your data, the better off you'll be long-term.
Data matters the most. Your UI will be a tired POS in 3 years. Your backend data store won't scale for your new business problems. It'll be time to migrate, and then you'll start by asking yourself how clean and well-managed your data is.
Maybe it makes sense to use MongoDB when you're just getting started, before you fully understand the data, so it's easy to add new attributes/columns etc.
But why even bother with MongoDB? You could just use a text editor to make bunch of JSON text files, with one record per file.
Then, once you better understand the data, and you haven't made many "schema" changes for a while, you can set up a SQL database and import everything.
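A minimal sketch of that workflow, using Python's standard library; the record fields and table layout here are made up purely for illustration:

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Phase 1: schemaless storage, one JSON text file per record.
# Records can freely differ in their fields while the schema is still in flux.
records_dir = Path(tempfile.mkdtemp())
for rec in [{"id": 1, "name": "alice", "plan": "pro"},
            {"id": 2, "name": "bob"}]:  # note: no "plan" field here
    (records_dir / f"{rec['id']}.json").write_text(json.dumps(rec))

# Phase 2: once the shape has settled, define a real schema and import.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, plan TEXT)")
for path in records_dir.glob("*.json"):
    rec = json.loads(path.read_text())
    db.execute("INSERT INTO users (id, name, plan) VALUES (?, ?, ?)",
               (rec["id"], rec["name"], rec.get("plan")))  # missing field -> NULL
db.commit()
print(db.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # -> 2
```

The import step is also where the schema starts earning its keep: fields that were silently absent in the files become explicit NULLs you can constrain or backfill.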
One better: use a SQL database with a PK, and a text field containing your JSON data. If your DB is Postgres, it'll even let you parse & query the JSON on the DB side.
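A sketch of that hybrid pattern. In Postgres you would use a `json`/`jsonb` column and the `->`/`->>` operators; to keep this self-contained and runnable, SQLite's `json_extract()` stands in for the Postgres operators, and the documents are invented for illustration:

```python
import json
import sqlite3

# PK plus a text column holding the JSON document -- the hybrid pattern.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
for i, doc in enumerate([{"name": "alice", "plan": "pro"},
                         {"name": "bob", "plan": "free"}], start=1):
    db.execute("INSERT INTO docs VALUES (?, ?)", (i, json.dumps(doc)))

# The DB parses and filters the JSON server-side; the application never
# has to load and decode every document itself.
row = db.execute(
    "SELECT id, json_extract(body, '$.name') FROM docs "
    "WHERE json_extract(body, '$.plan') = 'pro'").fetchone()
print(row)  # -> (1, 'alice')
```

The Postgres equivalent of that WHERE clause would be roughly `WHERE body->>'plan' = 'pro'`, and with `jsonb` you can even index such expressions.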