Hacker News new | past | comments | ask | show | jobs | submit login

I called out the notion that anyone ought to be depending on the defaults in the first place

Regardless of whether you should depend on them, many many people do. Heck, many people don't even understand that there's a choice to be made: after all, everyone knows that Oracle is ACID compliant, right?

Yes, many don't support SERIALIZABLE. Didn't contradict that either.

Sorry, I was confused by the bit about "If your workload needs serializability, set it" since that's physically impossible on Oracle 11g.

Just because a DB has a replication package available (like MySQL) does not mean that it is a distributed system. And the file backed DBs (like Berkeley) are definitely not distributed. Sure, there are some extremely expensive massively parallel DBs in use (like Volt), but the number of deployments for those systems is a drop in the bucket compared with single-node MySQL/Postgres/Oracle/SQLServer/DB2 instances.




Good points on both sides. One thing I'll point out is that many clustering, HA, and multi-master replication solutions either rely on a single master (what I think Michael means when he says they're not distributed) or don't provide serializability.

Things get harder when you're distributed. For example, if you cluster with a sharded master-slave configuration, then, to serialize transactions that span shards/partitions, you'll need to do 2 phase commit or similar between masters for writes and, for most read/write transactions, make sure you don't read from slaves. If you cluster via master-master/active-active, then, for serializability, you'll need locking or some other concurrency control across masters for each shard/partition. Both setups are definitely do-able (if not highly available) but require non-trivial engineering.


"Regardless of whether you should depend on them, many many people do. Heck, many people don't even understand that there's a choice to be made: after all, everyone knows that Oracle is ACID compliant, right?"

Ignorance is not an excuse. It just isn't.

"Just because a DB has a replication package available (like MySQL) does not mean that it is a distributed system. And the file backed DBs (like Berkeley) are definitely not distributed. Sure, there are some extremely expensive massively parallel DBs in use (like Volt), but the number of deployments for those systems is a drop in the bucket compared with single-node MySQL/Postgres/Oracle/SQLServer/DB2 instances."

I do not understand what point you are trying to make here. Are these systems de-facto non-distributed systems merely because of deployment counts? Is there an objective criteria here I should be aware of?


Ignorance is not an excuse. It just isn't.

I'm not interested in excusing anything, I'm interested in understanding the real world. And what I see is that a lot of people who insist on the absolute need for ACID don't really understand it because they're using non-ACID technology right now.

I do not understand what point you are trying to make here.

Consider MySQL with asynchronous replication to a slave. In a weak sense, that is a distributed system because the remote slave is on a different machine (and probably very distant) from the master. But the distributed bit here doesn't interfere with correctly implementing ACID: MySQL with async replication operates identically to single node MySQL: it just transmits the binary logs to a slave server. The database system itself is a single-node service whose state gets replicated at transaction boundaries.

In contrast, distributed shared-nothing databases like Volt have to work really hard to maintain consistency: there is no single node in those systems that does all the work and gets replicated: multiple nodes have to cooperate in order to get anything done.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: