Hi, Alan. Regarding CAP, I think that given redundant cluster interconnects, red...

jbellis · on Nov 25, 2013

Ah, the myth of the sufficiently redundant network.

http://pl.atyp.us/wordpress/index.php/2010/10/when-partition...

http://kellabyte.com/2013/11/04/the-network-partitions-are-r...

mtravis · on Nov 25, 2013

Redundant components and pathing, properly implemented and managed, are what allow enterprise storage arrays, mainframe clusters, Tandems, and the like, to operate 24x7 for years on end. Their myth seems to work.

vidarh · on Nov 26, 2013

Yes, but not without addressing what should happen when split brain occurs.

mtravis · on Nov 26, 2013

Well, split brain means two parts think they're both active. At that point, nothing can be done (short of manual intervention), because both parts are faulty and, likely, a bunch of other components have failed too.

But avoiding split brain in a failover circumstance means that a replica won't come up unless it gets a majority of votes. No other replica can also get a majority of votes, so only one replica will become active. No split brain. This, of course, assumes proper operation of the cluster management protocol.

(this is CP operation)
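
For readers following along, the majority-vote rule described above can be sketched roughly as follows. This is only an illustration, not InfiniSQL's actual cluster management protocol; the function names and the manager IDs (`m1` through `m5`) are hypothetical.

```python
def has_majority(votes_received, cluster_size):
    """Return True when the votes form a strict majority of all configured voters."""
    return votes_received > cluster_size // 2


def may_become_active(reachable_voters, all_voters):
    """Decide whether a replica may come up, given the voters it can reach.

    Assumption: every reachable configured voter grants its vote, and
    unreachable voters count as "no". A real protocol also has to handle
    terms/epochs, lost messages, and voters that change their minds.
    """
    votes = len(set(reachable_voters) & set(all_voters))
    return has_majority(votes, len(all_voters))


# Hypothetical five-manager cluster split 3/2 by a partition.
voters = {"m1", "m2", "m3", "m4", "m5"}
side_a = {"m1", "m2", "m3"}
side_b = {"m4", "m5"}

print(may_become_active(side_a, voters))  # majority side
print(may_become_active(side_b, voters))  # minority side
```

Two disjoint groups can never both hold more than half of the voters, which is why at most one side comes up. It also shows why an odd number of managers helps: with an even count, a clean 50/50 split leaves neither side with a majority.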

ithkuil · on Nov 26, 2013

Indeed, there's a lot of misunderstanding around this aspect.

The strength of eventually consistent systems isn't that they guarantee consistency during a network partition, but that they maintain availability in the face of partitioning. Even parts of the cluster that have been cut off from the quorum can operate, in various degrees of degradation, ranging from serving stale reads to accepting writes whose consistency is resolved later (for example with vector clocks or Commutative Replicated Data Types, see http://highscalability.com/blog/2010/12/23/paper-crdts-consi... or http://basho.com/tag/crdt/).

If I understood it correctly, InfiniSQL isn't trying to solve the problem of providing backend capacity for parts of your cluster that are currently partitioned; it assumes that you can minimise the likelihood of this event. If a network partition happens in the cluster, it's also very likely that all services in that partition will not be able to serve transactions, hence there is not much to gain from a system that is able to accept writes or perform stale reads without quorum.

On the other hand, there are other workloads, like batch processing, that might benefit from being able to continue operating during a network partition without losing big parts of their processing capacity.
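
As a rough illustration of "accept writes now, resolve later", here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. It is not taken from any of the linked material or from a particular database; the class and the node names are hypothetical.

```python
class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node id -> increments observed from that node

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept writes independently while partitioned...
a = GCounter("node-a")
b = GCounter("node-b")
a.increment(2)
b.increment(3)

# ...and exchange state once the partition heals.
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Both sides stay writable during the partition and agree afterwards, but only because the data type was designed so that merges never conflict. Arbitrary SQL transactions don't have that property, which is the trade-off being discussed here.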

vidarh · on Nov 25, 2013

> configured properly.

... and there is where it falls apart. Sooner or later, "someone" is going to do something Incredibly Dumb that is going to take down a lot of nodes.

If you are betting that you can just add enough redundancy that split brain can't happen, I have to question why I should take you seriously with important data.

(It also indicates to me this is going to be ludicrously expensive to set up, but that's another issue)

mtravis · on Nov 26, 2013

Redundancy, quorum protocol, proactive testing, rigorous QA.

And actual 24x7 environments with important data are ludicrously expensive. I expect InfiniSQL to be less expensive since it's based on x86_64 and Linux, and is open source. But yeah, hardware and environment need to be right.

I'm not sure what I said wrong, but nowhere am I claiming that split brain (or any failure scenario) can be ruled out entirely in all circumstances--but in practice, split brain is avoided 24x7 for years on end in many different architectures.

It's not magic.

Just like you can't have "enough" storage redundancy. You can have a 100-way mirror of hard drives that will still lose data if you lose 100 disks in less time than somebody replaces one of them.

DanWaterworth · on Nov 26, 2013

Congratulations, you have convinced me never to use your database. You can't ignore CAP, because machine failures, momentary high latency and bona fide network partitions happen.

spof3 · on Nov 26, 2013

>>Show me the numbers

https://amplab.cs.berkeley.edu/benchmark/

cmccabe · on Nov 26, 2013

I work at Cloudera (although not on Impala). I work on HDFS.

Impala was about creating a low-latency SQL engine for Hadoop, so that queries could be done interactively by a human at a keyboard. This is something that you don't really get with Hive (despite all the recent hype) because it simply is too slow, and has high startup costs. That's unlikely to change in the future because of the overhead of spinning up JVMs, starting MapReduce jobs, etc.

It seems like you are trying to target the OLTP market. That's a difficult market to crack. A lot of the value of things like Oracle and Microsoft SQL Server is not in the database itself, but in the surrounding software. Performance is nice, but unless you can get orders of magnitude of improvement, it's very difficult to compete.

Will anybody ever bridge OLTP and OLAP? The last people who claimed to be trying to do that were Drawn to Scale, and we all know how that turned out. I think it's better to focus on doing one thing well.

Good luck.

gnaritas · on Nov 25, 2013

> Regarding CAP, I think that given redundant cluster interconnects, redundant managed power, odd # of cluster managers for quorum, all mean that split brain is just about out of the question, configured properly.

Failure is never out of the question; you either plan for it or you suffer from it. CAP applies.