I'm building distributed ML on top of Spark and have found it good overall. I've had to work out issues with partitioning and mini-batching, but I've had a good time so far. The DataFrame initiative will certainly help things. The JVM ecosystem needs a scientific environment like Python's (pandas, SciPy, ...). The potential is there with Scala, as we're seeing here today.