
> older comment of mine on cassandra: https://news.ycombinator.com/item?id=20430925#20432564

Wow, that's surprisingly bad

For real, I'm not against Java, but I find time and again that services that depend on the JVM have lots of "weird stuff". Not because Java is bad, but because Java fans (which you usually are when you start developing something this big) think a lot of this crap is acceptable.




In my (extensive) experience in infrastructure, when people say the JVM is the problem, the JVM is never the problem. It's usually just a symptom of something else, and lazy ops people just want to blame something and throw up their hands. I've never had to "tune a JVM" to make things work.

I'll give you an example from Spark. We had a huge job that was failing, and the logs showed it failed when results were being spooled to disk. More log diving showed a lot of GC on every node. At that point we could have gone down the route of tuning something in the JVM, but more digging found the real culprit. iostat output from when the jobs ran showed that while data was being read, writes were completely blocked and write latency was in the 100s of ms. The Spark executor trying to dump data was blocked, and the first symptom that things were falling apart was... GC. We changed the scheduler on the nodes and magically everything worked great. The VM in JVM stands for virtual machine. The same rules for resources apply, and if you run out of resources, don't expect the mythical ops faeries to save you.
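For anyone who wants to check for the same symptom without iostat, here is a minimal sketch that derives average write latency from two snapshots of /proc/diskstats on Linux. The function names and the sample numbers are made up for illustration; the field positions follow the kernel's documented diskstats layout, and this is only roughly the calculation behind iostat's write await column, not a replacement for it.

```python
def parse_diskstats(text):
    """Map device name -> (writes_completed, ms_spent_writing).

    Expects the Linux /proc/diskstats layout: major, minor, name,
    then the per-device counters.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpected lines
        # fields[7]  = writes completed
        # fields[10] = milliseconds spent writing
        stats[fields[2]] = (int(fields[7]), int(fields[10]))
    return stats


def avg_write_latency_ms(before, after, device):
    """Average ms per completed write between two snapshots."""
    writes = after[device][0] - before[device][0]
    ms_writing = after[device][1] - before[device][1]
    return ms_writing / writes if writes > 0 else 0.0


def read_diskstats(path="/proc/diskstats"):
    """Take one snapshot from the running system (Linux only)."""
    with open(path) as handle:
        return parse_diskstats(handle.read())


# Illustrative snapshots (invented numbers): reads race ahead while
# only 10 writes complete and 3000 ms of write time accumulate.
SAMPLE_BEFORE = "   8       0 sda 1000 0 8000 500 200 0 1600 4000 0 3000 4500"
SAMPLE_AFTER = "   8       0 sda 5000 0 40000 900 210 0 1680 7000 3 6000 7900"

sample_latency = avg_write_latency_ms(
    parse_diskstats(SAMPLE_BEFORE), parse_diskstats(SAMPLE_AFTER), "sda"
)
```

On a real node you would call read_diskstats() twice, a few seconds apart while the job runs, and compare the two snapshots. A value in the hundreds of ms alongside a fast-growing read counter is the pattern described above.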


Can you expand on which scheduler you swapped out? OS? Spark? Something else?


Sure, I can answer both. The OS I/O scheduler CFQ is generally bad for high-volume disk applications. We used deadline in this case. Amy's Guide is still a treasure trove of info and a solid recommended read: https://tobert.github.io/pages/als-cassandra-21-tuning-guide...
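To see which I/O scheduler a node is actually using, Linux exposes it in /sys/block/<device>/queue/scheduler, with the active one in square brackets. A small sketch for reading it follows; the helper names are my own, and the set of schedulers offered depends on your kernel (newer kernels list mq-deadline rather than cfq and deadline).

```python
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed (active) scheduler from a sysfs scheduler line.

    Example input: "noop deadline [cfq]". Returns None if nothing is bracketed.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def scheduler_for(device, sysfs_root="/sys/block"):
    """Read the active scheduler for a block device such as "sda" (Linux only)."""
    path = Path(sysfs_root, device, "queue", "scheduler")
    return active_scheduler(path.read_text())
```

Switching is done by writing the scheduler name to that same file as root. That change does not survive a reboot, so it normally goes into whatever you use for node configuration.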

If you are using Spark in a bare metal cluster with spark-submit, the above advice applies. In Kubernetes, never use the default pod scheduler with analytic workloads. Great choices are Volcano (https://volcano.sh/en/) and Yunikorn (https://yunikorn.apache.org/). Both are also great, evolving projects to support and contribute to if you can.
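On the Kubernetes side, a pod opts out of the default scheduler through the schedulerName field in its spec. A minimal sketch of patching a pod manifest held as a plain dict is below. The helper name and the example manifest are hypothetical, and "volcano" and "yunikorn" are the scheduler names those projects use by default as far as I know, so check what your install actually registers.

```python
import copy


def with_scheduler(pod_manifest, scheduler_name):
    """Return a copy of a pod manifest that requests the given scheduler."""
    patched = copy.deepcopy(pod_manifest)  # leave the caller's dict untouched
    patched.setdefault("spec", {})["schedulerName"] = scheduler_name
    return patched


# Hypothetical executor pod, trimmed to the fields that matter here.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "spark-exec-1"},
    "spec": {"containers": [{"name": "executor", "image": "example/spark"}]},
}

volcano_pod = with_scheduler(example_pod, "volcano")
```

Spark on Kubernetes also has a spark.kubernetes.scheduler.name setting for this, as far as I know, which saves patching pod templates by hand. Check the docs for your Spark version.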



