Since the article is more than 5 years old, it would be interesting to find out what became of their efforts. The GitHub repo linked in the article (https://github.com/jplozi/wastedcores) was last active in December 2017, and apparently contains some bugfixes, albeit with the caveat "The provided patches fix the issues encountered with our workloads, but they are not intended as generic bug fixes. They may have unwanted side effects and result in performance loss or energy waste on your machine." Did this result in any scheduler bugfixes that actually made it into the Linux kernel?
Are there any standard benchmarks for POSIX systems, facilitating objective comparisons between schedulers? When I was reading the dinosaur book, I was continually impressed by the elegance of solutions used in Solaris, the scheduler being one of the highlights. However, a sense of elegance may be poorly calibrated, and I’d like to look at some hard data about how Solaris (today Illumos) stacks up against the BSDs and Linux. Sadly I’m not knowledgeable enough about operating systems to write a good set of such benchmarks myself, so I’d prefer to lean on the expertise of smarter people.
From years of reading Phoronix articles, scheduling is generally one area where Linux really shines compared to other OSs. There are particular workloads where some other OS does better, but not overall. And many of the problems described in this article are complaints about Linux trading off what's best for HPC users against approaches that are better on servers or user devices. Like, the overload-on-wakeup behavior is absolutely what you want on anything battery-powered, even if it hurts in TPC-H.
> the kernel has (always) offered various schedulers but you have to pick one
Umm, the mainline kernel has had only the CFS scheduler available for the past 15 years. Sure, there are some out-of-tree options available, but with those come the usual problems of using out-of-tree patchsets.
Actually the history of how pluggable schedulers came to be in the kernel is a fascinating one, and one I recall watching unfold in the mid-2000s. There were out of tree schedulers and a pluggable scheduler implementation put forward by Con Kolivas before Ingo introduced the CFS patch, and a lot of frustration that pluggable scheduler patch sets were rejected up until that point.
'So it's not just "better", it's "Better" with a capital 'B'. Nothing else out there comes even close. The Linux dcache is simply in a class all its own.' -Linus Torvalds
I do realize that the dcache is not directly related to the scheduler (though it will certainly impact it), but I trust that performance enthusiasts will go to great lengths to extend Linux's lead in TPC and other benchmarks.
It has also not been widely reported that a) Oracle posted a top TPC-C score shortly after acquiring Sun, running 11g on SPARC/Solaris 10, and b) OceanBase has now beaten that by an order of magnitude.
To see both the OceanBase and Oracle 11g/Solaris scores, historical benchmarks must be enabled:
If you're looking at the results I am, I see a system with 28x the CPUs being 23x faster, after 10 years of CPU development. And substantially more expensive in total cost too?
Are we looking at the same thing? Did I get the math wrong? ( always a possibility ).
Yes, it's a much bigger topline number, but it doesn't seem very impressive given all the infrastructure differences?
Individually, every performance benchmark will test against a defined/repeatable workload. If you think about it, if you don't benefit from the performance improvements, does it really matter to you? And if it is noticeable to you, what metrics are you using to determine that? Once you narrow that down, it will be easy to come up with a workload to compare them.
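To make that concrete with a sketch (not a standard benchmark, just a minimal example of a defined, repeatable scheduler-heavy workload, with an arbitrary iteration count): a pipe ping-pong between a parent and child process, where the round-trip rate is dominated by wakeup and context-switch cost. Being plain POSIX, it should build the same way on Illumos, the BSDs and Linux:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/wait.h>

    #define ITERATIONS 100000  /* arbitrary; pick whatever gives stable numbers */

    int main(void) {
        int ptoc[2], ctop[2];          /* parent->child and child->parent pipes */
        char buf = 'x';

        if (pipe(ptoc) || pipe(ctop)) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == 0) {                /* child: echo every byte straight back */
            for (int i = 0; i < ITERATIONS; i++) {
                if (read(ptoc[0], &buf, 1) != 1) break;
                if (write(ctop[1], &buf, 1) != 1) break;
            }
            _exit(0);
        }

        struct timeval start, end;
        gettimeofday(&start, NULL);
        for (int i = 0; i < ITERATIONS; i++) {   /* parent: ping, wait for pong */
            write(ptoc[1], &buf, 1);
            read(ctop[0], &buf, 1);
        }
        gettimeofday(&end, NULL);
        waitpid(pid, NULL, 0);

        double secs = (end.tv_sec - start.tv_sec)
                    + (end.tv_usec - start.tv_usec) / 1e6;
        printf("%d round trips in %.3fs (%.0f/s)\n",
               ITERATIONS, secs, ITERATIONS / secs);
        return 0;
    }

One number like this obviously doesn't capture a whole scheduler, but it is the kind of narrow, repeatable workload the comparison would have to start from.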
I'm now wondering how many decades are still being lost because of similar bugs in other OSes that don't get as much scrutiny, like OpenBSD or even FreeBSD.
I'm not going to say scheduling is better or worse on different platforms, but it is clearly different.
When I tried to port the (at that time) new, open-source version of .NET Core to FreeBSD, one of the things I simply couldn't fix in the .NET framework code itself was threading. For one, I had to (for some reason I don't remember now) use non-POSIX threading functions to make it compile. But even with that in place, things weren't behaving as expected.
I mean... threading worked, but .NET had a fairly big test suite which was very opinionated about what sort of behaviour and performance characteristics different kinds of threading scenarios and threading primitives should have.
On FreeBSD I was forced to extend time-outs and outright disable some tests to make the build pass.
Not necessarily; the problem is similar to what can be seen with garbage collection (latency vs. throughput).
For example, if you give threads more, smaller time slices, you get better latency but worse throughput, since it means more work switching between time slices and more cache invalidation.
.NET's test suite is tuned for Windows. Windows focuses more on desktop use cases and is tuned for lower latency rather than throughput; FreeBSD, on the other hand, is mainly for servers, so its scheduler is tuned more for throughput. This difference could very well explain the failures in the test suite (independent of whether there is a bug or not). To test what I think it tests, you have to be very tight about the expected latencies, tight enough to make the test suite fail on a more throughput-optimized system.
Similarly, on Linux some distros offer an alternative official kernel for media applications (e.g. gaming) which changes kernel parameters to be a bit more latency-focused, e.g. linux-zen in the case of Arch Linux.
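To illustrate that latency-vs-throughput lever from the application side (a sketch only, and not what linux-zen itself does; zen works at the kernel config/tunable level): on Linux a program can declare a CPU-bound thread as throughput-oriented via the SCHED_BATCH policy, which hints the scheduler to preempt it less aggressively:

    /* Sketch: opt a CPU-bound worker into Linux's SCHED_BATCH policy.
     * SCHED_BATCH is Linux-specific; on other OSes this call will fail. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        struct sched_param param = { .sched_priority = 0 };  /* must be 0 for SCHED_BATCH */
        if (sched_setscheduler(0, SCHED_BATCH, &param) == -1) {
            perror("sched_setscheduler");
            return 1;
        }
        /* ... run the throughput-oriented work here ... */
        return 0;
    }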
Similarly, I know DragonFly BSD focuses on speed (making the kernel as non-blocking as possible, thread-per-core type stuff), but is there a comparison of its scheduler with FreeBSD's?