> Intermittent workloads on the order of 2 milliseconds, you mean? Yea. Most of ...

lucb1e · 2025-02-20T18:43:56 1740077036

Huh, that is interesting, thanks for clarifying and going so far as sharing benchmark examples! I will try this on my own hardware as well and edit in the results (though I'm far from running into performance limitations on the old laptop that I use for hosting various projects, it could still be something to tune when I run some big task with lots of queries)

Maybe for completeness, what CPU type is this on?

anarazel · 2025-02-20T19:16:23 1740078983

> I will try this on my own hardware as well

FWIW, my results corresponded to:

  cpupower frequency-set --governor powersave && cpupower idle-set -E

  cpupower frequency-set --governor performance && cpupower idle-set -E

  cpupower frequency-set --governor performance && cpupower idle-set -D0

It's perhaps worth pointing out that -D0 sometimes hurts performance, by reducing the boost potential of individual cores, due to the higher baseline temp & power usage.
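To check that the settings above actually took effect, the governor can be read back from the kernel's cpufreq sysfs files. A minimal sketch, assuming the usual Linux layout under `/sys/devices/system/cpu`; the `CPU_SYSFS` variable and the `governor_summary` function are names made up for illustration, not part of cpupower:

```shell
#!/bin/sh
# Root of the per-CPU sysfs tree. Overridable only so the function can be
# pointed at a mock directory (hypothetical variable name).
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Print each distinct scaling governor in use and how many CPUs use it.
governor_summary() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip silently if the file is missing (e.g. no cpufreq driver loaded).
        if [ -r "$f" ]; then
            cat "$f"
        fi
    done | sort | uniq -c | awk '{ print $2 ": " $1 " CPU(s)" }'
}

governor_summary
```

If the cpufreq driver is not loaded, this simply prints nothing.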

> Maybe for completeness, what CPU type is this on?

This was a 2x Xeon Gold 5215. But I've reproduced this on newer Intel and AMD server CPUs too.

> (though I'm far from running into performance limitations on the old laptop that I use for hosting various projects, it could still be something to tune when I run some big task with lots of queries)

If you run larger queries or queries at a higher frequency (i.e. client on the same host instead of via network, or the client uses pipelining), the problem doesn't typically manifest to a significant degree.

lucb1e · 2025-02-21T00:22:04 1740097324

Thanks, also for providing the ready-to-use commands!

I did four tests on my "server" with an Intel i7 3630QM CPU. Pseudocode:

    - Test 1, simple benchmark: `php -r '$loops=0; $starttime=microtime(1); while($starttime+1>microtime(1)){$loops++;} print($loops);'`, running in parallel for each real CPU core (not hyperthread)
    - Test 2A, fast queries: time `for ($i=1; $i<=1000; $i++){ $db->query('SELECT ' . mt_rand()); }` (localhost, querying into a different container)
    - Test 2B: intermittent fast queries: same as above, but on each loop it sleeps for mt_rand(1,10e3) microseconds to perhaps trick the CPU into clocking down
    - Test 3, ApacheBenchmark command requesting a webpage that does a handful of database queries: `ab -n 500 https://lucb1e.com/`, executed from a VPS in a nearby country, taking the 95th percentile response time

Governor results:

The governor makes no measurable difference for the benchmark and serial queries (tests 1 and 2A), but in test 3 there's a very clear difference: 86-88 ms for the 95th percentile versus 92-95 ms (ran each test 3 times to see if the result is stable). CPU frequency is not always at max when the performance governor is set (I had expected it would be at max all the time then). For test 2B, I see a 3% difference (powersave being slower) but I'm not sure that's not just random variation.

Idle states results:

Disabling idle states has mixed results, basically as you describe: it makes the CPU report maximum frequency all the time, which, instead of making it faster, seems to make it throttle for thermal reasons: the benchmark suffers and gets ~20 instead of ~27 million loops per core per second, while sensors shoot up from ~50 to ~80 °C. On the other hand, it has the same effect on web requests as setting the governor to performance (but I didn't change the governor), and on test 2B it has an even bigger impact: ~11% faster.

---

I'll have to ponder this. My first thought was that my HTTP-based two-way latency measurement utility should trigger the performance governor for more reliable results, but when I test it now here on WiFi (not that VPS with a stable connection as used in the test above), the results are indistinguishable before or after the governor change; the difference must be too small compared to the variability that WiFi adds (also on 5 GHz that can stably max out the throughput). My second thought is that this might give me another stab at exploiting a timing side channel in database index lookups, where the results were just too variable and I couldn't figure out how to make my work laptop set a fixed CPU frequency (the hardware or driver doesn't support it, iirc, as far as I could find). I was also not aware that there are power states besides "everything is running" (+/- frequency changes), "stand by / suspend", and "powered off". This 2012 laptop has 6 idle states already, all with different latency values, and my current laptop 9! Lots to learn here still, and I'm sure I'll think of more implications later
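For anyone wanting to look at those idle states and their latency values, they can be listed from the kernel's cpuidle sysfs directory. A rough sketch, assuming the usual Linux layout for CPU 0; `CPUIDLE_DIR` and `list_idle_states` are hypothetical names chosen for this example:

```shell
#!/bin/sh
# cpuidle directory of one CPU. Overridable only so the function can be
# pointed at a mock directory (hypothetical variable name).
CPUIDLE_DIR="${CPUIDLE_DIR:-/sys/devices/system/cpu/cpu0/cpuidle}"

# Print one line per idle state: index, name, exit latency in microseconds,
# and whether the state is currently disabled (1) or enabled (0).
list_idle_states() {
    for d in "$CPUIDLE_DIR"/state[0-9]*; do
        # Skip silently if the directory does not exist on this machine.
        if [ -r "$d/name" ]; then
            printf '%s\t%s\t%s us\tdisabled=%s\n' \
                "${d##*/state}" "$(cat "$d/name")" \
                "$(cat "$d/latency")" "$(cat "$d/disable")"
        fi
    done
}

list_idle_states
```

The deeper states generally report a higher exit latency, which is the wake-up cost that short intermittent workloads end up paying.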

I've set things back to idle states enabled and governor powersave, since everything ran great on that for years, and I expect to keep using that virtually all the time. But now that I know this, I'll certainly set it to performance to see if it helps for certain workloads (timing side channels which already work fine may become more reliable if my CPU runs more predictably). Thanks for making me aware :)