In the 90's I was an architect on Intel's Willamette (Pentium 4) thermal throttle (TT1). TT1 "knocked the teeth" out of clock cycles if the checker-retirement unit (CRU, the hottest part of the die) got too hot. This evolved into TT2/Geyserville, where you move up/down the V/F curve to actively stay under the throttle limit. We were browbeaten by upper management into proving this would not visibly impact performance, and I worked on one of the MANY MANY software simulators written throughout the company to do exactly that. (It was actually my favourite job there.) This is when the term "Thermal Design Power" arrived: coined by top marketing brass to avoid using "Max Power", which was far higher. It is possible to have almost a 2x difference between max power (running a "power virus", which Intel was terrified of, from chipsets to graphics to CPUs) and what typical apps draw (thermal design power). Performance was a bit dodgy on a few apps, but not significant compared to run-to-run variation. (Remember this is 1995-1997, after the half-arsed Pentium fiasco in 1993 when Motorola openly mocked Intel for having a 16W CPU... FDIV wasn't a thermal fiasco, but it was a proper cock-up.)
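To make the TT2/Geyserville idea concrete, here's a toy control loop in Python. This is only a sketch of the concept of walking up and down the V/F curve to hover under a throttle limit; the V/F points, trip temperature, and hysteresis are all invented, and the real mechanism lived in hardware and microcode, not software like this.

```python
# Toy sketch of a TT2/Geyserville-style control loop: step down the V/F
# operating point when the die temperature hits the throttle limit, step back
# up once there is headroom. All names and numbers are illustrative; this is
# NOT Intel's actual algorithm.

# Discrete V/F operating points, lowest to highest (GHz, volts) -- made up.
VF_POINTS = [(1.3, 1.10), (1.5, 1.20), (1.7, 1.30), (2.0, 1.40)]
THROTTLE_LIMIT_C = 100.0  # trip point for the hot spot (e.g. near the CRU)
HYSTERESIS_C = 5.0        # don't step back up until well under the limit

def next_operating_point(current_idx: int, die_temp_c: float) -> int:
    """Pick the V/F point index for the next control interval."""
    if die_temp_c >= THROTTLE_LIMIT_C and current_idx > 0:
        return current_idx - 1  # too hot: move down the V/F curve
    if (die_temp_c < THROTTLE_LIMIT_C - HYSTERESIS_C
            and current_idx < len(VF_POINTS) - 1):
        return current_idx + 1  # headroom: move back up
    return current_idx          # hold

# Example: at 101C from the top point, the controller steps down one notch.
idx = next_operating_point(3, 101.0)
print(VF_POINTS[idx])  # (1.7, 1.3)
```

The contrast with TT1 is the whole point: TT1 just gated clocks hard once the limit tripped, while TT2 modulates voltage and frequency so the die sits just under the limit instead of slamming into it.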
Die are sorted based on something called a bin split: die are binned immediately after wafersort based on their leakage. There are special transistors implanted near the scribe lines that indicate tons of characteristics, as well as DFX units throughout the die (rings of 20 inverters that oscillate) that also yield tons of data on how the die behaves. However, testing those buggered DFX circuits takes an enormous amount of time, and you can't slow down wafersort, so there are proxies.
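As a toy illustration of what binning from proxy measurements could look like (the measurements, thresholds, and bin names below are entirely invented; real bin splits use far more vectors than this):

```python
# Toy sketch of assigning a die to a bin from wafersort proxy measurements.
# The leakage and ring-oscillator numbers stand in for the scribe-line test
# structures and DFX oscillators described above; every threshold here is
# invented for illustration, not a real Intel bin split.

def assign_bin(leakage_ua: float, ring_osc_mhz: float) -> str:
    """Map two proxy measurements to a speed/power bin (illustrative only)."""
    if leakage_ua > 500:
        return "scrap"    # leaks too much power to ship at any V/F point
    if ring_osc_mhz > 2000 and leakage_ua < 200:
        return "top-bin"  # fast AND low-leakage: premium SKU
    if ring_osc_mhz > 1800:
        return "mid-bin"  # fast but leaky, or middling speed
    return "low-bin"      # still sellable at a lower frequency

print(assign_bin(leakage_ua=150, ring_osc_mhz=2100))  # top-bin
```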
The bins are designed to maximize profit and performance based on the die characteristics. Thermal throttle plays a role in this, and each bin (among various vectors) is allowed some tolerance, which is exactly what OP has discovered. However, this has been going on for coming up on 30 years! So nothing really new here; I just thought I'd let you know that of course Intel is aware of this, and they never claim performance numbers outside the tolerance allowed for thermal throttle.