Yeah, it does seem like progress has plateaued considerably. The leaps from GPT-2 to 3, 3 to 4, and 4 to 5 shrink with each one, with 5 being particularly disappointing.
With no evidence, I feel like GPT-5 was an efficiency release: save as much power/compute as possible while mitigating quality loss, leaving only the top model (using similar compute to previous models) to show real improvement.
We should remember that Moore's law was not just about the number of transistors but also their unit cost. In that sense, GPT-5 works like a modern CPU with both performance and efficiency cores: a router decides which requests need the heavy model and which can be served by a cheaper one.