3.5x on a normal distribution with mean 100 and SD 15 is pretty insane. But I agree with your point: being 26% better on a certain benchmark could be a tiny difference or an incredible improvement (imagine the hardest questions being the Riemann hypothesis, P != NP, etc.).
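
For a rough sense of scale, here is a quick sketch of the arithmetic. It assumes "3.5x" means a score of 3.5 times the mean (so 350), and that the distribution stays normal that far out; both are my reading, not something stated above. The helper names are made up for illustration.

```python
import math

MEAN = 100.0  # mean from the comment
SD = 15.0     # standard deviation from the comment


def z_score(score, mean=MEAN, sd=SD):
    """Number of standard deviations a score lies above the mean."""
    return (score - mean) / sd


def upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumption: "3.5x" is read as 3.5 times the mean, i.e. a score of 350.
score = 3.5 * MEAN
z = z_score(score)

print(f"score = {score:.0f}, z = {z:.2f}")            # z = 16.67
print(f"P(X > {score:.0f}) = {upper_tail(z):.3e}")    # vanishingly small
```

Under that reading the score sits about 16.7 SDs above the mean, far beyond anything a real population would contain, which is why the number reads as "insane" rather than merely high.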