
I don't think you need a statistics class to do performance testing.



You change an optimizer pass. Looks good on a microbenchmark. But then you try it out on some samples of real code. Turns out, it probably makes some real code a little bit faster. No difference on other real code, but the data are noisy, so it's hard to tell. And on one piece of code, it seems to actually cause a regression - but again, the data from multiple runs are noisy.

Should you enable the optimizer change by default, or not? Or do you still need to collect more data? How much more data? What data - more runs or more different code samples? How confident do you want to be, and how confident can you be?

These are questions you will face in your real day-to-day work, and a few statistics courses will be incredibly helpful in answering them.
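To make that concrete, here is a minimal sketch (in Python, with made-up timings and an illustrative 0.05 threshold, not a recommendation) of one standard way to ask whether a noisy difference between two sets of benchmark runs is real: a two-sample test on repeated run times.

    # Minimal sketch: is the optimizer change's effect larger than run-to-run noise?
    # The timings below are made up; in practice they come from repeated benchmark runs.
    from scipy import stats

    baseline = [12.31, 12.45, 12.28, 12.52, 12.40, 12.36, 12.48, 12.33]  # seconds
    patched  = [12.10, 12.39, 12.05, 12.44, 12.21, 12.18, 12.35, 12.12]

    # Welch's t-test: does not assume equal variance between the two sets of runs.
    t_stat, p_value = stats.ttest_ind(patched, baseline, equal_var=False)

    mean_delta = sum(patched) / len(patched) - sum(baseline) / len(baseline)
    print(f"mean delta: {mean_delta:+.3f}s, p-value: {p_value:.3f}")

    # A small p-value (say < 0.05) suggests the difference is unlikely to be pure
    # noise; a large one says "collect more runs" or "treat it as no change".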


I knew someone would say something like that, but I've never personally seen that sort of thing done, and I really doubt that a lack of statistical knowledge is even close to the biggest obstacle to writing faster software. For micro-optimizations so subtle that you need fancy techniques to even tell whether they work, non-quantitative factors (code impact, whether it will enable other optimizations, etc.) are more likely to be decisive. And the techniques people actually use to reduce noise are either non-statistical (warming up caches) or unsophisticated (average many trials, best of three).
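For the sake of illustration, here is a minimal sketch of the "unsophisticated" approach: warm up first, then take the best (or mean) of several timed repetitions. The workload function and the repeat counts are placeholders, not anything from the discussion above.

    # Minimal sketch of warm-up plus best-of-N / averaging, with a stand-in workload.
    import time
    import statistics

    def workload():
        return sum(i * i for i in range(200_000))

    def benchmark(fn, warmup=3, repeats=10):
        for _ in range(warmup):          # warm caches, branch predictors, etc.
            fn()
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            times.append(time.perf_counter() - start)
        return min(times), statistics.mean(times)

    best, mean = benchmark(workload)
    print(f"best of 10: {best * 1e3:.2f} ms, mean: {mean * 1e3:.2f} ms")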



