
Thank you. This is an excellent list. Can you also point us to resources that have examples that are relevant to engineering and machine learning? I can't take one more cancer example!


The world is awash in texts and papers on applied statistics. My notes were an introduction, a start on what is needed to read further in applied statistics.

For much of engineering, e.g., quality control, the statistical hypothesis testing I outlined is common.
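As a minimal sketch of the kind of hypothesis testing common in quality control, here is a one-sample t-test of whether a batch of parts meets a nominal dimension. The data and the nominal value 10.0 mm are made up purely for illustration.

```python
# Quality-control sketch: two-sided one-sample t-test.
# H0: the batch mean equals the nominal dimension of 10.0 mm.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated measurements of 30 parts (mm); in practice, real gauge data.
measurements = 10.0 + rng.normal(scale=0.02, size=30)

t_stat, p_value = stats.ttest_1samp(measurements, popmean=10.0)
print(t_stat, p_value)   # reject H0 at level 0.05 if p_value < 0.05
```

Rejecting H0 signals that the process has drifted off the nominal dimension and may need adjustment.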

There are now classic texts on hypothesis testing:

E. L. Lehmann, Testing Statistical Hypotheses, John Wiley, New York, 1959.

E. L. Lehmann, Nonparametrics: Statistical Methods Based on Ranks, Holden-Day, San Francisco, 1975, ISBN 0-8162-4994-6.

Sidney Siegel, Nonparametric Statistics for the Behavioral Sciences, McGraw-Hill, New York, 1956.

The last is also just hypothesis testing; apparently the book is still referenced.

From some statistical consultants inside GE, there is

Gerald J. Hahn and Samuel S. Shapiro, Statistical Models in Engineering, John Wiley & Sons, New York, 1967.

For machine learning, so far the approach appears to be mostly cases of empirical curve fitting.

The linear case, that is, versions of regression analysis, remains of interest. There the matrix theory I outlined is central. Regression analysis goes way back, 100 years or so, and is awash in uses of statistical hypothesis tests: for the overall regression (the F-ratio), for individual coefficients (t-tests), and in confidence intervals on predicted values.
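Those three classical pieces, the overall F-ratio, per-coefficient t-tests, and a confidence interval on a predicted value, can be sketched directly from the matrix algebra. The data below are synthetic and the variable names are illustrative, not anything from the comment.

```python
# Ordinary least squares via the normal equations, with the classical
# F-ratio, t-tests, and a confidence interval on a predicted mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 50, 2                       # observations, predictors (plus intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# OLS fit (the matrix theory): beta_hat = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
df = n - (p + 1)
s2 = resid @ resid / df            # residual variance estimate

# Overall F-ratio: does the model beat the intercept-only model?
ss_tot = ((y - y.mean()) ** 2).sum()
ss_res = resid @ resid
F = ((ss_tot - ss_res) / p) / (ss_res / df)
p_F = stats.f.sf(F, p, df)

# t-tests on individual coefficients
se = np.sqrt(s2 * np.diag(XtX_inv))
t = beta_hat / se
p_t = 2 * stats.t.sf(np.abs(t), df)

# 95% confidence interval on the mean response at a new point x0
x0 = np.array([1.0, 0.5, -0.5])
pred = x0 @ beta_hat
half = stats.t.ppf(0.975, df) * np.sqrt(s2 * (x0 @ XtX_inv @ x0))
print(F, p_F)                      # overall fit
print(beta_hat, p_t)               # coefficients and their p-values
print(pred - half, pred + half)    # interval on the predicted mean
```

In practice a library such as statsmodels reports all of these in one summary table; the sketch above just shows where each test comes from in the matrix algebra.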

Using probabilistic and statistical methods in non-linear fitting is more advanced and likely an active field of research.

In my own work, I use and/or create new statistical methods as I need them. My current work has some original work in, really, applied probability that can also be considered statistics.

When I was doing AI at IBM, I created some new mathematical-statistics hypothesis tests to improve on what we were doing in AI; later I published the work. Parts of the machine learning community might call that paper a case of supervised learning, but the paper is theorems and proofs in applied probability that give results on false alarm rates and detection rates, that is, statistical hypothesis tests, both multi-dimensional and distribution-free. So that paper can be regarded as relevant to machine learning, but the prerequisites are an undergraduate pure-math major, about two years of graduate pure math with measure theory and functional analysis, a course in what Leo Breiman called graduate probability, and some work in stochastic processes. At one point I use a classic result of S. Ulam that the French probabilist Le Cam called tightness; there is a nice presentation in

Patrick Billingsley, Convergence of Probability Measures, John Wiley and Sons, New York, 1968.

The work was for anomaly detection in complex systems, so it could be regarded as relevant to both engineering and machine learning. But I can't give all the background here.
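To make the connection concrete, here is a toy distribution-free anomaly test. This is NOT the published method described above, just an illustration of the general idea of controlling the false alarm rate without assuming any particular distribution: under the null, with exchangeable continuous scores, the rank of a new score among n training scores is uniform, so a rank-based p-value is exact for any continuous distribution.

```python
# Toy distribution-free anomaly test via ranks (conformal-style).
import numpy as np

rng = np.random.default_rng(1)

def anomaly_test(train_scores, new_score, alpha=0.05):
    """Flag new_score as anomalous with false alarm rate <= alpha,
    valid for any continuous distribution of the scores."""
    n = len(train_scores)
    # Rank-based p-value: fraction of scores at least as extreme,
    # counting the new score itself.
    p = (1 + np.sum(train_scores >= new_score)) / (n + 1)
    return p <= alpha

# Works without knowing the distribution; exponential is just an example.
train = rng.exponential(size=999)
new = rng.exponential(size=2000)          # null data: same distribution
false_alarms = np.mean([anomaly_test(train, s) for s in new])
print(false_alarms)
```

Over many null trials the flag rate hovers near alpha, which is the false-alarm guarantee; detection rates then depend on how far anomalous scores sit from the training distribution.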



