
I was in the same boat in 2014. I went a more traditional route by getting a degree in statistics and doing as much machine learning as my professors could stand (they went from groaning about machine learning to downright giddy over those two years). I worked as a data scientist for an oil-and-gas firm, and now work as a machine learning engineer (same thing, basically) for a defense contractor.

I’ve seen some really bad machine learning work in my short career. Don’t listen to the people saying “ignore the theory”: the worst machine learning practitioners I’ve met say that, and they know enough deep learning to build a model but can’t get good results. I’m also unimpressed with Fast AI for the reasons some other people mentioned; it’s mostly a wrapper around PyTorch. But don’t read a theory book cover-to-cover before you write some code either; that won’t help. You won’t remember the bias-variance trade-off or Gini impurity or batch norm or skip connections by the time you go to use them. Learn the software and the theory in tandem. I like to read about a new technique, get as much understanding as I can from reading, then try it out.

If I had to do it all over again, I would:

1. Get a solid foundation in linear algebra. A lot of machine learning can be formulated as a series of matrix operations, and sometimes that’s the clearest way to think about it. I thought Coding the Matrix was pretty good, especially the first few chapters.

2. Read up on some basic optimization. Most algorithms are best understood as optimization problems: usually you want to minimize some loss function, which is simple on its own, but regularization terms make things tricky. It’s also helpful to learn why you would regularize in the first place (the first sketch after this list ties items 1 and 2 together).

3. Learn a little bit of probability. The further you go, the more helpful it becomes, especially once you want to run simulations or the like. Jaynes has a good book, but I wouldn’t call it elementary.

4. Learn the common statistical distributions: Gaussian, Poisson, exponential, and beta are the big ones I see a lot. You don’t have to memorize the formulas (I still look them up), but know when each one applies (the second sketch below shows the idea).
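
To make items 1 and 2 concrete, here’s a minimal sketch (all data synthetic, the regularization strength and learning rate picked arbitrarily for illustration): ridge regression written as matrix algebra and solved two ways, once in closed form and once by gradient descent.

    # Ridge regression as matrix algebra, solved two ways.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                  # 200 samples, 3 features
    true_w = np.array([1.5, -2.0, 0.5])
    y = X @ true_w + rng.normal(scale=0.1, size=200)

    lam = 0.1  # regularization strength (arbitrary, for illustration)

    # Closed form: w = (X^T X + lam*I)^-1 X^T y
    w_closed = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

    # Gradient descent on the same loss: ||Xw - y||^2 + lam*||w||^2
    w = np.zeros(3)
    lr = 0.001
    for _ in range(5000):
        grad = 2 * X.T @ (X @ w - y) + 2 * lam * w
        w -= lr * grad

    print(w_closed)  # both should land near [1.5, -2.0, 0.5]
    print(w)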
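
And a second small sketch for item 4, leaning on scipy.stats so you only have to pick the right family rather than remember the formulas (the parameters here are made up):

    # Matching distributions to data types, then recovering parameters by fitting.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    arrivals = rng.poisson(lam=4.0, size=1000)          # counts -> Poisson
    heights = rng.normal(loc=170, scale=8, size=1000)   # symmetric, continuous -> Gaussian
    waits = rng.exponential(scale=2.0, size=1000)       # time between events -> exponential
    rates = rng.beta(a=2.0, b=5.0, size=1000)           # values in [0, 1] -> beta

    print("Poisson rate ~", arrivals.mean())            # the MLE for Poisson is just the mean
    print("Gaussian fit ~", stats.norm.fit(heights))    # returns (loc, scale)
    print("Exponential fit ~", stats.expon.fit(waits, floc=0))
    print("Beta fit ~", stats.beta.fit(rates, floc=0, fscale=1))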

While you’re learning this, play with linear regression and its variants: polynomial, lasso, logistic, etc. For tabular data I always reach for the appropriate regression before anything more complicated. It’s straightforward and fast, you get to see what’s happening with the data (like which transformations you should perform or where you’re missing data), and it’s interpretable. It’s nice having preliminary results to show and discuss while everyone else is struggling to get not-awful results out of their neural networks.
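
Something like this is all it takes to get those preliminary results (synthetic data standing in for a real table; with real data most of the time goes into cleaning and transformations instead):

    # "Reach for regression first": a quick, interpretable baseline.
    import numpy as np
    from sklearn.linear_model import LinearRegression, Lasso
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=500)  # only 2 features matter

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for model in (LinearRegression(), Lasso(alpha=0.1)):
        model.fit(X_train, y_train)
        print(type(model).__name__, r2_score(y_test, model.predict(X_test)))
        # Interpretability for free: the lasso zeroes out irrelevant
        # coefficients, telling you which columns actually carry signal.
        print(np.round(model.coef_, 2))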

Then you can really get into the meat of machine learning. I’d start with tree-based models first. They’re more straightforward and forgiving than neural networks, and you can explore how the complexity of your models affects the predictions and start to get a feel for hyperparameter optimization. Start with basic trees, then get into random forests in scikit-learn, then explore gradient-boosted trees with XGBoost. You can get really good results with trees: in my group, we rarely see neural networks outperform models built with XGBoost on tabular data.
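
A rough sketch of that progression, assuming the xgboost package is installed (the hyperparameter values are just starting points to play with, not recommendations):

    # Tree progression: scikit-learn random forest, then XGBoost.
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    from xgboost import XGBRegressor

    X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # max_depth and n_estimators are the first knobs to turn: watch how
    # test error moves as the trees get deeper or more numerous.
    forest = RandomForestRegressor(n_estimators=200, max_depth=8, random_state=0)
    booster = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)

    for model in (forest, booster):
        model.fit(X_train, y_train)
        mse = mean_squared_error(y_test, model.predict(X_test))
        print(type(model).__name__, round(mse, 1))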

Most blog posts suck. Most papers are useless. I recommend Géron’s Hands-On Machine Learning.

Then I’d explore the wide world of neural networks. Start with Keras, which emphasizes model building in a friendly way, then move to PyTorch as you get comfortable debugging your models. Attack some object classification problems with and without pretrained backbones, then get into detection and NLP. Play with weight regularization, batch norm and group norm, different learning rates, etc. If you really want to get deep into things, learn some CUDA programming too.
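
Here’s a minimal Keras sketch wiring those knobs together; the input shape, layer sizes, and regularization strength are placeholders, not a recommendation:

    # Weight regularization, batch norm, and an explicit learning rate in Keras.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    model = keras.Sequential([
        keras.Input(shape=(32,)),
        layers.Dense(64, kernel_regularizer=regularizers.l2(1e-4)),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # learning rate: first thing to tune
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Random stand-in data, just to show the training call.
    X = np.random.normal(size=(256, 32)).astype("float32")
    y = np.random.randint(0, 10, size=256)
    model.fit(X, y, epochs=2, batch_size=32)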

I really like Chollet’s Deep Learning with Python.

After that, do what you want to do. Time series, graphical models, reinforcement learning: the field has exploded beyond simple image classification. Good luck!



This is the correct progression IMHO. I can tell you’ve been in industry, because it mirrors my experience.

Always start with a simple model and see how far you can get. Most of the improvements I’ve seen come from “working the data” anyway: you’d be surprised how much model performance improves just from raising the quality of the underlying data. Simple models also give you a baseline; what’s the point of reaching for neural networks if you don’t have a baseline performance metric to compare against? XGBoost is a godsend here. It trains extremely quickly and is surprisingly difficult to beat in practice.
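
For what it’s worth, scoring a do-nothing model first makes the baseline point concrete; something like this (synthetic data, for illustration) is enough:

    # Score a do-nothing model first so every later number means something.
    from sklearn.datasets import make_regression
    from sklearn.dummy import DummyRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    X, y = make_regression(n_samples=1000, n_features=5, noise=15.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for model in (DummyRegressor(strategy="mean"), LinearRegression()):
        model.fit(X_train, y_train)
        mae = mean_absolute_error(y_test, model.predict(X_test))
        print(type(model).__name__, round(mae, 1))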

As you say, constantly sharpen your saw with regard to probability theory and mathematics in general. There is simply no way around this in the long run.


/Thread

Excellent detailed advice! This is THE roadmap for ML study.

PS: While many of us may not have the time/resources for a graduate course, one can absolutely get the mandatory theoretical ideas from books/courses/videos/etc.


I’m impressed with your response, and thank you for the clarity you’ve provided through your examples. Thanks a lot.



