
I quite liked "How to win a data science competition" (https://www.coursera.org/learn/competitive-data-science) where I learned a lot about validation strategies and machine learning on tabular data. The course has its own Kaggle competition.

I also really liked "Discrete optimization" (https://www.coursera.org/learn/discrete-optimization). When I took it, it also had a competitive element: you would solve optimization problems, and a leaderboard compared all the students in the current batch. That was when courses still started in batches and were free, so the experience would probably no longer be the same, unfortunately.



> I quite liked "How to win a data science competition"

As a machine learning researcher I am on the one hand glad that folks are learning more about the topic. On the other hand, this is totally the wrong approach and it will teach you the wrong lessons.

The idea that you can just treat data as a uniform dump of tables and that grinding your way to high numbers is somehow worthwhile is simply terrible. The resulting systems won't work well in the real world and they produce horrific explanations of what is going on. This class doesn't just teach you the wrong tools, like boosting; it teaches you the wrong mental model.

I really can't think of a worse introduction to ML than this class. Even not knowing anything would actually be better.


Interesting. I definitely would not recommend this course as an only course in machine learning or indeed as an introduction, and I see where you’re coming from with the wrong mental models. I can’t be sure that I do have the right ones but I have taken a number of other courses as well and my sense is they’re ok.

My main takeaway from the course was definitely not that just grinding away for higher numbers is the right thing to do (though it might be a necessary evil in a competition context). The key thing I learned was to pay very close attention to whether your validation strategy and your testing strategy are compatible, because there are many ways to mess that up and end up with models that are valid in-sample only. Most of what I had done before was also centered on SVMs and neural networks, so getting some experience with decision-tree-based algorithms was interesting.
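
To make the validation point concrete, here is a minimal sketch of one common case, assuming the hidden test set covers a later time period than the training data. The toy rows and the function names `time_based_split` and `random_split` are made up for illustration and are not taken from the course.

```python
import random


def time_based_split(rows, time_key, valid_fraction=0.2):
    """Hold out the most recent rows, mimicking a test set drawn from the future."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    n_valid = round(len(ordered) * valid_fraction)
    cut = len(ordered) - n_valid
    return ordered[:cut], ordered[cut:]


def random_split(rows, valid_fraction=0.2, seed=0):
    """Shuffle before splitting, so validation rows are interleaved in time with training rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_fraction)
    cut = len(shuffled) - n_valid
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: one row per day.
rows = [{"day": d, "target": 2 * d} for d in range(100)]

# Compatible with a "future" test set: validation is strictly later than training.
train, valid = time_based_split(rows, "day")
latest_train_day = max(r["day"] for r in train)
earliest_valid_day = min(r["day"] for r in valid)

# Incompatible with a "future" test set: the model gets to see days after some validation rows.
rand_train, rand_valid = random_split(rows)
rand_latest_train_day = max(r["day"] for r in rand_train)
rand_earliest_valid_day = min(r["day"] for r in rand_valid)
```

With the random split, a model can score well on validation partly because it has seen days on both sides of each validation row, which it will never have for the real test set. That is one of the ways to end up valid in-sample only.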


ok - what is the alternative?



