
It'd be a lot more accurate, not to say more honest, to say the author _estimated_ the number of yurts in Mongolia using machine learning. ML models are statistical; their outputs are whatever the model scores as most probable given the inputs. The author barely gives a thought to all the ways the count could be wrong: no error analysis, no confidence intervals. There's a meaningless prediction score of 40%, and the author blithely adds "a hundred or so" to the count.

This is anti-information. People reading this uncritically will come away with completely wrong ideas about the number of yurts in Mongolia, about machine learning algorithms, about data science in general.
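
For concreteness, here is a minimal sketch of the kind of error analysis I mean, assuming the author had hand-checked a sample of tiles and tallied true positives, false positives, and missed yurts. The function names and every number below are hypothetical, not taken from the post; the idea is only to correct the raw detection count by precision and recall and to bootstrap a rough interval around it.

```python
import random


def corrected_count(raw_count, true_pos, false_pos, false_neg):
    """Adjust a raw detection count using precision and recall
    measured on a hand-labeled validation sample."""
    precision = true_pos / (true_pos + false_pos)  # share of detections that are real
    recall = true_pos / (true_pos + false_neg)     # share of real objects detected
    return raw_count * precision / recall


def bootstrap_interval(raw_count, true_pos, false_pos, false_neg,
                       n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the corrected count,
    resampling the validation outcomes with replacement."""
    rng = random.Random(seed)
    outcomes = ["tp"] * true_pos + ["fp"] * false_pos + ["fn"] * false_neg
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(outcomes, k=len(outcomes))
        tp = sample.count("tp")
        fp = sample.count("fp")
        fn = sample.count("fn")
        if tp == 0:
            continue  # precision and recall are undefined for this resample
        estimates.append(corrected_count(raw_count, tp, fp, fn))
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Hypothetical numbers, for illustration only.
raw = 10_000
point = corrected_count(raw, true_pos=90, false_pos=10, false_neg=30)
lower, upper = bootstrap_interval(raw, true_pos=90, false_pos=10, false_neg=30)
print(f"estimate: {point:.0f}, 95% interval: {lower:.0f} to {upper:.0f}")
```

This only captures sampling noise in the validation set; it says nothing about tiles that were never scanned or about labeling mistakes, which would need their own treatment.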



> People reading this uncritically will come away with completely wrong ideas about the number of yurts in Mongolia

Who is harmed by carrying around a mistaken number for this, especially if they notice the 40% confidence?

As to the rest, I read it as an application of tools for an interesting question, not a comprehensive or authoritative how-to. It’s scaled napkin math, and napkin math is very useful.


Assuming you selectively quoted me in good faith before asking "who is harmed", you should read the whole SEP entry on the ethics of manipulation, and then you should review the works it references.

https://plato.stanford.edu/entries/ethics-manipulation/

But to answer you directly:

- Whoever hires the author for their software engineering or data science expertise in part because of this blog post will pay for substandard work.

- By deceiving their audience as to the accuracy and precision of the demonstrated techniques, the author undermines the audience's ability to make good decisions about when to use or how to reason from the results of machine learning algorithms.

- The author disrespects their audience when they misrepresent themself, their work, and their results.



