The simplest spam filtering algorithm would be a naive Bayes filter. It's essent...

gus_massa · on July 27, 2020

I used SpamBayes a few years ago http://www.spambayes.org/ (Is the project dead now?) (It has a PSF licence https://en.wikipedia.org/wiki/Python_Software_Foundation_Lic... https://en.wikipedia.org/wiki/Comparison_of_free_and_open-so...)

The nice part is that SpamBayes gives you two numbers, the spam "probability" and the ham "probability". When one of them is very close to 1 (like > .99) and the other is very close to 0 (like < .01), there is a good chance that the message really is spam or ham. This classifies almost all the messages. But from time to time you get a message where the numbers are not so clear, or both are big, or both are small. That means the classifier is confused and you really must take a look at the message yourself.
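
A rough sketch of that rule in Python, assuming you already have the two scores as floats between 0 and 1. The function name `triage` and its threshold parameters are made up for illustration and are not the SpamBayes API; the .99 / .01 cutoffs are just the example values from above.

```python
def triage(spam_score: float, ham_score: float,
           high: float = 0.99, low: float = 0.01) -> str:
    """Return "spam", "ham", or "review" from two independent scores.

    Hypothetical helper: the thresholds are illustrative defaults,
    not values taken from any particular filter.
    """
    # Clear spam: spam score very high AND ham score very low.
    if spam_score > high and ham_score < low:
        return "spam"
    # Clear ham: the mirror image.
    if ham_score > high and spam_score < low:
        return "ham"
    # Anything else (both big, both small, or both middling) means the
    # classifier is confused, so a human should look at the message.
    return "review"


if __name__ == "__main__":
    for scores in [(0.995, 0.002), (0.001, 0.999), (0.9, 0.9), (0.1, 0.1)]:
        print(scores, triage(*scores))
```

The point of keeping two scores instead of one is the third outcome: a message only gets an automatic label when both numbers agree, and everything else lands in the review pile.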

audiometry · on July 28, 2020

Wow, when this came out (I think this was the ‘original’) it felt quite groundbreaking. Early 2000s, perhaps?

Then Google started doing that or something similar at scale, and it has effectively eliminated spam in my mailbox ever since. (With the curious recent exception of some highly similar bitcoin spam.)