
According to a conversation I had with Paul Buchheit (paul here at HN) in the past, Gmail's spam filters are heavily reputation-based, which leads me to believe that Bayesian filtering is not a first line of defense, and would likely be specific to each mailbox rather than universally applied. So if it's happening to everyone, then a universal rule at Gmail is doing the filtering, and I would guess that rule is not heavily based on Bayesian results. But I might be wrong: Gmail's spam filtering may have changed dramatically since paul worked on the project, or I may have simply misunderstood his explanation.

But pg did invent the most powerful recent spam fighting technique. I'm not sure why one would say that describing it in the essay isn't the same as inventing it. Certainly Bayesian analysis existed long before the essay, but I think it's safe to say that the inventor of the airplane was no less its inventor because the internal combustion engine existed before.



"But pg did invent the most powerful recent spam fighting technique."

While pg's essay may have been many people's first exposure to Bayesian spam filters, he didn't invent the technique.

The first time I heard of it was in a paper published in 1998 by Microsoft Research. This predates 'A Plan for Spam' by 4 years. http://research.microsoft.com/~horvitz/junkfilter.htm


I think I knew about that and completely forgot about it.

I just don't pay as much attention to the spam problem as I did a few years ago; it seems to be reasonably solved for me. I got over 600 spam messages to my primary address yesterday, and only 3 made it to my mailbox... and all I do is run SpamAssassin with almost entirely default settings and auto-white/black listing (which is Bayesian). Now, the primary thing that motivates me is making the system more efficient with regard to resource usage, rather than more effective (though being more effective is also good). Some of our customers still seem to have problems, but I've not really figured out why SpamAssassin works so poorly for them and so well for me.


Maybe Gmail thinks it's one of these:

http://paulgraham.com/firstwatergatesltd.html


Good point wrt inventing. Wanted to be on the safe side, but that makes complete sense!



