Hacker News

contem on Sept 18, 2018 | on: Combating blog article theft by delaying RSS feeds

To me this all looks like the work of the same spammer(s). In that case, do some manual intelligence work and wrap it up. It won't scale to all forms of spam, but if a simple regex can uncover 250k+ results in 10 minutes, a manual spam fighter can still block millions of pages a day (and warn the web host, remove these flaky ads from their networks, etc.). No doubt the recent machine learning hype has given spammers more advanced tools to avoid detection.
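As a rough illustration of the regex-then-manual-review idea in this comment, here is a minimal Python sketch. The pattern, the `flag_for_review` helper, and the sample pages are all hypothetical; the thread does not give the actual regex or data, so treat this as one possible shape rather than what the commenter ran.

```python
import re

# Hypothetical pattern: the thread does not say which regex was used.
SPAM_PATTERN = re.compile(r"\b(?:cheap|free)\s+(?:pills|followers)\b", re.IGNORECASE)


def flag_for_review(pages):
    """Return the URLs whose text matches the pattern.

    `pages` maps URL -> page text. Matches are only candidates for a
    human to check, not pages to block automatically.
    """
    return [url for url, text in pages.items() if SPAM_PATTERN.search(text)]


# Made-up sample data for illustration only.
sample_pages = {
    "http://example.com/a": "Buy CHEAP pills today",
    "http://example.com/b": "Notes on delaying RSS feeds",
}

for url in flag_for_review(sample_pages):
    print("needs manual review:", url)
```

The helper only produces a review queue. Keeping a person in the loop before anything is blocked or reported is what makes a crude pattern tolerable, since a regex alone cannot tell a scraper page from a legitimate page that happens to use the same words.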

scrollaway on Sept 18, 2018

False positives are far more problematic than false negatives...

contem on Sept 18, 2018

If you remove from index... sure. But for that URL that I posted, do you think there is even a single false positive in there?
