
Rather than blocking on User-Agent, just add some honeypots, such as an invisible link. Any bot that requests that page gets blocked, since scrapers tend to extract every link on a page and follow them all.

Use robots.txt to disallow specific pages. Bots ignore robots.txt 99% of the time, so if one pulls a disallowed page: block.
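
A rough sketch of the two trap ideas above, assuming a single in-memory block list keyed by IP. The trap path, `TRAP_PATHS`, `blocked_ips`, and `handle_request` are all made-up names for illustration, not anything from the post:

```python
# Hypothetical trap URL. It is disallowed in robots.txt and linked only
# through an invisible anchor, so well-behaved visitors never request it.
TRAP_PATHS = {"/trap/do-not-follow"}

ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"

# Hidden from people, but still present in the HTML that scrapers parse.
HONEYPOT_LINK = '<a href="/trap/do-not-follow" style="display:none" rel="nofollow"></a>'

# In-memory block list (assumption: a real setup would persist or expire entries).
blocked_ips: set[str] = set()


def handle_request(ip: str, path: str) -> int:
    """Return the HTTP status to send for a request from `ip` to `path`."""
    if ip in blocked_ips:
        return 403
    if path in TRAP_PATHS:
        # Only a client that ignored robots.txt or followed the hidden link lands here.
        blocked_ips.add(ip)
        return 403
    return 200
```

Blocking by IP alone is the simplest choice here; shared or rotating addresses would probably need a softer response than a permanent ban.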

Check how quickly pages are pulled. If the rate passes a threshold: block.
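
The rate check could look something like this sliding-window counter. It is only a sketch: `RateWatcher`, its default limits, and the per-IP keying are assumptions for illustration, and the right threshold depends on the site:

```python
import time
from collections import defaultdict, deque


class RateWatcher:
    """Block an IP once it makes more than `max_requests` in `window_seconds`."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()

    def should_block(self, ip, now=None):
        """Record one request from `ip` and report whether it should be blocked."""
        now = time.monotonic() if now is None else now
        if ip in self.blocked:
            return True
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_requests:
            self.blocked.add(ip)
            return True
        return False
```

Call `should_block(ip)` once per page request; the optional `now` argument exists only so the timing can be controlled in tests.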




Yes, using honeypots is one of the ways to identify bots. But that wasn't the focus of the post. I'll add some clarification.


I've seen bot traffic claiming to be recent versions of Firefox, coming from residential IPs in Ukraine, pulling robots.txt. A real browser has no reason to fetch that file on its own, so sometimes this is one of the few clues to go on.


I'm pretty sure the point of the article was performance testing of C#, not best practices for banning bots...



