
There is a simple solution, gentlemen:

robots.txt should make it possible to exclude all AI crawlers. AI crawlers should be required to include "AI" in their User-Agent header, and to respect a robots.txt that says they can't crawl the site.

Right now we need to do this, which shuts out every crawler and not just the AI ones:

User-agent: *
Disallow: /
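To see what that blanket rule does in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The `allowed` helper, the `TARGETED` rules, and the example URL are illustrative assumptions, not something from the thread. The tokens `GPTBot` and `ClaudeBot` are assumed to be the names those crawlers announce; check each vendor's documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str,
            url: str = "https://example.com/page") -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The blanket rule from the comment: every crawler is excluded.
BLANKET = """\
User-agent: *
Disallow: /
"""

# Assumed alternative: name the AI crawlers individually, allow everyone else.
TARGETED = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
        print(agent, allowed(BLANKET, agent), allowed(TARGETED, agent))
```

The blanket file also turns away ordinary search crawlers, which is the cost being complained about. The targeted file only works for crawlers whose tokens you already know, which is why a shared "AI" marker in the user agent would help.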



Nice. How do we force them to respect robots.txt?


They do respect robots.txt, at least the major ones like Meta, Claude, Google, and OpenAI. Based on what I see on my own infrastructure, robots.txt is enough in about 90% of cases. The other 10% I handle by banning IP ranges for a couple of days, but those are not AI companies.
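A temporary IP-range ban like the one described could be sketched as follows. This is only an illustration with Python's standard `ipaddress` module: the `TemporaryBanList` class, the two-day default, and the documentation-only address ranges are all assumptions, and a real setup would more likely live in a firewall or reverse proxy.

```python
import ipaddress
import time


class TemporaryBanList:
    """Hypothetical in-memory list of banned CIDR ranges with expiry times."""

    def __init__(self):
        self._bans = {}  # network -> expiry time (seconds since epoch)

    def ban(self, cidr, days=2.0, now=None):
        """Ban a CIDR range for the given number of days."""
        now = time.time() if now is None else now
        self._bans[ipaddress.ip_network(cidr)] = now + days * 86400

    def is_banned(self, ip, now=None):
        """Return True if ip falls inside a range whose ban has not expired."""
        now = time.time() if now is None else now
        # Drop bans that have expired before checking membership.
        self._bans = {net: exp for net, exp in self._bans.items() if exp > now}
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self._bans)


if __name__ == "__main__":
    bans = TemporaryBanList()
    bans.ban("203.0.113.0/24")  # range reserved for documentation
    print(bans.is_banned("203.0.113.7"))
    print(bans.is_banned("198.51.100.1"))
```

Letting the ban expire on its own keeps a reassigned address range from staying blocked indefinitely.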



