
> Absolutely nothing has to obey robots.txt

And absolutely no one needs to reply to every random request from an unknown source.

robots.txt is the POLITE way of telling crawlers, and other automated systems, to get lost. And as is so often the case, there is a much less polite way to do that: block them outright.

So, the way I see it, crawlers and other automated systems have two options: they can honor the polite way of doing things, or they can get their packets dropped at the firewall.
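To make the polite path concrete, here is a minimal sketch of a well-behaved crawler checking robots.txt before fetching, using Python's standard urllib.robotparser. The site URL and the "ExampleBot" user agent are made-up placeholders, not anything from the thread:

    # Check robots.txt before fetching -- the "polite" option.
    # "ExampleBot" and example.com are placeholders for illustration.
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt

    url = "https://example.com/some/page"
    if rp.can_fetch("ExampleBot", url):
        print("allowed to crawl:", url)
    else:
        print("disallowed; back off politely:", url)

A crawler that skips this check is exactly the kind that ends up on the receiving end of the less polite option.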



