
I manage a few thousand client sites spread across a bunch of different infrastructures, but one thing they all have in common is a set of robots.txt rules blocking AI crawlers.
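For the curious, the rules are just the standard per-agent blocks. A minimal sketch (the UA tokens below are ones these vendors document themselves; a real list needs regular updating):

    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: anthropic-ai
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    User-agent: CCBot
    Disallow: /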

Not a single one of them respects that request anymore. Anthropic is the worst at the moment: it starts hitting the site with an Anthropic user agent, you can literally see it fetch the robots.txt file, and then it switches to a generic browser user agent and carries on.
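You can spot the switch in the access logs. A rough sketch, assuming a standard combined-format log at access.log; the UA regex and output are illustrative, not anything the vendors publish:

    import re
    from collections import defaultdict

    # Matches the common "combined" access log format:
    # IP - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
    LINE = re.compile(
        r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
        r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
        r'\d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
    )

    AI_UA = re.compile(r'anthropic|claudebot|gptbot|ccbot', re.I)

    saw_ai_robots = set()        # IPs that fetched robots.txt with an AI UA
    flagged = defaultdict(list)  # those same IPs later using a different UA

    with open('access.log') as f:
        for line in f:
            m = LINE.match(line)
            if not m:
                continue
            ip, path, ua = m['ip'], m['path'], m['ua']
            if path == '/robots.txt' and AI_UA.search(ua):
                saw_ai_robots.add(ip)
            elif ip in saw_ai_robots and not AI_UA.search(ua):
                flagged[ip].append((path, ua))

    for ip, hits in flagged.items():
        print(ip, len(hits), 'requests after reading robots.txt, e.g.', hits[0])

Caveats apply: shared egress IPs will produce false positives, and the serious offenders rotate IPs too, so this only catches the lazy case.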

I say "at the moment" because they're all pulling this kind of crap; they seem to take turns ramping up and hammering servers.



> Not a single one of them respects that request anymore

AI companies have seen the prisoner's dilemma of good internet citizenship and slammed hard on the "defect" button. They plan to steal your content, put it in an attribution-removing blender, then sell it to other people, with the explicit purpose of replacing human contribution on the internet and in the workplace. After all, they represent a reality-distorting amount of capital, so why should any rules apply to them?


> I say "at the moment" because they're all pulling this kind of crap...

Maybe they used AI to code their agent, and it's just not that good.



