
They seem to have had this problem in the past and decided to skirt around it by ignoring robots.txt a year ago[0]. Does anybody know what happened to revert this decision?

[0]https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea...



As per that post, they only ignored robots.txt for .gov and .mil sites.


An IA-specific disallow in robots.txt will still block archive.org; the blog post was about ignoring the parts that were meant for search engines.
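
To make the distinction concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, and the `can_fetch` helper are made up for illustration, and `ia_archiver` is assumed to be the user-agent token the Wayback Machine looks for. It only shows how per-agent rules are matched, not how archive.org actually implements its policy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one block aimed at the Internet Archive
# (assumed token: ia_archiver), one generic block for everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_fetch(agent: str, url: str, robots: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The archive-specific block disallows everything for ia_archiver.
    print(can_fetch("ia_archiver", "https://example.com/page"))
    # Other crawlers fall through to the generic "*" block.
    print(can_fetch("Googlebot", "https://example.com/page"))
    print(can_fetch("Googlebot", "https://example.com/private/x"))
```

In this sketch the `ia_archiver` block is the kind of rule that would still be honored, while the generic `*` block is the kind written with search engines in mind.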


Yes, but it also says

> We are now looking to do this more broadly.

That's the part I'm asking about.


Right, but that doesn't mean they reverted it; they are probably still looking into it.



