Meh. I'm pretty ambivalent about voluntary restrictions like robots.txt. As far as I'm concerned, it's mostly useful as a way for site operators to flag endless dynamic content, or requests that are prohibitively expensive to serve (but not so expensive that they actually restrict access to them).

I figure if it's on the web and a human can read it, my computer ought to be able to read it too.
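By "flag" I mean something like the sketch below (the paths are made up, obviously); it's just a hint to crawlers about which corners of the site aren't worth their time:

  User-agent: *
  # auto-generated calendar pages go on forever; nothing to index there
  Disallow: /calendar/
  # search results are expensive to render and effectively infinite
  Disallow: /search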




Yes, but should your computer also be allowed to disseminate that content without the original author's permission?


A robots.txt file is not any sort of license (anti-license?). Its existence has no bearing on the question of whether the IA should be allowed to do what it does. It is only intended to provide helpful information to web crawlers.

* http://www.robotstxt.org/norobots-rfc.txt


Well yes, the whole point of robots.txt is that it's impolite to refuse to follow it, and doing so and getting caught might get you banned from the site.


That's a really good point. I'm not a fan of Internet Archive ignoring robots.txt, but if I'm really unhappy about it I can block their robot in Apache using .htaccess rules (as long as they continue using archive.org_bot as their user agent).
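For what it's worth, a rough sketch of what that rule could look like (assuming Apache 2.4 with mod_setenvif enabled, and that they keep sending that user agent):

  # Tag any request whose User-Agent contains "archive.org_bot"
  SetEnvIfNoCase User-Agent "archive\.org_bot" block_ia_bot
  # Allow everyone else, deny tagged requests
  <RequireAll>
      Require all granted
      Require not env block_ia_bot
  </RequireAll>

Of course, that only works for as long as they announce themselves honestly.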


How is this any different from a human doing the same? The internet is meant to be open; it's free information, after all. If you don't like it, put your stuff behind a login.


I see it as a subtle difference of automation and scale, plus the fact that the Internet Archive is not just saving these copies, but also making them available.

Imagine standing on the public road and taking a picture of your neighbor's home (or face) for your own use. Is that the same as a large company taking pictures of all homes (or faces) of the world, and making them available to the entire world, forever?


Tons of companies take satellite views of the entire earth and they're freely available online.



