> (Because the Wayback Machine starts from scratch each time?) I don't know what...

pbhjpbhj · on April 25, 2018

I'd imagine they're in a dodgy copyright situation and so guard against it by being conservative wrt robots.txt.

The robots.txt is a positive assertion that parts of a site should be excluded from use by automated systems.

In most cases I imagine WBM does not have the owner's permission to keep a duplicate of the site; it's certainly tortious in UK law.

Sites that don't change their robots.txt are probably highly correlated with sites that don't sue for the infringement.