> This is almost weirder. If LinkedIn wanted to force users to sign in to view profile info, would they be not allowed to do that because some company had signed a contract that implicitly assumed access to that data? If someone writes a web scraper for my site, and I unknowingly change my site in a way that breaks that scraper, can a court force me to revert the change?
LinkedIn has long wanted to have their cake and eat it too - they advertised that data as being publicly accessible and allow google to index specific user pages but then attempt to restrict other bots from crawling it.
If you have private data behind a login there isn't an issue here - if you have public data but want some people to login before viewing it (or not be able to view it) then that's where this ruling comes up. So, this mostly hits sneaky SEO folks and dark UX patterns that rely on tempting someone with accessible data and then pulling the rug out from them at the last minute.
If your website places data outside of authentication then everyone should be able to see that data... I'm curious to see the specifics around
> Surely I'm not reading this correctly. This would seem to suggest that websites are not legally allowed to prevent bots from crawling their sites. Lots of sites have ToS preventing such things, are those legally void now? Are captchas on public pages illegal, even if you request the page 8000 times in a second?
though - DoS attacks are clearly illegal, but with this precedent there's going to be a lot of back and forth to see where the line between DoS and scraping falls... and I think that makes this precedent a lot weaker than the headline would have you believe. A company can still threaten to drag you through a lot of litigation by accusing you of malicious page requests, it'll take a few cases to define where that line needs to fall.
This reminds me about Twitter, when I click to see a thread for a tweet it asks me to login, but if I open the link in a new tab it loads the thread just fine.
LinkedIn has long wanted to have their cake and eat it too - they advertised that data as being publicly accessible and allow google to index specific user pages but then attempt to restrict other bots from crawling it.
If you have private data behind a login there isn't an issue here - if you have public data but want some people to login before viewing it (or not be able to view it) then that's where this ruling comes up. So, this mostly hits sneaky SEO folks and dark UX patterns that rely on tempting someone with accessible data and then pulling the rug out from them at the last minute.
If your website places data outside of authentication then everyone should be able to see that data... I'm curious to see the specifics around
> Surely I'm not reading this correctly. This would seem to suggest that websites are not legally allowed to prevent bots from crawling their sites. Lots of sites have ToS preventing such things, are those legally void now? Are captchas on public pages illegal, even if you request the page 8000 times in a second?
though - DoS attacks are clearly illegal, but with this precedent there's going to be a lot of back and forth to see where the line between DoS and scraping falls... and I think that makes this precedent a lot weaker than the headline would have you believe. A company can still threaten to drag you through a lot of litigation by accusing you of malicious page requests, it'll take a few cases to define where that line needs to fall.