i just don't get why people use web scraping as a battleground for moral ethics
its bizarre just like equating copyright infringement to theft of property.
where does this moral high ground come from? nobody scraping is thinking "oh im so evil im scraping without respecting robot.txt and using residential ip addresses to bypass detection"
Google does it nobody has a problem but when the little guy does it suddenly they are an outlaw.
Historically, when Google did it, they did it to create an index, which a lot of people found useful as a way to find information they were looking for. This used to mean people would come and visit your website, where they could engage with the website creator directly through a variety of different means.
Google doing it now to digest all the content and mulch it all together to return a regurgitated form of it is a very different proposition, and that is what people are annoyed about when "the little guys" (funny name for startups with multiple billions of dollars of raised capital) are doing the same thing.
For many it's not about "moral ethics", it's about actual survival. If nobody is visiting their website, nobody is buying their products or engaging with their community or whatever.
If you're scraping content for no other purpose than to mechanistically reword it for commercial purposes, then it's not really surprising that people have issues with it.
You're taking someone else's labor and profiting off it, without any credit or compensation. To add insult to injury, the person you're scraping pays money to support your traffic. It's a one sided transaction.
You can’t generalise that. Maybe I crawl to provide an annotated preview of their website, to make users of my application more likely to click the link and visit it? There are lots of ways in which crawling benefits everyone, it just requires some mutual respect.
its bizarre just like equating copyright infringement to theft of property.
where does this moral high ground come from? nobody scraping is thinking "oh im so evil im scraping without respecting robot.txt and using residential ip addresses to bypass detection"
Google does it nobody has a problem but when the little guy does it suddenly they are an outlaw.