This has been happening already. The market is trying really hard to price out w...

altfredd · on July 25, 2023

> The market is trying really hard to price out web scraping... scraping is becoming non-existent in user-space apps

Uhh... Those two matters are pretty much unrelated to each other. Scraping is becoming non-existent because the era of static web pages has ended. No need to "scrape" when you have a nice, performant JSON REST API provided for you.

flagrant_taco · on July 25, 2023

SSG vs SSR really has nothing to do with whether an API exists to provide the data you would otherwise need to scrape.

When was the last time you saw a site with a JSON API providing metadata, like the JSON-LD for a product on an e-commerce site? Or an API just for the Open Graph data? How would you even discover these APIs for sites that you don't own?
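To make that concrete, here is a rough sketch of what getting at that metadata tends to look like in practice: parsing it out of the page's HTML. It uses only Python's standard library. The names `MetadataParser`, `extract_metadata`, and `SAMPLE_HTML` are made up for illustration, and the sample page is hypothetical, not taken from any real site.

```python
import json
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects JSON-LD blocks and Open Graph <meta> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.json_ld = []        # parsed JSON-LD objects, in document order
        self.open_graph = {}     # e.g. {"og:title": "..."}
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            # Start buffering the script body; it arrives via handle_data.
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta":
            prop = attrs.get("property") or ""
            if prop.startswith("og:") and attrs.get("content") is not None:
                self.open_graph[prop] = attrs["content"]

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_metadata(html):
    """Return (list of JSON-LD objects, dict of Open Graph properties)."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return parser.json_ld, parser.open_graph


# Hypothetical product page, for illustration only.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example Widget">
<meta property="og:type" content="product">
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

if __name__ == "__main__":
    json_ld, open_graph = extract_metadata(SAMPLE_HTML)
    print(open_graph)
    print(json_ld)
```

Nothing here depends on the site offering an API. The data is only reachable because it is embedded in the markup, which is the point.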

It's also worth noting that very, very few JSON APIs today are actually REST. They rarely include all the context needed, and in general JSON is much less useful than XML when you're talking to APIs that you don't own, since JSON can't easily describe the shape and datatypes of the content.

wraptile · on July 26, 2023

> No need to "scrape" when you have a nice, performant JSON REST API provided for you.

There are no performant JSON REST APIs provided these days though. The days of public APIs are long gone.

altfredd · on July 26, 2023

HTML "APIs" weren't meant for the public either.

In practice, if there is a mobile app, there is an API. Whether its creators object to your usage is mostly their own problem.

CalRobert · on July 25, 2023

But of course search engines are fine

wraptile · on July 25, 2023

Having your cake and eating it too is a natural goal of every business, and honestly it was just a matter of time till websites figured out they can have the benefits of public data and avoid the costs. Web scraping and botting are basically a solved problem too: just put a login gate in front of the data, which gives you legal grounds to go after scrapers and bots. Done. However, nobody wants to lose the benefits of public data, so here we are.

CalRobert · on July 25, 2023

I used to care about respecting robots.txt until it was clear that established search engines are fine but any newcomers can go right to hell.
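A minimal sketch of the asymmetry, using Python's standard `urllib.robotparser`. The robots.txt below is a hypothetical example, not copied from a real site, and `NewSearchBot` and the `allowed` helper are made-up names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one established crawler is welcome, everyone else is not.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/products/1"
    for agent in ("Googlebot", "NewSearchBot"):
        print(agent, allowed(agent, url))
```

With rules like these, a crawler that honors robots.txt and is not on the allowlist never gets to fetch a single page.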