
Probably the people who operate archive.is just purchased subscriptions for the most common newspaper sites. And then they can use something like https://pptr.dev/ to automate login and article retrieval.

I guess the business model is to inject their ads into someone else's content, so kinda like Facebook. That would also surely generate more money from the ads than the cost of subscribing to multiple newspapers.



> Probably the people who operate archive.is just purchased subscriptions for the most common newspaper sites. And then they can use something like https://pptr.dev/ to automate login and article retrieval.

I would expect to see login information rather than "Sign In" and "Subscribe" buttons on archived articles then. Unless they're stripping that from the archive?


Exactly. It also would not be difficult for website operators to embed hidden user info in their served pages, thereby finding out the archive.is account. This approach seems risky for archive.is.
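To make the "hidden user info" idea concrete, here is a minimal sketch of how a publisher might do it, not a description of what any newspaper actually does. The secret key, the `data-rid` attribute, and the function names are all hypothetical, chosen only for illustration.

```python
import hashlib
import hmac
from typing import Iterable, Optional

# Hypothetical server-side secret; a real site would keep this out of source code.
SECRET_KEY = b"publisher-secret"


def watermark_for(account_id: str) -> str:
    """Derive a short opaque token from the account id using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, account_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def render_article(body_html: str, account_id: str) -> str:
    """Wrap the article body in a tag carrying the per-account token."""
    token = watermark_for(account_id)
    return f'<article data-rid="{token}">{body_html}</article>'


def identify_account(leaked_html: str, known_accounts: Iterable[str]) -> Optional[str]:
    """Return the first known account whose token appears in the leaked page."""
    for account in known_accounts:
        if watermark_for(account) in leaked_html:
            return account
    return None
```

The token only survives if the archived copy keeps the markup that carries it, so a scheme like this would need to hide the token somewhere an archiver is unlikely to strip.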


They could just copy over the div with the content to evade detection by the website's owner.
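As a rough sketch of what "copying the div" could look like, the snippet below keeps only the inner HTML of one div and drops everything around it, such as account widgets in the page header. Nothing here is known to be how archive.is works; the `content` id and the `extract_div` helper are assumptions made for the example, and it uses only Python's standard-library `html.parser`.

```python
from html.parser import HTMLParser


class DivExtractor(HTMLParser):
    """Collect the inner HTML of the first <div> with a given id."""

    def __init__(self, target_id: str) -> None:
        # Keep character references as written so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.target_id = target_id
        self.depth = 0  # div nesting depth inside the target; 0 means outside
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "div":
                self.depth += 1
            self.parts.append(self.get_starttag_text())
        elif tag == "div" and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                return  # closing tag of the target div itself
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth:
            self.parts.append(f"&#{name};")


def extract_div(page_html: str, target_id: str) -> str:
    """Return the inner HTML of the div with id `target_id`, or '' if absent."""
    parser = DivExtractor(target_id)
    parser.feed(page_html)
    parser.close()
    return "".join(parser.parts)
```

Anything identifying that sits inside the copied div (for example, a token on a nested element) would still come along, so this only removes markers placed outside the content.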


> Probably the people who operate archive.is just purchased subscriptions for the most common newspaper sites. And then they can use something like https://pptr.dev/ to automate login and article retrieval.

I wouldn't be surprised. IIRC, the whole thing is privately funded by one individual, who must have a lot of money to spare.


I don't think anyone knows who runs archive.is. I've tried looking into it a couple of times in the past, but there is surprisingly little information to be found. It must cost thousands if not tens of thousands a month to host all that data, and AFAIK they do not monetize it in any way. From what I gather, it is probably some Russian person, as there were some old Stack Overflow conversations regarding the site that led to an empty GitHub account with a Russian name. Also, back in 2015 the site owner blocked all Finnish IP addresses due to "an incident at the border"[1]. Finnish IPs have since been unblocked. It appears the site owner somehow thought he could end up on an EU-wide blacklist, which seemed like very conspiratorial thinking on his part.

1: https://archive.is/Pum1p


When I visit, each page has three ads: left, right, and bottom. Maybe you have an ad blocker?


Would it be possible to check if archive.is is logged into a newspaper site by archiving one of the user management pages?


Negative. I used to assume this as well, but they somehow also bypass local paywalls, which has gotten me temporarily banned from r/Baltimore lol.

They can somehow even bypass the Baltimore Sun's paywall, and I doubt they have subscriptions to every regional paper. Could they?


Wait, you got banned from /r/Baltimore for posting archive.is links there? That's against the rules there? I would not have known that myself! (Also a Baltimorean).


I even tried to convince them that Paul Graham created Hacker News to get more mindshare for YC. He gave the idea of Reddit to the 3 brilliant Ivy League founders who applied to YC with, I think, a basic Gmail extension that copied emails or something.

So I tried convincing them that if it’s okay here on PG’s creation, then it should be okay on his other creation.


Yeah! Hahah

I thought knowledge was free, and the Baltimore Sun sucked anyway. They charge money and don't even write good stuff anymore. They laid off a bunch of people and moved printing to Delaware. My bet is the next step is that they announce they are shutting down all Locust Point operations and selling out so that Kevin Plank can build some new buildings there.

I think I had to appeal my ban with a mod, and they mentioned how it's posted all over by the auto bot that sharing links to websites that bypass paywalls is against their subreddit rules :(

I even tried an official proposal to r/Baltimore to reconsider and lift that rule. The general consensus on the poll was that people felt that the Baltimore Sun and the writers should be getting paid for their work, and I shouldn't be bypassing their paywalls lol.


You did it ONCE and got banned?

I still can't find anything in the subreddit rules that clearly says this. (Not that most people read the rules first). Why don't they just add it to the rules?

This is one of the things I dislike most about reddit: it seems to be common to ban people for a single violation of a poorly documented or unstated rule.

My main problem with reading the Sun online is it has so much adware that my browser slows to a crawl and sometimes crashes when I try to read it!


But is it true? What evidence is there?

This is a plausible explanation but is it true?


So Sci-Hub, but for newspapers.



