The problem presented by services like ReCaptcha and Cloudflare is a tough nut t...

hombre_fatal · on Jan 21, 2020

Yeah, you can't really talk about downsides of Recaptcha/Cloudflare without also acknowledging the extreme amount of malicious actors and abuse on the internet.

We're in a "this is why we can't have nice things" predicament and you have malicious actors to thank for that, yet most people on HN only seem capable of attacking the few affordable solutions to that problem.

I'm even down with the theory that Cloudflare is a US government outfit, that's the only way I can wrap my head around such a generous free tier. But at what point does it worry you that the internet has so many fundamental issues that people willingly centralize behind such a large behemoth? How many options do I have when a kid is holding my forum hostage with a $5 booter service?

It's easy to shit on everything. Let's hear some real solutions.

danShumway · on Jan 21, 2020

> Let's hear some real solutions.

It's by no means a full solution (there likely is no single full solution), and it may even be a bad solution -- but lately I've been trying to think about what the Internet would look like if we didn't have a massive arbitrage potential around server requests.

Part of the reason why everyone is trying to detect bots is because bots will very, very rapidly eat up your bandwidth and CPU time. We're used to offering our bandwidth/CPU for free to humans and either swallowing the cost if we're running a free service, or making up the cost in an adjacent way (ads, subscriptions, etc...). It's not bots that are the problem. It's that when someone asks our servers to do something, we do it for free. Bots are just a big category we can ban to make that problem smaller.

In many (but not all) cases, we shouldn't care about bots, and the only reason we do is because our systems aren't scalable to that level.

So I've been wondering lately what a server-defined per-pageload, or even per-request fee would look like on the Internet, maybe one that scaled as traffic got heavier or lighter and that was backed by a payment system that wasn't a complete rubbish privacy-disrespecting dumpster fire.

My immediate thought is, "well, everything would be expensive and inaccessible." But, the costs don't change. You still have to pay server costs today. Businesses today still need to make that money somehow. There are almost certainly downsides (all our current payment systems are horrible), but I wonder if it's more or less efficient overall to just be upfront about costs.

Imagine if I could put up a blog on a cloud service anywhere with scalable infrastructure. Then a post goes temporarily viral. Imagine if my server could detect it was under heavy load, detect that it was getting hit by bad actors, automatically increase the prices of requests by a fraction of a cent to compensate, and then automatically ask my provider to scale up my resources without costing me any extra money?

For a static site, suddenly I don't need to care if people or bots are hammering it, I don't need to care about anything except whether each visitor/bot is paying for the tiny amount of hosting costs they're hoisting on me. If bad actors start pushing traffic my way, I don't need to ban them. I just force them to pay for themselves.

tjoff · on Jan 22, 2020

> Part of the reason why everyone is trying to detect bots is because bots will very, very rapidly eat up your bandwidth and CPU time.

It is?

Thought bot detection was only done during registration etc. to stop them from sending spam etc. to real users.

If anything the javascript world we live in helps combat this. You need insane resources on the client just to have a page open. Several orders of magnitude more than the server need to generate and send that page.

AstralStorm · on Jan 22, 2020

In that case, an IP or IP block throttling is good enough.

Except then there are those pesky CGNATs to handle including Chinese Great Wall.

Anyway, high profile spammers will emulate enough of the browser to render any measure based on browser anomaly detection worthless. Including using a headless browser. The only way to defeat them would be too put some quite computationally intensive JS operation... (On par with mining, ruining all the laptops, phones and tablets. But you can make it not trigger every time.) This would make spamming expensive.

Server-side we have excellent AI spam filters that nobody seems to be using to fire off a captcha check later. The big problem here is that you cannot offload to some provider without inviting big privacy concerns. (Same problem as forum/chat/discussion platform providers.)

massaman_yams · on Jan 22, 2020

No. Botnets are large and broadly distributed enough to render protection methods based only on the IP or IP block ineffective. They're commonly used for mailbombing attacks such as those described here: https://www.wired.com/story/how-journalists-fought-back-agai...

Do you think a botnet with 10k machines is going to be meaningfully inhibited by making each machine's cpu run calculations for a second or two for each submission?

I'm sure reCAPTCHA looks at the IP and IP block as one of the inputs to its ML algorithm, but as one or two of perhaps a dozen different features - including mouse movement and/or keyboard input, which is quite a bit harder to fake.

thu2111 · on Jan 22, 2020

high profile spammers will emulate enough of the browser to render any measure based on browser anomaly detection worthless

Based on actual experience of fighting spammers, that isn't the case. Like a lot of people new to spam fighting you're making assumptions about the adversaries that aren't valid.

bradknowles · on Jan 23, 2020

There are many different types of spammers and attackers.

Some will be stopped by the simplest protection mechanisms.

Some will be indistinguishable from real humans, and you won’t be able to stop them without crippling your services for your real users.

But those are the two extremes. The real problem is the ones between those extremes.

Every intentional stumbling block you put in the path to try and stop those in the middle might also have a negative impact on your real users. The real problem is that the most troublesome attackers will learn and adapt to whatever stumbling blocks you put in the path. So, how many of your own toes are you willing to sacrifice with your foot guns in the name of stopping the attackers?

thu2111 · on Jan 23, 2020

Very few, but that's OK. Good spam fighters don't have to sacrifice many or really any toes to stop nearly all spam. You seem to be assuming a linear relationship between effort and false positives, but that would be a very ineffective spam fighting team relative to the ones I've worked on. In practice you can have nearly no false positives combined with nearly no false negatives.

This isn't easy and many firms fail at it, but you it can be done and we routinely did it.

tjoff · on Jan 22, 2020

IP block for registration?

Seems highly unrealistic.

scottmotte · on Jan 21, 2020

> automatically increase the prices of requests by a fraction of a cent to compensate

Great concept.

CPU, bandwidth, electricity, it's all just energy. And to a significant degree, money is just energy stored. I generate energy with my own work, store it in the form of money, and then transfer that energy to someone else, maybe to heat my home or cook me a meal.

Before money, I had to barter for those things. Maybe conceptually the internet is in a similar state at the moment. It doesn't have 'money'. Why can't I put CPUs in my wallet and then spend them? And why can't I charge visitors to my site by the CPUs they are costing me?

Instead, I have to, in a way, barter. For example, maybe I use ad revenue to earn my income, so I generate all this content, I barter that to the search engines, which barter with the advertisers, which barter with me, and I barter back to security guards to protect me from 'bad' actor bots. I'd really just like to receive CPU and bandwidth payments from them.

perl4ever · on Jan 22, 2020

Isn't the reason we are freed from barter in daily life is because the government is intimately involved in the financial/banking system, and regulates it and issues money and so on? Maybe we continue to struggle with the internet because it started out unregulated and has never really transcended that because people insist on thinking freedom is best for commerce without appreciating the nuances.

rashkov · on Jan 22, 2020

There are alternatives to that. For all of the hype and vaporware of the cryptocurrency movement, the idea of digital-native programmable internet money is a powerful one. I’m personally excited by the idea of involving currency at the protocol level and having it interact naturally over tcp/ip and http. There is an alternative to ads if we can make it work.

eeZah7Ux · on Jan 22, 2020

> Before money, I had to barter for those things

Not at all. Barter was quite uncommon also unpractical. Most societies used (and use) social connections and trust.

thu2111 · on Jan 22, 2020

I used to work on spam fighting.

This sort of solution is frequently proposed but doesn't work, because:

• Serving costs are rarely the problem. Normally it's annoying actions taken by spammers and the bad reaction of valuable users that matters, not the machine cost of serving them.

There are occasional exceptions. Web search engines ban bots because left unchecked they can consume vast CPU resources but never click ads. However, they only get so much bot traffic because of SEO scraping. Most sites don't have an equivalent problem.

• There is no payment system that can do what you want. All attempts at creating one have failed for various hard reasons.

• You would lose all your users. From a user's perspective I want to access free content. I don't want to make micropayments for it, I especially don't want surge pricing that appears unrelated to content. Sites that use more typical spam fighting techniques to fend off DDoS attacks or useless bot traffic can vend their content to human users for free, well enough that only Linux users doing weird stuff get excluded (hint: this is a tiny sliver of traffic, not even a percentage of traffic but more like an occasional nuisance).

• You would kill off search engine competition. Because you benefit from crawlers, you'd zero rate "good" web bots using some whitelist. Now to make a new search engine I have to pay vast sums in bot fees whilst my rich competitors pay nothing. This makes an already difficult task financially insurmountable.

The current approach of using lots of heuristics, JavaScript probes and various other undocumented/obscure tricks works well. Cases like this one are rare, caused by users doing weird stuff like committing protocol violations and such users can typically escalate and get attention from the right operators quickly. There are few reasons to create a vast new infrastructure.

wolco · on Jan 22, 2020

That's how ads work. More visitors more pageviews/clicks. People who serve ads don't want to pay for bots which is why they are a problem.

Doesn't medium do this?

danShumway · on Jan 22, 2020

> That's how ads work. More visitors more pageviews/clicks.

That's not asking people to pay for bandwidth/compute power, it's selling something adjacent to your content that you hope makes up for the loss.

> People who serve ads don't want to pay for bots which is why they are a problem.

That's kind of my point. When you ignore the arbitrage potential of serving requests for free, it forces you to care about making sure that your content is only available to the "right" users. You have to care about things like scraping/bots, because you're not directly covering your server costs, you're swallowing your server costs and just hoping that ads make up the difference.

Theoretically, in a world where server costs were directly transferred to the people accumulating those costs, you wouldn't need to care about bots. In fact, in that world, you shouldn't care whether or not I'm using an automated browser, since digital resources aren't limited by physical constraints.

In most cases, the only practical limit to how many people can visit a website is the hardware/cost associated with running it. A website isn't like an iPhone where we can run out of physical units to sell. So if they're paying for the resources they use, who cares if bots make a substantial portion of your traffic?

> Doesn't medium do this?

No, Medium just sells subscriptions, you don't pay for server usage. As far as I know, no one does this -- probably in part because of problems I haven't thought of, also probably in part because there are no good micro-payment systems online (and arguably no really good payment systems at all).

The closest real-world example is probably AWS, where customers pay directly for the resources they use. But those costs aren't then directly passed onto the user.

wolco · on Jan 22, 2020

If you had to pay for each request you would make fewer and limit the requests to serious requests (school, work, medical).

Having said that, you could provide a central service where people would buy credit to be used on many sites. So the micropay isn't the problem.

oefrha · on Jan 22, 2020

> you could provide a central service where people would buy credit to be used on many sites.

That central service is going to lock out many countries and regions as well as lots of people (minor, unbanked, poor, etc.) in non-locked out countries and regions. Payment is frigging hard especially on the international scale. This is every bit against freedom of information and strictly worse than Cloudflare.

JohnFen · on Jan 21, 2020

> most people on HN only seem capable of attacking the few affordable solutions to that problem.

I doubt that many would attack those solutions if they actually worked well, but they don't. These "solutions" are a big part of the reason why the web gets smaller for me every day as more and more websites become unusable.

dmix · on Jan 21, 2020

Cloudflare is like the TSA for the internet, I'm not convinced it needs to be as aggressive as it is. And yes I know websites have some control over how aggressive it will be but much like Reddit-moderation policy it's choosing the safety over everything approach, which hits enough false-positives on the edges to be a serious problem.

Cloudflare is very much anti-internet. And I'm a very security-obsessed person. Just like Reddit I believe we need to dial things a bit closer back towards chaos like a venn-diagram (safety)[x](chaos) there's a balance and I believe the internet is worse off when this balance is out of wack.

There might be some awful stuff on sites like 4chan but it also generated a ton of the memes that later filtered down into mainstream internet culture. Culture and innovation often happens in the chaos and fringes, which is an area I believe the world is becoming completely intolerant of in some attempt at idealism. But there are real sacrifices in between (ie, the mostly harmless stuff getting tagged as bad guys).

We need to be better at calming down and embracing the chaos, pushing back against FUD, and maintain a good balanced default. That chaos and flexibility is what originally made the internet great and endlessly promising.

Based on the various posts I've seen from Cloudflare founders on here I'm not convinced they are taking this problem as seriously as they need to be.

gnopgnip · on Jan 21, 2020

A comparison to TSA is flawed. Captcha is not a pass fail system, it is a score that is passed on to the web host and they decide what to do with it. Really any similar product to block malicious users would have the same problems, and the solution is to educate the website operators so they can avoid blocking legitimate users.

creato · on Jan 21, 2020

They do work well for the vast majority of people. Only on HN do I ever see people complaining about cloudflare/captchas/etc.

JohnFen · on Jan 21, 2020

I don't dispute that they work well for the majority of people -- but the majority of people are not security-conscious.

However, I see people complaining about Cloudflare in lots of places other than here. The number of people adversely affected by Cloudflare is not small.

ar0 · on Jan 22, 2020

I see lots of complaints about Captchas in the “real” world, too. Not regarding the centralisation etc. aspect but more regarding how painful they are to complete correctly, but there are definitely complaints.

Regarding Cloudflare, a regular user will have no idea about what Cloudflare is and what they do. If something like the OP happens to them, they will just figure “the site is broken” and move on. So there could be a large hidden number of users who have suffered from overzealous Cloudflare blocking without being able to identify it as such.

jlarocco · on Jan 21, 2020

> It's easy to shit on everything. Let's hear some real solutions.

My solution more and more is to just not bother with it. If a site is unreadable because I'm using uBlock and uMatrix, and I have to spend more than a minute or two tweaking things, then I just leave.

That said, I don't have any problem with Cloudflare. I'm much more annoyed by the overuse of *.googleapis.com. I'd love if somebody would setup a service that I could point my hosts file at so that googleapis.com silently went somewhere else.

ficklepickle · on Jan 22, 2020

I've been thinking about a local proxy that caches CDN assets. The first request to a cdn URL goes through, subsequent requests come from cache.

I think it would work fine with versioned libraries, fonts, etc. I'm thinking of setting up a container and squid config to achieve this.

Any obvious problems or alternative solutions?

Obviously enumerating the worlds CDN URLs would be a task. But I think even covering the most common CDNs would be a benefit.

thefreeman · on Jan 22, 2020

I mean your browser basically already does this by utilizing cache-control and expires headers, which all CDN's are going to set

computerfriend · on Jan 22, 2020

Something like https://decentraleyes.org/?

deftturtle · on Jan 22, 2020

uMatrix is great for blocking 3rd party stuff globally in your browser. Outside of the browser, I rely on DNS blocking rather than modifying hosts files.

Wrote a little post about how I configured my blacklists and whitelists with AdGuard Pro for iOS.

https://www.calebyers.com/blog/dns-ad-blocking.html

joepie91_ · on Jan 21, 2020

The problem is that the narrative has been poisoned by Cloudflare and Google (for Recaptcha) - they both overstate the size of the problem, as well as the effectiveness of their solution.

In other words: when someone demands "real solutions", they're typically expecting a degree of solution that quite likely just does not exist at all, to solve a problem that isn't as severe as people believe, just because that's the bar that those companies have set in the public discourse.

This makes it impossible for well-intentioned people to 'compete' with these services, because whatever alternative is suggested (hidden form elements, a random VPS provider with DDoS mitigation, serving assets locally, etc.) is immediately dismissed as "that can't possibly be as effective / effective enough", even though it'd be perfectly adequate for the vast majority of cases.

The alternative and competitive solutions exist, and have existed for a long time. You don't need a 1:1 replacement for these services. People just often refuse to believe that the simple alternatives work, and won't even bother trying.

(For completeness, my background is that of having run several sites dealing with user-submitted content, including some very abuse-attracting ones.)

nemothekid · on Jan 21, 2020

>because whatever alternative is suggested (hidden form elements, a random VPS provider with DDoS mitigation, serving assets locally, etc.) is immediately dismissed as "that can't possibly be as effective / effective enough", even though it'd be perfectly adequate for the vast majority of cases.

They are immediately dismissed because I don't want to pay a fulltime engineer to play cat and mouse with skiddies on the internet.

Analemma_ · on Jan 21, 2020

I think you're confusing what you wish was true with what is actually true. For instance, here was a post from a few weeks ago about how one annoyed user was able to take down a Mastodon instance until the admin gave up and put it behind CF: https://news.ycombinator.com/item?id=21719793. Bear in mind, if you're running a Mastodon instance, you're probably well-aware of the downsides of centralization and would only give in as a last resort.

CF has problems, but pretending it isn't solving a real issue that is nearly impossible to fix otherwise, especially for individual admins running a side project, doesn't help anybody.

yc12340 · on Jan 22, 2020

> I think you're confusing what you wish was true with what is actually true.

And you are cherry-picking poorly sourced anecdotes to better suite your position.

A VPS with 100Mbps virtual adapter physically can't withstand DoS from single attacker with fiber connection (or equivalent of it). This does not have much to do with anatomy of DoS attacks, just simple math.

Cloudflare subsidizes their free users by giving a bit of bandwidth for free — the amount, that can be purchased from a decent hoster for several hundreds dollars. Of course, an attacker with several hundreds dollars can easily rent a botnet, that will demolish that "protection".

laughinghan · on Jan 22, 2020

Huh?

"All Cloudflare plans offer unlimited and unmetered mitigation of distributed denial-of-service (DDoS) attacks, regardless of the size of the attack, at no extra cost."

https://www.cloudflare.com/ddos/

Do you know of an example of an attacker "easily demolishing" Cloudflare's free DDoS protection for a website with a few hundred dollars worth of botnet?

altfredd · on Jan 23, 2020

> Do you know of an example of an attacker "easily demolishing" Cloudflare's free DDoS protection

I can name dozens of websites, that folded under Cloudflare's supposedly flawless DDoS protection (at the time when they were still using it). Of course, the ones who fold are always websites themselves — Cloudflare itself is never affected, because when the DDoS gets particularly bad, they just detach websites from their CDN and expose it to attackers.

laughinghan · on Feb 3, 2020

...so name them?

SpicyLemonZest · on Jan 21, 2020

If I care deeply about my site staying up, a solution that’s perfectly adequate for the vast majority of cases isn’t sufficient. I don’t want to end up in the mirror image of the original author’s situation, where my site randomly falls down and I have no way to figure out what’s wrong or fix it.

smkellat · on Jan 21, 2020

Cloudflare is publicly traded under the symbol NET and has quite a number of institutional investors. A list can be found here: https://old.nasdaq.com/symbol/net/institutional-holdings

If all those companies are fronts for various parts of the US intelligence community then we're really screwed, I suppose.

ta999999171 · on Jan 21, 2020

That's... not how that would work.

z3t4 · on Jan 21, 2020

For caching: Learn how to code. If your web page dies when there are only two visitors, then that's on you.

DDoS attack: If possible, the easiest solution is to just swallow the traffic. If that doesn't work you want to block all networks that allows IP spoofing. Then it's a wack-a-mole game. And if you have the resources, use any-cast and many co-locations. Or ask your ISP for help.

Hiding your server: Use onion address via TOR network.

SSL certificate: Use Letsencrypt

Edge SSL/DNS/CDN: Use a fast web server or proxy, like Nginx. With Cloudflare the connection to the Edge server might be faster, but time to first byte (on your site) often slower. So you get better bang for the buck by optimizing on your end.

Note that DNS by itself already have edge caching out of the box, for free! eg. if a user looks up your domain, it will be cached both at their ISP and LAN. So you don't need Cloudflare for DNS.

megous · on Jan 21, 2020

> Yeah, you can't really talk about downsides of Recaptcha/Cloudflare without also acknowledging the extreme amount of malicious actors and abuse on the internet.

What percentage of traffic on the long tail of 95% of smallest websites served by CF is malicious then? So that we talk in numbers.

amylene · on Jan 21, 2020

I have run a number of small and medium websites (20 users per month up to 2 million). At least 50% of the traffic I see in my logs includes some sql injection or other mass script kiddie bs.

tyingq · on Jan 21, 2020

That's fairly black and white. Blocking an unusual user-agent because you "think" it might be malicious is another thing.

amylene · on Jan 22, 2020

It might be a poor business decision, but probably not for the reason most people would think.

An unusual UA is unlikely to move the needle on top line metrics, but it is a distraction and a misuse of resources to play cat and mouse. (Unless your business would be materially harmed by someone scraping your data... in which case, you’re doomed anyway.)

megous · on Jan 21, 2020

I've looked at my logs, and obvious nonsense like POST or GETs with any search params on a website that only has static html pages wich should not generate these kinds of requests is about 1% of last 25000 requests.

ryanlol · on Jan 21, 2020

Why care about such traffic? Blocking it seems like a pointless exercise.

amylene · on Jan 22, 2020

I was responding to OPs question. Iirc, we discussed it and never implemented any blocking.

dependenttypes · on Jan 22, 2020

> Let's hear some real solutions

The old recaptcha which did not need js, did not serve you with unsolvable challenges, and did not refuse to serve you because you used tor/because you used the audio challenge too much.

Thriptic · on Jan 21, 2020

Somewhat of a topic hijack and a naive question, but assuming Cloudflare is a government entity, wouldn't they still have to comply with whatever their terms of service / contracts with their users are? As they are a US company, barring illegality, theoretically they can't actually do shady shit without being in breach of contract right? They would also open themselves up to shareholder lawsuits.

tzs · on Jan 21, 2020

If they were an actual part of government, sovereign immunity would be something that would have to be considered. In a nutshell, the government cannot be sued unless it decides to allow it.

The government has passed laws to allow itself to be sued under certain circumstances. The Federal Tort Claims Act (FTCA), for example, allows suits for a variety of torts.

I believe (but am not actually sure) that most normal business-type transactions with the government are covered under FTCA or other acts, so a breach of contract by Cloudflare-the-government-entity would probably be pretty much like a breach by any random non-government entity.

Still, if you were going to depend on that it would be a good idea to actually look into the details of the FTCA and other such acts and compare to the actual Cloudflare TOS.

I have no idea whatsoever how sovereign immunity works in the case of a corporation chartered under some state's corporate law (Delaware in the care of Cloudflare) that is owned (fully or in part) by the government. I'd guess that it could only possibly apply if the government owns enough of the company to have control.

Cloudflare is public, so we can probably not worry about that scenario. If the government actually controls them, it is doing it surreptitiously, and so even if sovereign immunity should be somehow applicable I'd expect that the government would not bring it up because doing so would necessarily bring to light their control.

turkeydonkey · on Jan 22, 2020

What ever happened to proof of work protocols? I remember in the 00's they were being touted as The Solution™ to our spam/bot woes. Are botnets just so large that even PoW doesn't significantly affect them?

littlestymaar · on Jan 22, 2020

> Yeah, you can't really talk about downsides of Recaptcha/Cloudflare without also acknowledging the extreme amount of malicious actors and abuse on the internet.

But recaptcha bas been broken for years now by several different means. At this point, it is so broken it's almost a scam (and just another way for Google to get personal data from as many website as they can).

thaumasiotes · on Jan 21, 2020

> How are you going to talk about the downsides of Recaptcha/Cloudflare without also acknowledging the extreme amount of malicious actors and abuse on the internet?

This is acknowledged by the original question

>> Also, how would your implementation differ to solve this issue?

hombre_fatal · on Jan 21, 2020

To be clear, I agree with their comment and tried to double down on their point that it's a tough nut to crack. I improved my opening line.

deogeo · on Jan 22, 2020

> malicious actors and abuse

It's hard to consider simply viewing content to be malicious or abusive, no matter how automated.

tomc1985 · on Jan 22, 2020

I used to work for a data-scraping firm and very often we would accidentally knock many web sites offline when we pointed our crawlers at them.

I'd love to agree with you, but the crawler problem is 100x worse today than it was a decade ago

deogeo · on Jan 22, 2020

This would be much better solved with IP-based rate limits. And if IP-based doesn't work, then you're dealing with a DDOS, and it doesn't sound like this case was DDOS protection.

tomc1985 · on Jan 23, 2020

IP-based rate limiting is easily foiled via proxies, VPN services, tor, or botnets

deogeo · on Jan 23, 2020

And user agent string based protections are even more easily foiled, that's why I don't believe this can be plausibly counted under DDoS protection.

rodgerd · on Jan 21, 2020

> Yeah, you can't really talk about downsides of Recaptcha/Cloudflare without also acknowledging the extreme amount of malicious actors and abuse on the internet.

Cloudflare have a long history of supporting those malicious actors, so it's not like the problem is unrelated to the purported solution.

hwbehrens · on Jan 21, 2020

> Two steps backwards in every conceivable way. The giants gain more invisible power and powerusers suffer decreased productivity/privacy. Not going to happen.

I agree with the first two sentences, but disagree with the third. I believe that this state is actually the intended end goal.

Previously, for many years, I browsed the web with Javascript disabled. At the time, this had very little impact on my browsing experience; perhaps some of the layout would be broken, but not in a way that would interfere with the functionality or content of the site.

Nowadays, not only is this totally impossible, blocking even a subset of a site's JS (such as through uMatrix) is trial and error to get the site to load at all, or to do simple tasks like click on a "login" button.

With Google's plan to "phase out" cookies [0], I expect the web to become even more opaque and difficult to modify "on the fly" -- that is, on the user's local machine prior to displaying the content. In particular, this will affect ad and tracker blocking the most, as the pain from effective ad blockers starts to bite harder and harder.

So, when you say "The giants gain more invisible power", that is true and desireable from their perspective, and since they write the code that actually underpins most web browsers, why wouldn't they?

When you say, "powerusers suffer decreates productivity/privacy", yes, that's absolutely true. Why would they care? It's such a small fraction of their business. Some users will go to more and more extremes to preserve their privacy, eventually accessing only some small subset of sites from an esoteric Kali-derived distro, and others will capitulate and shift their behavior back to the herd.

In the end, the giants still win.

[0]: https://www.inc.com/jason-aten/google-says-chrome-will-end-s...

isodude · on Jan 21, 2020

Like always we find another venue where the giants haven't their foot yet. Or the malicous users does not find a huge interest in.

Rebooting the web into something else is still possible and something that will eventually happen when enough people are too tired about the current state.

This battle has been lost long time ago.

Liquix · on Jan 21, 2020

Great points, I absolutely agree with everything you've said. The 'not going to happen' is less a prediction of the future and more a reflection of my personal stubbornness/frustration regarding the direction things are headed (increased 'opacity' as you put it).

GoblinSlayer · on Jan 22, 2020

I have javascript disabled by default, most of internet works fine.

gpm · on Jan 21, 2020

4. Power users and nerds need to upstream the privacy improvements to everyone.

- To do so you need to avoid making user experience worse

wlesieutre · on Jan 21, 2020

Apple's work on tracking protection in Safari are a great step for this. Normalizing ITP across the whole Mac / iOS / iPad OS userbase means that sites have to accept it or block a huge number of normal users.

mhasbini · on Jan 21, 2020

> Launch a competing product that accomplishes the same thing.

Disclaimer: I was part of hCaptcha team.

https://hcaptcha.com/ is competing with ReCaptcha. It's a drop-in replacement for ReCaptcha.

It's privacy focused (supports privacy pass), and is fair: webmasters get a cut for each captcha that is solved correctly (they can choose to directly donate it to a charity of their choice), hCaptcha get a cut for running the service and a customer will get their images/data labeled.

nemoniac · on Jan 21, 2020

An alternative? This service uses both Cloudfare and googleapis. What's that all about?

errnoh · on Jan 21, 2020

This is an interesting subject indeed. Even though both of them are in the "try to grab as big portion of internet traffic as possible" business I wouldn't compare them that easily.

Cloudflare is actually rather good in doing what they do (DDoS side of things). They rarely break normal internet use, the only time when that happens is when a site is put into the "I'm under attack" mode that forces browser to do javascript proof. They do get huge amounts of traffic information though, but that is pretty much required to their core business (DDoS prevention, not the tinfoil kind)

Google/ReCaptcha is another thing. I have hard time understanding any reason to put captcha on any site that a normal incremental delay between login tries & ban sources that keep doing that for too long wouldn't already prevent. They're getting traffic data and ML training data and neither one is required for the thing captcha is trying to solve. Sites are just feeding their business and captcha is actually making the internet worse place for humans.

(Captcha requirement for things like posting on a discussion could be handled by simple spam/bot detection, captcha is just overkill)

heyyyouu · on Jan 21, 2020

Aren't there configuration options/levels that can be set within CloudFlare to mitigate these issues?

EDIT: Another user posted this below, answering my question:

sgtfrankieboy 1 hour ago | undown [-]

In CloudFlare go to "Firewall" and then click Settings on the right. Here you can set the Security Level and if you want to use Browser Integrity Checks among other thing

llama052 · on Jan 22, 2020

Yup there sure is, but the Cloudflare hate bandwagon has started. This is more on the site-owner than Cloudflare in my opinion.

mirimir · on Jan 21, 2020

My solution is connecting via nested VPN chains, working in VMs, and compartmentalizing stuff using multiple personas.

Reputable VPN services do a good job of keeping their IPs off blocklists. Occasionally I'll get blocked, because some jerk has been abusing the VPN server that I'm exiting from. But if it doesn't resolve promptly, I just switch to a different exit server.

So I only use this VM, and this VPN exit, as Mirimir. And given that, I don't go out of my way to prevent tracking. Not enough, anyway, to trigger blocking. Because I don't really care if everything that Mirimir does gets linked. Indeed, I pretty much always use "Mirimir" as my username, or sometimes "Dimi" or whatever.

If I don't want stuff linked, I use a different persona in a different VM, using a different VPN chain. Or that via Tor using Whonix.

jakecopp · on Jan 22, 2020

I'd love to use an open source/more beneficial to society version of ReCaptcha - say validating OpenStreetMap data/project Gutenberg/something else. Maybe this already exists and I don't know about it!

Would sites then move to this and reduce the lock in and inflexibility with ReCaptcha?

RileyJames · on Jan 21, 2020

4. Outsource. Use a ReCaptcha/Cloudflare filling service (usually just someone else manually typing these in).

I won’t link, but search “ReCaptcha solver” and you’ll find plenty.

It highlights just how broken the system is. It doesn’t stop determined spammers/devs, until the value of the task is lower than the cost to solve.

Considering it’s 50c USD per 1000....

jabart · on Jan 21, 2020

Everyone keeps bringing this up but unless you have something of monetary value on the other end, this won't happen. ReCaptcha and a few if statements has stopped all contact us form spam on our site. Same for other sites I help to manage, no one is paying 50 cents per a thousand contact us spam messages.

ficklepickle · on Jan 22, 2020

I found all my contact form spam was being sent superhumanly fast. Well under 10 seconds from initial page load. A user can't make it to the form and type a meaningful message that fast.

Adding a short timeout eliminated my contact form spam. I also only allow JSON on the back end, so they must execute JS to even have a shot.

This has allowed me to avoid blocking TOR exit nodes... So far anyway.

xur17 · on Jan 21, 2020

I've seriously considered doing exactly this. I've solved numerous challenges for Google this week alone.

jpalomaki · on Jan 22, 2020

I think the long term solution could be in making the network providers and ISPs more responsible for the malicious traffic originating from their networks.

For example the botnet traffic is best stopped at the origin. If there was some pressure for the service providers, I'm fairly sure they could do more to detect subscriptions with compromised devices and take appropriate actions. Actions can include educating the users and if necessary, blocking the subscription until problems are fixed.

While this certainly would not immediately cover the whole world, it would be a start. On the website level you could then treat traffic from networks that have agreed to cut malicious traffic in different way.

sturgill · on Jan 21, 2020

I’m a fan of Recaptcha v3. There are many actions where you can ask for some additional input in a non-puzzle way. Simple example is sending a confirmation email before signup when the score is below a certain threshold.

Because I freaking hate those captcha puzzles...

rstupek · on Jan 21, 2020

Except v3 has to be on every page of the site and that gives Google a full view of all traffic on the site

WillPostForFood · on Jan 21, 2020

So does Google Analytics, so does AdSense. ReCaptcha V3 is probably the least used of the three.

catalogia · on Jan 22, 2020

Blocking GA and AdSense doesn't render websites unusable.

ddalex · on Jan 23, 2020

Google doesn't force you to do AdSense an all pages. And Analytics, well, you can also skip on sensitive pages.

A4ET8a8uTh0 · on Jan 21, 2020

I don't see myself as a poweruser, but it would be very hard for me to give up on my current setup. I honestly do not understand how my wife deal with it. Even with piholed wifi it is a horrifying experience..

kragen · on Jan 21, 2020

4. Mirror the contents of this Dark Web 3.0 on a Light Web 3.0 accessed using privacy-protecting technologies like Tor to make sure it remains accessible. Obviously this won't help for logging into Twitter and your Fecebutt account but it should be fine for the nginx documentation.

The underlying incentive here is that centralized websites are slow and vulnerable to DDoS. Massively mirroring their content is the solution, and it's what Cloudflare does. Let's do it in a way that protects human rights rather than taking them away.

GoblinSlayer · on Jan 22, 2020

ReCaptcha works because any captcha works.

https://kevv.net/you-probably-dont-need-recaptcha/

https://blog.codinghorror.com/captcha-effectiveness/

young_unixer · on Jan 21, 2020

4. Boycott Cloudflare.

lucb1e · on Jan 21, 2020

Isn't that #1? Or do you mean more like what OP is experiencing by being banned?

GP already mentioned why #1 is not a solution (which I see the same way) and OP made it quite clear that not visiting CF sites isn't quite working either.