User-agent might be a useful signal but treating it as an absolute flag is slopp...

likeabatterycar · 2025-02-05T21:25:56 1738790756

> For one thing it's trivial for malicious actors to change their user-agent.

Absolutely true. But the programmers of these bots are lazy and often don't. So if Cloudflare has access to other data that can positively identify bots, and there is a high correlation with a particular user agent, well then it's a good first-pass indication despite collateral damage from false positives.
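To make the "first-pass indication" idea concrete, here is a minimal sketch, not anything Cloudflare is known to run. It assumes you already have requests labeled bot or not-bot by some other signal, and it measures how strongly each user agent correlates with that label. The function name, the `min_samples` cutoff, and the sample user-agent strings are all hypothetical.

```python
from collections import Counter

def bot_rate_by_user_agent(labeled_requests, min_samples=3):
    """Return {user_agent: fraction of its requests labeled as bot}.

    labeled_requests: iterable of (user_agent, is_bot) pairs, where is_bot
    comes from some other detection signal (assumed to exist, not shown).
    User agents seen fewer than min_samples times are skipped as too noisy.
    """
    totals = Counter()
    bots = Counter()
    for user_agent, is_bot in labeled_requests:
        totals[user_agent] += 1
        if is_bot:
            bots[user_agent] += 1
    return {ua: bots[ua] / n for ua, n in totals.items() if n >= min_samples}

# Illustrative data only; these counts are made up.
sample = [
    ("python-requests/2.31", True),
    ("python-requests/2.31", True),
    ("python-requests/2.31", True),
    ("python-requests/2.31", False),
    ("Mozilla/5.0 (X11; Linux x86_64)", False),
    ("Mozilla/5.0 (X11; Linux x86_64)", False),
    ("Mozilla/5.0 (X11; Linux x86_64)", True),
]
rates = bot_rate_by_user_agent(sample)
```

In this made-up sample the scraper-style user agent has a 0.75 bot rate, so blocking on it alone would also hit the one legitimate request in four. That is the false-positive collateral damage, and it is why the rate works better as one score among several than as a hard rule.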

plaguuuuuu · 2025-02-05T23:22:38 1738797758

The programmers of these bots are not lazy - this space is a thriving industry with a bunch of commercial bots, whose ability to evade Cloudflare/etc. is the literal metric that determines their commercial viability.

likeabatterycar · 2025-02-06T00:41:29 1738802489

My data says otherwise and you have provided nothing to back up your claim other than saying we have an industry full of dirty money paying programmers to write unethical code. I'm sure it inspires them to do their best work.

Half these imbeciles don't even change the user-agent from the scraper they downloaded off GitHub.

I employ lots of filtering so it's possible the data is skewed towards those that sneak through the sieve - but they've already been caught, so it's meaningless.

ok_dad · 2025-02-05T21:49:11 1738792151

I would hope Cloudflare would be way, way beyond a “first pass” at this stuff. That’s logic you use for a ten-person startup, not the company that’s managed to capture the fucking internet under their network.

sangnoir · 2025-02-05T22:19:22 1738793962

> So if Cloudflare has access to other data that can positively identify bots

They do not - not definitively [1]. This cat-and-mouse game is stochastic at higher levels, with bots doing their best to blend in with regular traffic, and the defense trying to pick up signals barely above the noise floor. There are diminishing returns to battling bots that are indistinguishable from regular users.

[1] A few weeks ago, the HN frontpage had a browser-based project that claimed to be undetectable.

fbrchps · 2025-02-05T22:54:46 1738796086

> a browser-based project that claimed to be undetectable

For now

sangnoir · 2025-02-05T23:39:59 1738798799

That's just part of the game. Sometimes you're ahead, sometimes you're behind, but there's never a decisive winner.

sleepybrett · 2025-02-05T21:43:09 1738791789

I mean, do we need to replace user agent with some kind of 'browser signing'?

doctor_radium · 2025-02-07T15:53:34 1738943614

If you're thinking of Google's WEI, I'm thankful that went down in flames:

"Google is adding code to Chrome that will send tamper-proof information about your operating system and other software, and share it with websites. Google says this will reduce ad fraud. In practice, it reduces your control over your own computer, and is likely to mean that some websites will block access for everyone who's not using an "approved" operating system and browser."

https://www.eff.org/deeplinks/2023/08/your-computer-should-s...