
User-agent might be a useful signal but treating it as an absolute flag is sloppy. For one thing it's trivial for malicious actors to change their user-agent. Cloudflare could use many other signals to drastically cut down on false positives that block normal users, but it seems like they don't care enough to be bothered. If they cared more about technical and privacy-conscious users they would do better.
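To make the "signal, not absolute flag" point concrete, here is a minimal sketch of combining several weak signals into one score. The signal names, weights, and threshold are all hypothetical, picked for illustration; nothing here reflects how Cloudflare actually scores traffic.

```python
# Hypothetical signals and integer weights; none of these come from Cloudflare.
WEIGHTS = {
    "suspicious_user_agent": 3,
    "no_accept_language": 2,
    "high_request_rate": 5,
}

def bot_score(signals):
    """Sum the weights of every signal that fired for this request."""
    return sum(weight for name, weight in WEIGHTS.items() if signals.get(name))

def should_challenge(signals, threshold=6):
    """Challenge only when the combined evidence crosses the threshold."""
    return bot_score(signals) >= threshold

# An unusual user-agent on its own stays below the threshold.
print(should_challenge({"suspicious_user_agent": True}))
# The same user-agent plus a high request rate crosses it.
print(should_challenge({"suspicious_user_agent": True, "high_request_rate": True}))
```

With this shape, an odd user-agent alone never blocks anyone; it only tips the balance when other evidence agrees.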


> For one thing it's trivial for malicious actors to change their user-agent.

Absolutely true. But the programmers of these bots are lazy and often don't. So if Cloudflare has access to other data that can positively identify bots, and there is a high correlation with a particular user agent, well then it's a good first-pass indication despite collateral damage from false positives.
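A rough sketch of that first-pass idea, assuming you already have requests labelled as bot or not by some other, independent signal. The function names, thresholds, and sample data are made up for illustration.

```python
from collections import Counter

def bot_rate_by_user_agent(requests):
    """requests: list of (user_agent, is_bot) pairs, where is_bot comes from
    some other detection signal (assumed to exist for this sketch)."""
    total = Counter()
    bots = Counter()
    for user_agent, is_bot in requests:
        total[user_agent] += 1
        if is_bot:
            bots[user_agent] += 1
    return {ua: bots[ua] / total[ua] for ua in total}

def first_pass_suspects(requests, min_rate=0.9, min_samples=3):
    """User-agents whose observed bot rate is high enough to use as a first pass."""
    total = Counter(ua for ua, _ in requests)
    rates = bot_rate_by_user_agent(requests)
    return {
        ua for ua, rate in rates.items()
        if rate >= min_rate and total[ua] >= min_samples
    }

# Made-up sample: 9 of 10 python-requests hits are bots, 2 of 10 Mozilla hits are.
sample = (
    [("python-requests/2.31", True)] * 9
    + [("python-requests/2.31", False)]
    + [("Mozilla/5.0", True)] * 2
    + [("Mozilla/5.0", False)] * 8
)
print(first_pass_suspects(sample))
```

In the made-up sample, the one legitimate python-requests hit is exactly the collateral damage: it gets flagged along with the nine bots.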


The programmers of these bots are not lazy - this space is a thriving industry with a bunch of commercial bots, whose ability to evade Cloudflare and the like is the literal metric that determines their commercial viability.


My data says otherwise and you have provided nothing to back up your claim other than saying we have an industry full of dirty money paying programmers to write unethical code. I'm sure it inspires them to do their best work.

Half these imbeciles don't even change the user-agent from the scraper they downloaded off GitHub.

I employ lots of filtering so it's possible the data is skewed towards those that sneak through the sieve - but they've already been caught, so it's meaningless.


I would hope Cloudflare would be way, way beyond a “first pass” at this stuff. That’s logic you use for a ten person startup, not the company who’s managed to capture the fucking internet under their network.


> So if Cloudflare has access to other data that can positively identify bots

They do not - not definitively [1]. This cat-and-mouse game is stochastic at higher levels, with bots doing their best to blend in with regular traffic, and the defense trying to pick up signals barely above the noise floor. There are diminishing returns to battling bots that are indistinguishable from regular users.

1. A few weeks ago, the HN frontpage had a browser-based project that claimed to be undetectable


> a browser-based project that claimed to be undetectable

For now


That's just part of the game. Sometimes you're ahead, sometimes you're behind, but there's never a decisive winner.


I mean, do we need to replace user agent with some kind of 'browser signing'?


If you're thinking of Google's WEI, I'm thankful that went down in flames:

"Google is adding code to Chrome that will send tamper-proof information about your operating system and other software, and share it with websites. Google says this will reduce ad fraud. In practice, it reduces your control over your own computer, and is likely to mean that some websites will block access for everyone who's not using an "approved" operating system and browser."

https://www.eff.org/deeplinks/2023/08/your-computer-should-s...



