
The underlying issue is that many sites aren't going to get feedback from the real people they've blocked, so their operators won't actually know that tuning is required. (Also, the stricter the system, the higher the percentage of requests that will be marked as bots, which might lead an operator to want things to be even stricter...)


I will say -- a higher-end bot detection service should provide paper trails on the block actions they take (this may not be available for freemium tiers, depending on the vendor).

But to your point, the real kicker is the "many sites aren't going to get feedback from the real people they've blocked" since those tools inherently decided that the traffic was not human. You start getting into Westworld "doesn't look like anything to me" territory.


I'm not into Westworld, so I can't speak to the latter paragraph, but as for "high-end" vendors' paper trails: how do log files help uncover false blocks? Any vendor will be able to look up the request IDs printed on the blocking page, but how does that help?

You don't know whether each entry in the log is a real customer until they buy products in some proportion to their page load rate, or a real person until they submit useful content or whatever your site is about. Many people just read information without contributing to the site itself, and that's okay, too. A list of blocked systems won't help. I run a server myself, and I see legit-looking user agent strings making hundreds of thousands of requests, crawling through every page in sequence. But without that inhuman request pattern, if I just saw the user agent, IP address, and other metadata in a list of blocked access attempts, I'd have no clue whether the ban was legit or not.
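To make the distinction concrete, here is a rough sketch of what "seeing the request pattern" could look like when working from your own access log rather than a vendor's block list. Everything in it is an assumption for illustration: the function name `summarize_clients`, the `(client, timestamp, path)` tuple format, and both thresholds are made up and would need tuning for any real site.

```python
from collections import defaultdict


def summarize_clients(requests, min_requests=1000, min_rate_per_s=5.0):
    """Flag clients whose volume and sustained rate look inhuman.

    `requests` is an iterable of (client_key, unix_timestamp, path) tuples,
    where client_key is whatever identifies a client in your logs, for
    example an (ip, user_agent) pair. The thresholds are hypothetical.
    """
    stats = defaultdict(
        lambda: {"count": 0, "first": None, "last": None, "paths": set()}
    )
    for client, ts, path in requests:
        s = stats[client]
        s["count"] += 1
        s["first"] = ts if s["first"] is None else min(s["first"], ts)
        s["last"] = ts if s["last"] is None else max(s["last"], ts)
        s["paths"].add(path)

    flagged = {}
    for client, s in stats.items():
        # Clamp the duration to one second to avoid dividing by zero
        # when a client appears only once or in a single burst.
        duration = max(s["last"] - s["first"], 1.0)
        rate = s["count"] / duration
        if s["count"] >= min_requests and rate >= min_rate_per_s:
            flagged[client] = {
                "count": s["count"],
                "rate_per_s": rate,
                "distinct_paths": len(s["paths"]),
            }
    return flagged


# A synthetic crawler walking pages in sequence, and a slow human reader.
crawler = [(("203.0.113.7", "Mozilla/5.0"), i * 0.1, f"/page/{i}") for i in range(2000)]
human = [(("198.51.100.2", "Mozilla/5.0"), i * 120.0, f"/article/{i}") for i in range(5)]
result = summarize_clients(crawler + human)
```

Note that both clients send the same user agent string here; only the behaviour over time separates them, which is exactly the information a bare list of blocked requests doesn't carry.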

With these protection services, you can't know how much frustration is hiding in that paper trail, so I'm not blocking anyone from my sites; I'm making the system stand up to crawling. You have to do that regardless, for search engines and for traffic spikes like those from HN.



