
A single agent regularly pulling 10k records that nobody will ever use is worse than 10k agents coming from the same source and sharing a cache, which they fill through targeted requests. But even worse are 10k agents from 10k different sources, each scraping 10k sites, of which 9,999 pages are not relevant to their request.

In the end it's all about the impact on the servers, and that impact can be optimized, but this does not seem to be happening at scale at the moment. So in that regard, centralizing usage and honouring the rules is a good step, and the rest are details to figure out along the way.
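A minimal sketch of the "same source, same cache" idea above, assuming a hypothetical fleet of agents that share one in-process cache and check robots.txt before each fetch. The names (SharedFetcher, USER_AGENT) are illustrative, not any real library's API:

  import urllib.request
  import urllib.robotparser
  from urllib.parse import urlparse

  USER_AGENT = "example-agent/0.1"  # hypothetical agent identifier

  class SharedFetcher:
      def __init__(self):
          self._cache = {}   # url -> body, filled by targeted requests
          self._robots = {}  # host -> parsed robots.txt

      def _allowed(self, url):
          parts = urlparse(url)
          host = parts.scheme + "://" + parts.netloc
          if host not in self._robots:
              rp = urllib.robotparser.RobotFileParser()
              rp.set_url(host + "/robots.txt")
              rp.read()
              self._robots[host] = rp
          return self._robots[host].can_fetch(USER_AGENT, url)

      def fetch(self, url):
          # Many agents asking for the same page hit the cache, not the server.
          if url in self._cache:
              return self._cache[url]
          # Honour the site's rules before touching it at all.
          if not self._allowed(url):
              return None
          req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
          with urllib.request.urlopen(req) as resp:
              body = resp.read()
          self._cache[url] = body
          return body

The point of the sketch is only that a single shared fetcher bounds the server load to one request per distinct page, no matter how many agents ask.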
