Hacker News

Maybe you're thinking of the Library of Congress when you say ~50TB? The internet is definitely larger.


Indeed. A quick lookup doesn't turn up many reliable-sounding sources, but they all put it on the order of zettabytes (tens to thousands of them), even for years before any LLM was halfway usable. One has to wonder how much of that is generated: think of one of my own websites, where the pages are statistics derived from player highscores, or the websites that jokingly index all Bitcoin addresses and UUIDs.

Perhaps the 50TB estimate covers only unique information, without any media or the like, but OP can back up where they got that number from better than I can with guesswork.



