Assuming any image or audio stream is available on the internet (not true, but very close), you can get an enormous compression ratio by replacing each media file with its URL.
LLMs (the set of connections and their weights) are in fact a compressed version of a large part of the internet.
So what the article found should not be surprising.
> Assuming any image or audio stream is available on the internet (not true, but very close), you can get an enormous compression ratio by replacing each media file with its URL.
That reminds me of the "Dropship" utility: to save server storage and reduce upload time, Dropbox used to deduplicate uploaded files globally, so identical files had the same hash regardless of who uploaded them. Anyone who knew the hash of an uploaded file could download it to their Dropbox folder, so people could share large files by just sharing the Dropbox hash.
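For anyone who hasn't seen this kind of scheme, here is a rough sketch of hash-based global deduplication. It is a toy in-memory model, not Dropbox's actual protocol: the `DedupStore` class, its method names, and the choice of SHA-256 over the whole file are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each distinct blob is kept once, keyed by its hash."""

    def __init__(self):
        self.blobs = {}    # hex digest -> file contents (stored once, globally)
        self.folders = {}  # user -> {filename: hex digest}

    def upload(self, user, name, data):
        """Store data for user; skip the actual storage if the blob already exists."""
        digest = hashlib.sha256(data).hexdigest()  # assumed hash function
        already_stored = digest in self.blobs
        if not already_stored:
            self.blobs[digest] = data
        self.folders.setdefault(user, {})[name] = digest
        return digest, already_stored

    def claim_by_hash(self, user, name, digest):
        """Add an existing blob to a user's folder given only its hash."""
        if digest not in self.blobs:
            raise KeyError("no blob with that hash has been uploaded")
        self.folders.setdefault(user, {})[name] = digest

    def download(self, user, name):
        """Return the contents of a file in the user's folder."""
        return self.blobs[self.folders[user][name]]


store = DedupStore()
digest, _ = store.upload("alice", "video.bin", b"some large file contents")

# Bob never uploads the bytes; knowing the hash is enough to get the file.
store.claim_by_hash("bob", "copy.bin", digest)
assert store.download("bob", "copy.bin") == b"some large file contents"
```

The point is that `claim_by_hash` never checks whether the caller has ever held the data, which is what turns the hash into a shareable download link.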