>>>There is an astronomical amount of data siloed by publishers, professional jo... | Hacker News


staticman2 on Aug 21, 2024 | on: Artificial intelligence is losing hype

>>>There is an astronomical amount of data siloed by publishers, professional journals etc. that is yet to be tapped.

You seem to think these models haven't already been trained on pirated versions of this content, for some reason.

dartos on Aug 21, 2024

Yep, Books3 is what LLaMA was famously trained on, before the dataset was taken down.

That’s not even considering AI crawlers or all the copyrighted text on archive.org.
