
The article cites the Authors Guild v. Google case (https://www.techdirt.com/2013/11/14/google-gets-total-victor...), which was a total victory for Google. To me, this seems fairly conclusive that the textual analysis here is fair use.

> Similarly, Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas, thereby opening up new fields of research. Words in books are being used in a way they have not been used before. Google Books has created something new in the use of book text — the frequency of words and trends in their usage provide substantive information.

Furthermore, is this actually AI training? This just looks like stats based on heuristics to me, i.e., garden-variety sentiment analysis.



I think Google Books is cool, but "the frequency of words and trends in their usage provid[ing] substantive information" pre-dates Google Books by a long time. For example, there's a collection of word frequencies in the complete works of John Keats from 1917 [1]. Manually tabulated, too!

[1] https://catalog.hathitrust.org/Record/001023999


Sure, but the Google Books case is a massive and well-funded court case that was a pretty resounding victory for Google and textual analysis of copyrighted works in general, so anybody arguing that this is obviously a copyright violation needs to explain why the Google Books case isn’t relevant.



