Interesting. The authors focused a lot on market competition vs. archival purposes. For me, and I reckon for a lot of other people, libgen is just a way to get a PDF or EPUB that isn't muddied with DRM or stuck behind a university wall. Most of the books I download I already have a paper copy of, but I want to be able to use them digitally. For me it's all about convenience. I guess that would be pretty hard to measure.
This is close to my use case as well. I mainly buy e-books these days, but they are often published as an afterthought[1] and not designed for readers – things like maths and plots get mangled in the process, and the results vary from excellent to completely unreadable. So I want something closer to the physical book to look up the equations and visuals, but I don't want to have to go to the library[2] for each one. A PDF scan of the print book is very convenient.
[1] Sometimes they are literally an OCR of the print book! You can tell because of the symbol substitutions.
[2] Even most digital libraries I've tried are very inconvenient to "go to". Archive.org might be the exception.
my friend in college bought textbooks before the first semester, when their parents took them to the book store. then a classmate showed them libgen and they didn't buy any more textbooks, except the one weird custom textbook by some professor.
i guess they buy all their books now that they have a job, but also with generous use of used book stores and physical libraries.
This pretty much mirrors my experience exactly, but these days I lean a lot more heavily on the Libby app for borrowing from my library which is amazing.
What’s the issue with Amazon/Kindle though, the fakes?
it's actually very easy to measure. the audience here is close to the 1%, so your use case is already discarded in the grand scheme, with a 2% error margin, when measuring behavior for the larger population.
I buy 10 books per year off Amazon and the occasional Economist magazine. I am of course not paying for academic journal access or tons of academic books outside my field.
Download 10 ebooks on a subject, scroll through them, read a few excerpts, and the one that I actually start reading, I buy from Amazon or a local bookstore and actually read.
Many books look great, but once you start reading them they turn out to be total garbage. This is in the field of IT literature though, so books tend to be very hit or miss, and quickly outdated.
Not to devalue the tireless work of authors, but we must also recognize that students in particular do not have enough money to pay for all these books.
At the same time, it can be argued that writing a single book and selling it thousands of times over is a bit too "easy" as a way to make money. The hard part is getting people to actually buy your book – it does not matter that you wrote the best book in the world if nobody gives you well-deserved attention for it.
Bloggers suffer from the same problem, and it is not necessarily because their work is bad. The truth of the matter is that nobody cares about quality information anymore (and I am guilty of this as well).
We want fast and summarized answers so we can move on to the important part: solving whatever concrete problem we are working on.
AI will probably dilute the little remaining value of information even further, and at some point nobody will manually write comprehensive information anymore – at least not unassisted by AI. And while the quality may suffer, we must also realize that we do not really need 100% accurate information. If we get a statistically significant amount of accurate AI-provided information, then there is no need for anyone to write books anymore. It will be a completely unappreciated waste of their time, and nobody is going to buy them.
Even now that I am in a decent job, I still prefer not to buy books, instead relying on free sources on the internet (not piracy). If a given book/information is not available for free, then it is often not important enough for me to bother (note often – not always).
It is also a matter of prioritizing – reading a book takes me way too long, and the process is far from comfortable due to my slow reading, so for that reason alone I tend to avoid reading entire books. It strikes me as an antiquated way to gain information even without AI. I may open a specific chapter of interest, but reading the entire thing is painfully tedious, and probably unnecessary.
I read lots of non-fiction books and I think the same.
Most of them could condense their contents into a couple of chapters.
It would be great to have modular books, like the Emacs manual. If sections were independent modules you could rearrange the book, or even create books from a series of sections taken from different books.
That way you could choose different outlines, maybe predefined by the author, to create books tailored to your needs.
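Just to make the idea concrete, here's a minimal sketch of what that modular composition could look like – all the names and sample sections are made up, not from any real system:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Section:
        """A self-contained unit of content, addressable independently of any book."""
        source_book: str
        title: str
        body: str

    def assemble(outline: list[Section]) -> str:
        """Stitch sections together in the order given by a reader-chosen outline."""
        return "\n\n".join(
            f"## {s.title} (from: {s.source_book})\n{s.body}" for s in outline
        )

    # Sections from different books combined into one tailored "book".
    catalog = [
        Section("Book A", "TCP basics", "..."),
        Section("Book B", "Congestion control", "..."),
    ]
    print(assemble([catalog[1], catalog[0]]))  # reader-chosen order

An author could then ship several predefined outlines over the same pool of sections, and readers could define their own.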
This is what's done with sufficiently academic books in the hard sciences. They are so modular that books are simply topics, with each chapter written by a chosen author, and those authors treat the chapters they've written similarly to papers they've had published.
I think if you dive deeper into more rigorous non-fiction books you'll find that less time is spent on the 'pop' side of popsci literature, which might be where you're encountering that fluff.
I hate to bring it up, but AI would be perfect for that. It could create logical segues between the new chapter order. So you could basically create books on demand based on real content with only the AI providing context.
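As a rough sketch of what that could look like (assuming the OpenAI Python client; the model name and the chapter summaries are placeholders):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def write_segue(prev_summary: str, next_summary: str) -> str:
        """Ask an LLM for a short bridging paragraph between two reordered chapters."""
        prompt = (
            "These two chapters come from different books and are placed back to back.\n"
            f"Previous chapter: {prev_summary}\n"
            f"Next chapter: {next_summary}\n"
            "Write a one-paragraph segue connecting them."
        )
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; any chat model would do
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content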
"writing a single book, and selling it thousands times over, is a bit to "easy" in terms of ways to make money"
$2 profit per book is perhaps a high figure for what an author may receive from the publisher. 2 x 3000 is $6000 for maybe months or years of work. And this would be a 'successful' author. It's not all JK Rowling out there, ya know!
I don't understand this question. Book reviews are not summaries. And even if they were, "why do people read books instead of just reading summaries?" is still a ridiculous question.
When I was at university (Oxford, UK, 2009), I bought about four key books, and spent well over 100GBP. I simply couldn't afford more. I had a student loan which covered tuition and some of my living costs, as well as a bursary from the university which covered some of my other living costs (but not all!).
What was annoying was that our library didn't have enough books for all the students. We'd all be assigned the same reading list for the week, and then have to race to the library to get the books before they were all gone.
Imagine someone from a 3rd world country. Let's say India.
Even staple books like K&R's C, Tanenbaum's operating systems book, or the CLRS / Skiena algorithms texts will be north of 1000 INR in India.
Count 5-6 such books per semester and that's at least 5k-6k INR; for obscure books the price often goes to 5k for a single one, so let's say 10k INR.
That's a significant amount – for some students a semester's books can exceed a month's living expenses.
So they're hesitant to spend that much. Often they end up with shittily written local books.
Now imagine you want to consult a book on a specialist topic, like Windows internals or something – you will have to sell an organ.
Libgen fills an important gap in the paid archives/libraries. There are always some books, papers or articles that are not available through my university library.
It's a huge help to have documents available immediately and not have to pay a small fortune for a single document that might prove irrelevant.
If you are interested in Library Genesis (and shadow libraries in general) I can recommend this book by Joe Karaganis: https://boook.link/Shadow-Libraries
They aren't too bad with modern reader software: you can automatically trim the margins and set the viewport size and scan pattern to match the text, which makes double-column papers much nicer to read.
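If you want to do the trimming offline, a crude fixed-margin sketch with the pypdf library might look like this – the margin and filenames are placeholders, and real readers detect the text's bounding box instead of using a fixed offset:

    from pypdf import PdfReader, PdfWriter

    MARGIN_PT = 40  # points to shave off each edge (placeholder value)

    reader = PdfReader("scan.pdf")
    writer = PdfWriter()

    for page in reader.pages:
        box = page.mediabox
        # Shrink the crop box so readers display only the text area.
        page.cropbox.lower_left = (float(box.left) + MARGIN_PT, float(box.bottom) + MARGIN_PT)
        page.cropbox.upper_right = (float(box.right) - MARGIN_PT, float(box.top) - MARGIN_PT)
        writer.add_page(page)

    with open("scan_trimmed.pdf", "wb") as f:
        writer.write(f)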
Interesting research. However, in practical terms, I wouldn't count Kindle as a form of digital availability, unless the book is composed of pure text.
Browsing figures or tables is usually a very bad experience. The figures are usually low resolution, and the tables are sometimes just pictures. Even if the publisher bothered to encode a table as a table, navigation becomes very hard if it spans more than one screen. Not to mention symbols and formulas.
I predict that LibGen/Sci-Hub and pirate sites in general will be rendered functionally inaccessible in the developed West in the coming years. My reasoning is as follows:
Historically, IP owners have tried various strategies to increase revenues by curtailing infringement. In the early 2000s, they tried to extract revenue directly from pirates via honeypot torrents and ISP subpoenas. Since about 2010, they have shifted towards a more passive approach, prioritizing commercial-scale infringement and largely leaving piracy for personal consumption alone. It just didn't make sense to chase after teenagers and students who probably wouldn't have had the money to make a legitimate purchase anyway.
LLMs trained on huge corpora that include pirated content like LibGen change everything. Now these IP holders face an existential (or at least severe) threat to their business models in the form of AI-generated content. At the very least, they missed out on a massive opportunity to extract some of the wealth created by AIs, by virtue of their laxity in going after easily available pirated content.
I'd expect to see a strong swing back towards draconian enforcement against even personal infringement: domestic ISP DNS blocking, perhaps even mandatory browser- or operating-system-level blocking of infringing sites.
In Spain my home internet provider (Movistar) blocks access to LibGen and the like. My university, however, does not. It realizes that de facto these sources serve as the library for its professors. So I wind up having to connect through the university VPN anyway ...
They usually block via DNS, so it's pretty easy to circumvent by using alternative DNS servers such as 1.1.1.1 or 9.9.9.9, or Tor (which you sometimes need for books that have been DMCA'd).
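As a quick sketch (using the dnspython library; the hostname is a placeholder for whatever site you suspect is blocked), you can compare what different resolvers return:

    import dns.exception
    import dns.resolver  # pip install dnspython

    def resolve_with(hostname: str, server: str | None = None) -> list[str]:
        """Resolve via a specific DNS server, or the system default if None."""
        if server is None:
            r = dns.resolver.Resolver()  # system/ISP resolver
        else:
            r = dns.resolver.Resolver(configure=False)
            r.nameservers = [server]
        try:
            return [rr.address for rr in r.resolve(hostname, "A")]
        except dns.exception.DNSException as e:
            return [f"lookup failed: {e}"]

    host = "example.org"  # placeholder hostname
    print("system :", resolve_with(host))
    print("1.1.1.1:", resolve_with(host, "1.1.1.1"))
    print("9.9.9.9:", resolve_with(host, "9.9.9.9"))

If the system resolver fails or returns something odd while the public ones return real records, the block is DNS-level and switching resolvers is enough.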