Library Genesis in Numbers: Mapping the Underground Flow of Knowledge (2018) [pdf]

Xerox9213 · on Nov 8, 2023

Interesting. The authors focused a lot on competing with market vs archival purposes. For me, and I reckon a lot of other people, I use libgen because I just want a PDF or epub that is not muddied with DRM or behind a university wall. Most of the books I download I already have a paper copy of, but want to be able to use them digitally. For me it’s all about convenience. I guess that would be pretty hard to measure.

kqr · on Nov 8, 2023

This is close to my use case as well. I mainly buy e-books these days, but they are often published as an afterthought[1] and not designed for readers – things like maths and plots get mangled in the process and vary from excellent to completely unreadable. So I want something closer to the physical book to look up the equations and visuals, but I don't want to have to go to the library[2] for each one. A PDF scan of the print book is very convenient.

[1] Sometimes they are literally an OCR of the print book! You can tell because of the symbol substitutions.

[2] Even most digital libraries I've tried are very inconvenient to "go to". Archive.org might be the exception.

posterboy · on Nov 8, 2023

> Sometimes they are literally an OCR of the print book! You can tell because of the symbol substitutions.

Not really. TeX for example does a lot of weird character substitution behind the scenes to mangle the layout.

InitialLastName · on Nov 8, 2023

how often does TeX substitute "rn" for "m" though?

kqr · on Nov 9, 2023

Comma instead of subscript i is one of my pet peeves.

_oghd · on Nov 8, 2023

my friend in college bought textbooks before the first semester when their parents took them to the book store. then a classmate showed them libgen and they didn't buy any more textbooks except the one weird custom textbook by some professor.

i guess they buy all their books now that they have a job, but also with generous use of used book stores and physical libraries.

oh and don't buy through amazon, especially kindle.

aegypti · on Nov 8, 2023

This pretty much mirrors my experience exactly, but these days I lean a lot more heavily on the Libby app for borrowing from my library which is amazing.

What’s the issue with Amazon/Kindle though, the fakes?

_oghd · on Nov 8, 2023

their de-facto monopoly status on basically all things books and increasingly on publishing, and of course their absolutely inane DRM.

libby is amazing, people are completely sleeping on it if you're not using it. free e-books!

wolverine876 · on Nov 8, 2023

Talking to some students, I'm told that people often freely pass a book on to the next semester's students.

fyokdrigd · on Nov 8, 2023

it's actually very easy to measure. the audience here is close to the 1%, so your use case is already discarded in the grand scheme with a 2% error margin when measuring behavior for the large population.

vmfunction · on Nov 8, 2023

Yup hnews is not a good place to do market research for the general population.

crest · on Nov 8, 2023

But it is a good place to get an insight into what could motivate enough technically minded (and connected) people to sustain such risky projects.

unixhero · on Nov 8, 2023

I used it to not have to pay.

I buy 10 books per year off Amazon and the occasional Economist magazine. I am of course not paying for academic journal access or tons of academic books outside my field.

dacryn · on Nov 8, 2023

I use it to wade through the masses.

Download 10 ebooks on a subject, scroll through, read a few excerpts, and the one that I actually start reading, I buy from amazon or a local bookstore and actually read it.

Many books look great, but once you start reading them, are total garbage. This is in the field of IT literature though, so books tend to be very hit or miss, and quickly outdated.

nicolas_t · on Nov 10, 2023

I've done that a lot especially with books on raising a child, education etc... Most books are bad, I don't want to pay until I know the book is good.

uniqueuid · on Nov 8, 2023

Great to see this here.

The author Balázs Bodó is an all around great guy with very refreshing research methods and an immensely broad portfolio of things he works on [1].

[1] https://www.uva.nl/en/profile/b/o/b.bodo/b.bodo.html#Publica...

JacobSeated · on Nov 8, 2023

Not to devalue the tireless work of authors, but we must also recognize that students in particular do not have enough money to pay for all these books.

At the same time, it can be argued that writing a single book, and selling it thousands of times over, is a bit too "easy" in terms of ways to make money. The hard part is to get people to actually buy your book – it does not matter that you wrote the best book in the world if nobody gives you well deserved attention for it.

Bloggers suffer from the same problem, and it is not necessarily because their work is bad. The truth of the matter just is, nobody cares about quality information anymore (And I am guilty of this as well).

We want fast and summarized answers so we can move on to the important part: solving whatever concrete problem we are working on.

AI will probably dilute the little remaining value of information even further, and at a point, nobody will manually write comprehensive information anymore. At least not unassisted by AI, and while the quality may suffer, we must also realize that we do not really need 100% accurate information. If we get a statistically significant amount of accurate AI provided information, then there is no need for anyone to write books anymore. It will be a completely unappreciated waste of their time, and nobody is going to buy them.

Even now that I am in a decent job, I still prefer not to buy books, instead relying on free sources on the internet (not piracy). If a given book/information is not available for free, then it is often not important enough for me to bother (note often – not always).

It is also a matter of prioritizing – reading a book takes me way too long, and the process is far from comfortable due to my slow reading, and for that reason alone I tend to avoid reading entire books. It strikes me as an antiquated way to gain information even without AI. I may open a specific chapter of interest, but reading the entire thing is painfully tedious, and probably unnecessary.

taopai · on Nov 8, 2023

I read lots of non fiction books. I think the same.

Most of them could condense their contents into a couple of chapters.

It would be great to have modular books, like the Emacs manual. If sections were independent modules you could rearrange the book or even create books from a series of sections from different books.

That way you could choose different outlines, maybe predefined by the author, to create books tailored to your needs:

- General Ideas.

- General Ideas + observed cases

- Mixed outline (Concepts + Stories + Conclusions)

- All details about one topic

generalizations · on Nov 8, 2023

This is what's done with sufficiently academic books in the hard sciences. They are so modular, that books are simply topics with each chapter written by a chosen author, and those authors will treat the chapters they've written similarly to papers they've had published.

I think if you dive deeper into more rigorous nonfiction books you'll find that less time is spent on the 'pop' side of popsci literature. Which might be where you're encountering that fluff.

whycome · on Nov 9, 2023

I hate to bring it up, but AI would be perfect for that. It could create logical segues between the new chapter order. So you could basically create books on demand based on real content with only the AI providing context.

getwiththeprog · on Nov 9, 2023

"writing a single book, and selling it thousands of times over, is a bit too "easy" in terms of ways to make money"

$2 profit per book is perhaps a high figure that an author may receive from the publisher. 2 x 3000 is $6000 for maybe months or years of work. And this would be a 'successful' author. It's not all JK Rowling out there ya know!

fikama · on Nov 8, 2023

I am curious, what books did you find important enough to bother? So my list of (shame) books to read could grow even longer.

squigz · on Nov 8, 2023

Why do people buy books now, instead of just reading reviews?

eclecticfrank · on Nov 8, 2023

Obvious troll, but anyway: Reviews are there to help identify if books are worth reading/buying. Reviews cannot and are not supposed to replace books.

squigz · on Nov 8, 2023

Well I'm certainly not trolling you but okay. Have a good day.

sycamoretrees · on Nov 8, 2023

Why do people go to see movies when they could just watch the trailer on YouTube?

squigz · on Nov 8, 2023

Well I'm not the one claiming people are going to stop reading because of AI, so I'm not the one to ask :)

DiggyJohnson · on Nov 8, 2023

I don't understand this question. Book reviews are not summaries. And even if they were, "why do people read books, instead of just reading summaries?" is still a ridiculous question.

squigz · on Nov 8, 2023

It is ridiculous. That's the point. Same as suggesting people would stop reading books in favor of summaries written by AI

And indeed, 'review' was the wrong word to use, but I appreciate you understanding what I meant

agumonkey · on Nov 8, 2023

people do watch movies / shows in 1.5x now.. it's common.

charcircuit · on Nov 8, 2023

>but we must also recognize that students in particular do not have enough money to pay for all these books.

Sure they do. They happen to be able to pay for food, water, electricity, rent, tuition, transportation, pens, pencils, paper, etc just fine.

JR1427 · on Nov 8, 2023

Academic books are terribly expensive.

When I was at university (Oxford, UK, 2009), I bought about four key books, and spent well over 100GBP. I simply couldn't afford more. I had a student loan which covered tuition and some of my living costs, as well as a bursary from the university which covered some of my other living costs (but not all!).

What was annoying was that our library didn't have enough books for all the students. We'd all be assigned the same reading list for the week, and then have to race to the library to get the books before they were all gone.

gizajob · on Nov 8, 2023

If people could download any of those things for free off the internet, they would.

fragmede · on Nov 8, 2023

by "just fine", you mean "go into crippling amounts of debt that isn't dischargeable by bankruptcy"?

webefbskj · on Nov 8, 2023

Imagine someone from a 3rd world country. Let's say India.

Even books like K&R C, Tanenbaum operating systems or CLRS / Skiena Algorithms will be north of 1000 INR in India.

Count 5 - 6 such books per semester, that's at least 5k - 10k INR. But for obscure books the price often goes to 5K for a single book. Let's say 10K INR.

Which is a significant amount, and for some students can exceed a month's living expenses for a semester.

So they're hesitant to spend that much. Often they end up with shittily written local books.

Now imagine you want to consult some book for specialist topics, like Windows internals or something, you will have to sell an organ.

Source: I was the Indian student.

abridges6523 · on Nov 8, 2023

“Just fine” here means riddled with debt.

eclecticfrank · on Nov 8, 2023

Libgen fills an important gap in the paid archives/libraries. There are always some books, papers or articles that are not available through my university library.

It's a huge help to have documents available immediately and not having to pay a small fortune for a single document that might prove irrelevant.

Cenk · on Nov 8, 2023

If you are interested in Library Genesis (and shadow libraries in general) I can recommend this book by Joe Karaganis: https://boook.link/Shadow-Libraries

abridges6523 · on Nov 8, 2023

But which shadow library can I find this book on?

Funes- · on Nov 8, 2023

Just click on "PDF".

ajsnigrutin · on Nov 8, 2023

PDFs suck on e-readers.

But you can get an epub on at least one of the shadow libraries :)

t-3 · on Nov 8, 2023

They aren't too bad with modern reader software, you can automatically trim margins and set the viewport size and scan pattern to match the text (makes double-column papers much nicer to read).

Funes- · on Nov 9, 2023

>PDFs suck on e-readers.

They are delightful to read on a ~13-inch e-reader.

hodanli · on Nov 8, 2023

This is an incredibly valuable resource for those outside of the US and EU, where the prices for books in English (imported) are extremely high.

crvdgc · on Nov 8, 2023

Interesting research. However, in practical terms, I wouldn't count Kindle as a form of digital availability, unless the book is composed of pure text.

Browsing figures or tables is usually a very bad experience. The figures are usually in low resolution. And the tables are sometimes just pictures. Even if the publisher bothered to encode the table as a table, if it spans over one screen size, navigation becomes very hard. Not to mention symbols and formulas.

tschwimmer · on Nov 8, 2023

I predict that Libgen/SciHub and pirate sites in general will be rendered functionally inaccessible in the developed west in the coming years. My reasoning is as follows:

Historically, IP owners have tried various strategies to increase revenues by curtailing infringement. In the early 2000s, IP owners tried to extract revenue directly from pirates via honeypot torrents and ISP subpoenas. Since about 2010, IP owners have shifted towards a more passive approach where they prioritize infringement for commercial purposes only and largely leave piracy for personal consumption alone. It just didn't make sense to chase after teenagers and students who probably wouldn't have had the money to make a legit purchase anyway.

LLMs trained on huge corpuses that include pirated content like LibGen change everything. Now, these IP holders face an existential (or at least severe) threat to their business models in the form of AI generated content. At the very least, these IP holders missed out on a massive opportunity to extract some of the wealth created by AIs by virtue of their laxity in going after easily available pirate content.

I'd expect to see a strong swing back towards very draconian enforcement against even personal infringement: domestic ISP DNS blocking, perhaps even mandatory browser or operating system level blocking of infringement.

dzonga · on Nov 8, 2023

library genesis is a god-send.

there are plenty of books that are out of print or that you won't find at specialized retailers but would find on lib-gen.

bigbacaloa · on Nov 8, 2023

In Spain my home internet provider (Movistar) blocks access to LibGen and the like. My university, however, does not. It realizes that de facto these sources serve as the library for its professors. So I wind up having to connect through the university VPN anyway ...

wyan · on Nov 9, 2023

They usually block via DNS, so it's pretty easy to circumvent by using alternative DNS servers such as 1.1.1.1 or 9.9.9.9, or Tor (which you sometimes need for books that have been DMCA'd)