
I was annoyed by Wikipedia's noisy design choices, so I built a very scaled-down, read-friendly mirror of Wikipedia that presents the text as though it were a book, doing away with inline links. It's desktop-first, as I don't really use mobile browsers; my fingers are too big.

https://encyclopedia.marginalia.nu/article/Hacker_News



I like this. How does it work? Is there an API you use to obtain the text from the wiki articles?


It's a script that reads OpenZIM files, cleans up the MediaWiki HTML, and stores the result in a SQLite database. The encyclopedia service then does some basic rendering to produce this website. Search and navigation are provided by tortured misuse of a PATRICIA trie.

It's very much a one-off sort of a thing that's not super well documented, but here are the sources: https://github.com/MarginaliaSearch/encyclopedia.marginalia....

It's a surprisingly light service. I used to host it on a raspberry pi 4 for a long while, and it wasn't significantly slower than it is today.
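
For anyone curious what the "clean up the HTML, store it in SQLite" step might look like, here is a minimal sketch in Python. It is not the project's actual code: the `LinkStripper` class, the `article` table layout, and the function names are all made up for illustration, and it takes already-extracted HTML strings as input instead of reading a ZIM file. It only shows the general idea of dropping inline links while keeping their text.

```python
import html
import sqlite3
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Re-emit HTML while dropping <a> tags; the link text itself is kept."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # Attributes are dropped to keep the sketch short.
            self.parts.append(f"<{tag}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, not as a pair.
        if tag != "a":
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag != "a":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape them.
        self.parts.append(html.escape(data, quote=False))


def strip_links(html_text):
    parser = LinkStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


def store_articles(conn, articles):
    """Store (title, raw_html) pairs with inline links removed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS article ("
        "title TEXT PRIMARY KEY, html TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO article VALUES (?, ?)",
        ((title, strip_links(raw)) for title, raw in articles),
    )
    conn.commit()


def load_article(conn, title):
    row = conn.execute(
        "SELECT html FROM article WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

A real pipeline would also have to deal with tables, references, infoboxes and so on, which this sketch ignores entirely.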


From the footer:

>The wikipedia contents are from OpenZIM dumps, which typically lag behind the main Wikipedia project by up to a year.

That being said, while it wouldn't be a good fit for that project, Wikipedia, et al., have a fairly robust API: <https://www.mediawiki.org/wiki/API>
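
As a rough illustration of that API, here is a sketch in Python that builds an `action=parse` request and pulls the rendered HTML out of the JSON response. The helper names `build_parse_url` and `extract_html` are my own, and the English Wikipedia endpoint is assumed; no network call is made in the snippet itself.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: English Wikipedia's Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_parse_url(title, endpoint=API_ENDPOINT):
    """Build a URL asking the Action API for a page's rendered HTML."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "text",
        "format": "json",
        "formatversion": "2",  # with version 2, "text" is a plain string
    }
    return f"{endpoint}?{urlencode(params)}"


def extract_html(response_body):
    """Return the article HTML from a JSON response body."""
    payload = json.loads(response_body)
    if "error" in payload:
        raise ValueError(payload["error"].get("info", "unknown API error"))
    return payload["parse"]["text"]
```

To actually fetch a page you would pass the URL to something like `urllib.request.urlopen`, ideally with a descriptive `User-Agent` header, and hand the response body to `extract_html`.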


> The wikipedia contents are from OpenZIM dumps, which typically lag behind the main Wikipedia project by up to a year.

(edit: too late)


This is great! Would be nice to have some way of accessing inline images though.



