Circa 2003 I carried around a pared-down copy on a Pocket PC. Dropping a few chosen categories (who needs Sports?) allowed it to barely fit on a 1 GB SD card.
I was curious how they achieve this. It looks like the underlying file format uses LZMA, or optionally Zstd, compression. Both achieve pretty high compression ratios against plain text and markup.
> Its file compression uses LZMA2, as implemented by the xz-utils library, and, more recently, Zstandard. The openZIM project is sponsored by Wikimedia CH, and supported by the Wikimedia Foundation.
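For a rough feel of how LZMA2 behaves on markup, here is a minimal sketch using Python's standard `lzma` module (which writes the xz container with an LZMA2 filter by default), with `zlib` as a DEFLATE baseline. The sample text is made up and highly repetitive, so the ratios it prints will be far better than anything real articles achieve. It also does not read or write actual ZIM files, and Zstd is left out because it needs a third-party package on most Python versions.

```python
import lzma
import zlib

# Hypothetical sample: repetitive wiki-style markup, NOT real ZIM content.
sample = (
    "== History ==\n"
    "The '''city''' was founded in [[1850]] near the [[river]].\n"
    "{{Infobox settlement|name=Example|population=12345}}\n"
) * 2000
raw = sample.encode("utf-8")

# Default container is .xz, which uses the LZMA2 filter.
xz_blob = lzma.compress(raw, preset=9)
# DEFLATE at maximum level, purely as a point of comparison.
zlib_blob = zlib.compress(raw, 9)


def ratio(original: bytes, compressed: bytes) -> float:
    """Return how many times smaller the compressed data is."""
    return len(original) / len(compressed)


print(f"raw:  {len(raw)} bytes")
print(f"xz:   {len(xz_blob)} bytes ({ratio(raw, xz_blob):.0f}x)")
print(f"zlib: {len(zlib_blob)} bytes ({ratio(raw, zlib_blob):.0f}x)")
```

Swapping in a real article dump for `sample` would give a more honest number.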
The more important thing is that they aggressively downsize the images and omit the history and talk pages. Even if they were using LZW, it would probably only triple the file size.
File size is always an issue when downloading such big content, so we always produce each Wikipedia file in three flavours:
Mini: only the introduction of each article, plus the infobox. Saves about 95% of space vs. the full version.
nopic: full articles, but no images. About 75% smaller than the full version.
Maxi: the default full version.
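To put those percentages into numbers, here is a small sketch that estimates the other two flavours from the size of the Maxi file. The function name and the 100 GB input are hypothetical, and the 95% and 75% figures are the approximate savings quoted above, so treat the results as ballpark only.

```python
def flavour_sizes(maxi_gb: float) -> dict:
    """Estimate flavour sizes in GB from the full (Maxi) size.

    Uses the approximate savings quoted above: Mini saves about 95%,
    nopic saves about 75%. Real files will deviate from these.
    """
    return {
        "maxi": maxi_gb,
        "nopic": maxi_gb * (1 - 0.75),
        "mini": maxi_gb * (1 - 0.95),
    }


# Hypothetical example: a Maxi file of 100 GB.
for name, size in flavour_sizes(100.0).items():
    print(f"{name}: about {size:.0f} GB")
```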