Yup, they win. My biggest SQLite database is 1.7TB with, as of just now, 2,314,851,188 records (all JSON documents with a few keyword indexes via json_extract).
Works like a charm, as in: the web app consuming the API linked to it returns paginated results for any relevant search term within a second or so, for a handful of concurrent users.
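For anyone curious what that looks like concretely, here's a minimal sketch using Python's built-in sqlite3 module (table and field names are made up, since the actual schema isn't shown): an expression index on json_extract is what lets keyword lookups avoid re-parsing every document.

```python
import sqlite3

# Hypothetical schema: one table of JSON documents stored in a TEXT column.
conn = sqlite3.connect("documents.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS docs (
        id   INTEGER PRIMARY KEY,
        body TEXT NOT NULL CHECK (json_valid(body))
    );

    -- Expression index on one JSON field, so lookups on that keyword
    -- don't have to scan and re-parse every document.
    CREATE INDEX IF NOT EXISTS idx_docs_keyword
        ON docs (json_extract(body, '$.keyword'));
""")

# This query can use idx_docs_keyword instead of a full table scan,
# which is what keeps paginated searches fast at billions of rows.
rows = conn.execute(
    "SELECT id FROM docs WHERE json_extract(body, '$.keyword') = ? LIMIT 50",
    ("sqlite",),
).fetchall()
```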
I think FS-level compression would be a perfect match. Has anyone tried it successfully on large SQLite DBs? (I tried with btrfs, but it failed to compress the database, and I didn't get to the bottom of why.)
> I think FS-level compression would be a perfect match. Has anyone tried it successfully on large SQLite DBs?
I've had decent success with `sqlite-zstd`[0], which does row-level compression, but only on small (~10GB) databases. No reason why it couldn't work for bigger DBs, though.
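Roughly, enabling it looks like this. This is a sketch based on my reading of the sqlite-zstd README, not a drop-in recipe: the extension filename, the config keys, and the `dict_chooser` expression are all worth double-checking against the project docs, and the table/column names are placeholders.

```python
import sqlite3

conn = sqlite3.connect("documents.db")
conn.enable_load_extension(True)
# The compiled extension's filename varies by platform and build.
conn.load_extension("libsqlite_zstd")

# Turn on transparent row-level compression for one column.
# zstd_enable_transparent() takes a JSON config; these key names follow
# the sqlite-zstd README and may differ between versions.
conn.execute("""
    SELECT zstd_enable_transparent('{
        "table": "docs",
        "column": "body",
        "compression_level": 19,
        "dict_chooser": "''a''"
    }')
""")

# Rows are compressed in the background; call this periodically to let
# maintenance run for up to N seconds at a given load fraction.
conn.execute("SELECT zstd_incremental_maintenance(60, 0.5)")
conn.commit()
```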
I can see that you're a user of AWS. Check some prices on dedicated servers one day. They're an order of magnitude cheaper than similar AWS instances, and more powerful because all compute and storage resources are local and unshared.
They do have a higher price floor, though. There are no $5/month dedicated servers anywhere; the cheapest is more like $40. There are, however, $5/month virtual servers outside of AWS that are cheaper and more powerful than $5/month AWS instances.
A Windows Server VM on a self-hosted Hyper-V box, which has a whole bunch of 8TB NVMe drives; this VM has a 4TB virtual volume on one of those (plus a much smaller OS volume on another).
Using the SQLite backup API, which pretty much corresponds to the .backup CLI command. It doesn't block any reads or writes, so the performance impact is minimal, even if you do it directly to slow-ish storage.
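Most SQLite bindings expose the same backup API; for example, with Python's built-in sqlite3 module a scripted backup looks roughly like this (paths are placeholders):

```python
import sqlite3

src = sqlite3.connect("documents.db")
dst = sqlite3.connect("/mnt/backup/documents-copy.db")

def show_progress(status, remaining, total):
    # Called after each batch of pages; handy for logging long backups.
    print(f"{total - remaining}/{total} pages copied")

# Copies the database 1000 pages at a time. Readers and writers on the
# source keep running; if another connection writes mid-copy, SQLite
# restarts the backup, so the result is always a consistent snapshot.
src.backup(dst, pages=1000, progress=show_progress)

dst.close()
src.close()
```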
That's not great advice for a large database (and I wouldn't recommend it for small databases either). It happens to work when the db is small enough that the copy is nearly atomic, but with a big copy you can end up with a corrupt database. SQLite is designed such that a crash at any time does not corrupt the database. It's not designed such that a database can be copied linearly from beginning to end while it's being written to without corruption. Simply copying the database is only good enough if you can ensure that no write transactions are open during the entire copy.
A reliable backup will need to use the .backup command, the .dump command, the backup API[0], or filesystem or volume snapshots to get an atomic snapshot.
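The .dump route is also easy to script: Python's sqlite3, for instance, exposes it as `Connection.iterdump()`, which streams the whole database as SQL text. A sketch (paths are placeholders; for a multi-TB database the backup API is usually the more practical choice):

```python
import gzip
import sqlite3

src = sqlite3.connect("documents.db")

# iterdump() yields the same SQL text as the CLI's .dump command:
# schema plus INSERT statements that can rebuild the database from scratch.
# If writers may be active during the dump, prefer the backup API above
# (or a filesystem snapshot) to be sure of getting an atomic copy.
with gzip.open("/mnt/backup/documents.sql.gz", "wt", encoding="utf-8") as out:
    for statement in src.iterdump():
        out.write(statement + "\n")

src.close()
```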