I’m building a local-first web app, and SQLite works well for my case since a single project can be contained in one database file, just like users are used to with existing desktop applications.
What I’d really like is an easy way to sync the SQLite database state to a cloud service. Most existing options expect you to query against a remotely hosted database and charge per read/write.
Since the database will have around 100,000 rows and you're typically working with all the data at once, streaming parts of it doesn’t make sense for my use case.
The closest I’ve found is Turso, which has offline writes in private beta, and SQLite Cloud, which lists local-first and offline sync as "coming soon."
The simplest approach might be letting users push to S3 storage with versioning. Ideally, it would also support point-in-time restores, tracking incremental updates alongside full snapshots.
Even better, I’d manage minimal server-side infrastructure and just pull the SQLite database from a service that handles syncing and management.
SQLite has a session extension that can record changes made to a local database into a changeset, and you can replay those changes on another SQLite instance. Note that it replays what the changes were, not the queries that produced them. When applying changes you provide a conflict handler. (You can also invert changesets, which makes for a handy undo/redo feature.)
You can save conflicts to another changeset. There is also a rebaser to help deal with multi-way syncing.
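To make that concrete, here is a minimal sketch of the session API in C. It assumes SQLite was built with SQLITE_ENABLE_SESSION and SQLITE_ENABLE_PREUPDATE_HOOK; the "notes" table and file names are made up for illustration, and error checking is omitted.

    /*
     * Minimal sketch of the session extension: record changes made on one
     * database and replay them on another. Assumes SQLITE_ENABLE_SESSION
     * and SQLITE_ENABLE_PREUPDATE_HOOK; the "notes" table and file names
     * are made up, and error checking is omitted.
     */
    #include <sqlite3.h>

    /* Conflict handler: called for each change that cannot apply cleanly. */
    static int on_conflict(void *ctx, int reason, sqlite3_changeset_iter *it) {
        (void)ctx; (void)it;
        /* Take the incoming row on a data conflict; keep the local row
           (omit the change) for every other kind of conflict. */
        return reason == SQLITE_CHANGESET_DATA ? SQLITE_CHANGESET_REPLACE
                                               : SQLITE_CHANGESET_OMIT;
    }

    int main(void) {
        sqlite3 *local, *remote;
        sqlite3_session *session;
        void *changeset;
        int nbytes;

        sqlite3_open("local.db", &local);
        sqlite3_open("remote.db", &remote);

        /* Start recording changes to every table in the "main" database. */
        sqlite3session_create(local, "main", &session);
        sqlite3session_attach(session, NULL);          /* NULL = all tables */

        /* ... the application does its ordinary writes ... */
        sqlite3_exec(local, "UPDATE notes SET body='hi' WHERE id=1", 0, 0, 0);

        /* Serialize what changed: row images, not the SQL that caused them. */
        sqlite3session_changeset(session, &nbytes, &changeset);

        /* Replay the recorded changes on the other database. */
        sqlite3changeset_apply(remote, nbytes, changeset,
                               NULL /* table filter */, on_conflict, NULL);

        sqlite3_free(changeset);
        sqlite3session_delete(session);
        sqlite3_close(local);
        sqlite3_close(remote);
        return 0;
    }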
There's also a CRDT version of this, which allows two databases to be synced to each other in real time (i.e., updates to one will eventually make it to the other, and both databases will eventually contain the same data).
It's https://vlcn.io/docs/cr-sqlite/intro, and I find it amazing that this is doable in SQLite. It is perfect for small-scale collaboration imho, but it also works to sync between a local client and a remote server (for a single-db-per-user scenario).
Interesting link; it'd be great if their solution meets expectations.
Right now, the proof-of-concept they've provided seems simplistic. Their focus seems to have shifted from cr-sqlite to "Zero" instead. I'm guessing it has something to do with CRDTs being quite app-specific and hard to generalize.
I would want to see this library used in production before hyping it.
In a sense it is quite specific. In a different sense, this is as generic a CRDT as you can get - it's a CRDT on table(s). There's no merging of rows iirc (unless you write a custom merge, which is supported but probably needs some tweaking and could lead to poor results?).
Maybe I am misunderstanding which part you want in the cloud, but that sounds like litestream. It lets you transparently back up a live SQLite database to a remote destination.
I depend on litestream for production backups and as the months wear on without any releases I am getting more nervous. To be clear, I don’t feel entitled to anything with an open source project like this, but bug reports and fixes seem to be accumulating. I have flirted with the idea of building from main.
I’ve also flirted with the idea of forking litestream and stripping it down dramatically. The reason why is that I don’t like the idea of the production server being in charge of rotation and deletion. It seems like the thing getting backed up shouldn’t have the privilege of deleting backups in case it gets compromised. I might even go so far as to propose that the “even liter stream” process merely writes to a different local volume and then some other process does the uploading but I haven’t gotten beyond the daydream stage.
> It seems like the thing getting backed up shouldn’t have the privilege of deleting backups in case it gets compromised.
(agreed)
> For backups, I added a nightly cron job which exports my SQLite db to a write-only S3 bucket.
Why not just do this with an s3 sync instead? You can safely back up SQLite databases while they're being written to, so there's no need to export (dump) them; just copy the files themselves.
This might mean that your entire backup/restore strategy is just to copy some files. If so, that's ideal.
(Of course, s3 sync does require reading as well as writing, so perhaps just run your cron job more often so it fits within your RPO.)
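If you want a guaranteed-consistent copy to hand to s3 sync, SQLite's online backup API is one way to take it from a live database. A rough sketch in C; the paths are placeholders and error handling is omitted:

    /*
     * Sketch: take a consistent copy of a live SQLite database with the
     * online backup API, which is safe even while the source is being
     * written to. Paths are placeholders and error handling is omitted.
     */
    #include <sqlite3.h>

    int snapshot(const char *src_path, const char *dst_path) {
        sqlite3 *src, *dst;
        sqlite3_open(src_path, &src);
        sqlite3_open(dst_path, &dst);

        sqlite3_backup *b = sqlite3_backup_init(dst, "main", src, "main");
        if (b) {
            sqlite3_backup_step(b, -1);    /* -1 = copy all pages in one pass */
            sqlite3_backup_finish(b);
        }
        int rc = sqlite3_errcode(dst);     /* SQLITE_OK on success */

        sqlite3_close(src);
        sqlite3_close(dst);
        return rc;
    }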
I think the implication isn't that there are bugs they are immediately concerned about, but that other issues not being addressed might mean that, should they run into a bug that does cause problems, there may not be a timely solution, if any.
Offline-first databases are a hard problem because there isn't just one copy of the database on the user's side, there are N copies - every browser tab or device on which the user can open the local database and make an edit. It's basically an AP multi-master database (= the same row can be edited at different nodes at the same time), and you likely cannot achieve good results without a database that natively supports multi-master operations.
That’s not necessarily true; if you use the Origin Private File System along with a Web Worker that acts as a local database server and works off a single SQLite database, you at least have a single DB file per device. From there on, your problem becomes state reconciliation on the server, which CRDTs should help solve.
Not an easy problem for sure, but the web platform is surprisingly capable these days.
CRDTs are so-so and likely cause issues with maintaining relational DBs' transactional consistency. There's a reason none of the NewSQL databases (to my knowledge) are multi-leader.
I too think that CRDT databases are probably something you should explore. You generally have the whole database locally, and changes get synced pretty easily (but you have to live within the rules of your CRDT).
I have long since lost touch with the state of it, but at the time the syncing to their server was fast and worked with a long list of environments/languages.
The one thing I will caution: their model was that you almost had to have a database per customer. You could have a second one that contained common information, but they had no concept of only syncing part of a database based on some logic. So many customer implementations had the clients syncing multiple databases, and then a back-end client that would aggregate the data from all of those databases into one for backend processes. Extra complexity that I always thought was a real killer.
Isn't the simplest way to "sync" to just replace the remote database file with the local database file? One of the nice things about each database being encapsulated as a single file.
You could do a checkpoint first though, I believe? And if the database is only being updated on your local client, I don't think WAL mode would have much benefit since it's probably not getting many concurrent writes.
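For illustration, a checkpoint before the copy could be as simple as this sketch, which assumes WAL mode and that no other connection is writing at that moment:

    /*
     * Sketch: fold the WAL back into the main database file so that copying
     * the single .db file captures everything. Assumes WAL mode and that no
     * other connection is writing at the time.
     */
    #include <sqlite3.h>

    void checkpoint_before_upload(sqlite3 *db) {
        /* TRUNCATE checkpoints every frame and resets the WAL to zero bytes. */
        sqlite3_exec(db, "PRAGMA wal_checkpoint(TRUNCATE);", 0, 0, 0);
        /* ... now copy or upload the .db file ... */
    }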
> What I’d really like is an easy way to sync the SQLite database state to a cloud service.
Don't do this, but an application I used to work on (I was replacing it) copied the SQLite file to a folder and then used rsync to sync it with a backup node. Apparently it worked and was good enough for that use case (an inefficient PHP backend application with at most a dozen concurrent users).
100,000 rows is only a few megabytes at most, right? Should be fine.
What's wrong with that? Of course it will work fine; SQLite, with or without WAL, has a ton of protections against corruption from writes-in-progress, which is what makes hot backups work.
How about: Have 1 + N separate SQLite database-files.
Each user would have their own database-file which contains only information about that user. Then 1 shared database-file which contains info needed for all users.
Users would update their own data, which is a small database file which can be easily uploaded. They would not need to update the shared data.
Not knowing your app I don't know what the shared data would contain, presumably something. Perhaps the shared data-file would be updated on the server based on what individual user-data the users upload.
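To make the 1 + N idea concrete, a client could open its per-user file and ATTACH the shared one, so queries can join across both while only the small per-user file ever needs to be uploaded. A sketch; all file and table names here are invented:

    /*
     * Sketch: open the per-user database and ATTACH the shared one, so
     * queries can join across both while only the small per-user file ever
     * needs to be uploaded. All file and table names are invented.
     */
    #include <sqlite3.h>

    int open_project(sqlite3 **out) {
        sqlite3 *db;
        if (sqlite3_open("user-42.db", &db) != SQLITE_OK) return 1;

        /* Shared, read-mostly data lives in its own file. */
        sqlite3_exec(db, "ATTACH DATABASE 'shared.db' AS shared;", 0, 0, 0);

        /* Cross-file joins work as if it were a single database. */
        sqlite3_exec(db,
            "SELECT t.title, c.name "
            "FROM tasks t JOIN shared.categories c ON c.id = t.category_id;",
            0, 0, 0);

        *out = db;
        return 0;
    }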
> expect users to connect to the service using multiple devices (clients).
But probably using only one device at a time by a single user?
My thought here, and it is just a thought, is that instead of trying to provide a GENERAL solution for all kinds of data-update patterns, it is often possible to think in terms of what your current application specifically needs. It is easier to come up with such a per-app solution with SQLite because SQLite is so "lite".
I can't speak for the "general solution" except to say that many times you don't need an all-encompassing general solution, just a solution for your current app.
> But probably using only one device at a time by a single user?
It depends on your expectations of concurrent use. Computer + tablet + phone means many users may use different devices within seconds of each other. If you want to support offline-first usage, concurrent updates from different clients for the same user becomes more likely.
A simple, manual backup would be fine I think. You can just put an "upload" or "backup to cloud" button that lets the user push a full, timestamped version to S3.
Synchronization may introduce a lot more problems, especially when you want to automatically sync the database to some other place. You will need to deal with sync errors, inconsistency, version conflicts, rollbacks...
If your users can accept that, a simple full-version backup is the best solution.
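Producing that timestamped snapshot from inside the app is straightforward; for example, VACUUM INTO (SQLite 3.27+) writes a compact, self-contained copy that the upload button can then push as-is. A sketch; the file name format is arbitrary:

    /*
     * Sketch: write a compact, self-contained, timestamped snapshot of the
     * open database, which the "backup to cloud" button can then upload
     * as-is. Requires SQLite 3.27+; the file name format is arbitrary.
     */
    #include <sqlite3.h>
    #include <stdio.h>
    #include <time.h>

    int export_snapshot(sqlite3 *db) {
        char name[64], sql[128];
        time_t now = time(NULL);
        strftime(name, sizeof name, "backup-%Y%m%d-%H%M%S.db", localtime(&now));
        snprintf(sql, sizeof sql, "VACUUM INTO '%s';", name);
        return sqlite3_exec(db, sql, 0, 0, 0);    /* SQLITE_OK on success */
    }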
Dolt would do that for you. It has push/pull semantics like git. As a bonus you can use its version control features to implement sophisticated undo/redo features.
I've wanted to use SQLite a few times for the simplicity. I always end up using Postgres though because I don't understand how multiple services / replicas can make use of it. If another piece of infrastructure is needed to support it (even nfs), that seemingly counters any simplicity gains.