This looks really interesting. I have been digging deep into sync for the past f...

aboodman · on Jan 28, 2020

Be aware that CRDTs like automerge are solving a different (and harder) problem than Replicache. They are trying to implement convergence in an asynchronous system where there is no central authority. See the excellent article https://www.inkandswitch.com/local-first.html for more on this type of application.

Most classic web services don't have this requirement because they do in fact have a central authority -- the service itself.

Moreover, for web services, it is crucial that the central authority actually be authoritative. You don't want client and server state to get smooshed together arbitrarily. You want the client's view of the state to be a mere suggestion, one which the server always overrides.

So my view is that CRDTs are not really an appropriate basis for building this kind of feature in a web service.

However I think the tech is awesome (Replicache actually started out as a true CRDT and moved to its current design after extensive iteration with customers).

See https://www.figma.com/blog/how-figmas-multiplayer-technology... for how Figma came to the same conclusion wrt CRDTs for their service.

--

(P|C)ouchDB:

- Using couch as your backend db ends up being a nonstarter for most applications. A distributed multitenant database is a big big thing and a hugely important technical decision. Most orgs are not going to go with couch just to get sync. See https://medium.com/wandering-cto/mobile-syncing-with-couchba... for an example of this.

- The couchdb replication protocol offers no help with conflict resolution. It just tells you there was a conflict and gives you two conflicting documents. This isn't practical for most applications.

daleharvey · on Jan 29, 2020

PouchDB author here. Your project looks great, good job.

I certainly agree that switching backends to CouchDB has made it hard for people to adopt Pouch/Couch. I have often considered how I could make Pouch work with arbitrary data sources, but as you well know it's a tricky problem.

However, I don't understand from your website or your comment here how conflict resolution is easier. Given two clients that each receive initial data of {key: value}, suppose the clients go offline, one client writes {key: foo} and the other writes {key: bar}, and then both reconnect. What is the new state?

aboodman · on Jan 29, 2020

Hi Dale,

Thank you for the comment. I think there is a technical difference and an ergonomic difference:

1. The technical difference is that when you do conflict resolution with Replicache you have more information, specifically the intent of the mutations. Consider something very simple like a positive-only counter. The parent is `1` and the forks are `2` and `0`. Is the correct resolution `2`? Is it `0`? Or is it `1`? There's no way to know because we don't know what the intent of those changes was. Was fork 1 incrementing? Was it multiplying? Was fork 2 decrementing? By how much? Now multiply this simple example by real applications with many developers, many features, and many client versions in the wild. Having the intent of each change travel with the change is crucial.

2. The ergonomic difference is that conflict resolution in Replicache isn't something separate that is done after-the-fact. Replicache applies mutations to the server by calling normal HTTP APIs, just with potentially old arguments. This forces developers to consider conflict resolution at the point they are writing APIs, and keeps conflict resolution code colocated with the corresponding services.
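The counter example in point 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: `Mutation`, `apply_mutation`, and `replay` are hypothetical names, not Replicache's API, and "positive-only" is read here as "never drops below 0". The point is that the fork states `2` and `0` alone are ambiguous, while mutations that carry their intent can simply be replayed against the authoritative state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mutation:
    """A change that carries its intent (hypothetical type for illustration)."""
    name: str    # "increment" or "decrement"
    amount: int


def apply_mutation(state: int, m: Mutation) -> int:
    """Apply one mutation to a counter that is assumed never to go below 0."""
    if m.name == "increment":
        return state + m.amount
    if m.name == "decrement":
        return max(0, state - m.amount)
    raise ValueError(f"unknown mutation: {m.name}")


def replay(state: int, mutations: List[Mutation]) -> int:
    """Replay mutations in order against the authoritative state."""
    for m in mutations:
        state = apply_mutation(state, m)
    return state


parent = 1
fork_1 = [Mutation("increment", 1)]  # this client saw 1 and ended at 2
fork_2 = [Mutation("decrement", 1)]  # this client saw 1 and ended at 0

# With only the states (1, 2, 0) there is no principled merge.
# With the intents, the server replays both against its own state.
resolved = replay(parent, fork_1 + fork_2)
```

If fork 1 had instead been a hypothetical "double" mutation, it would have produced the same local state `2` but a different replay result in general, which is exactly the information a state-only merge cannot see.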

sizediterable · on Jan 29, 2020

> This forces developers to consider conflict resolution at the point they are writing APIs

Would an example of what you have in mind be writing an atomic "/increment" endpoint instead of a "/set?value=possiblyOldValue+1" endpoint? (using GET notation instead of POST just to make illustration easier)
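The contrast in this question can be sketched with a toy in-memory server. Everything here is an assumption for illustration: the `Counter` class and its `set`/`increment` methods stand in for the two hypothetical endpoints and are not a real API.

```python
class Counter:
    """Toy stand-in for a server-side counter (hypothetical)."""

    def __init__(self, value: int) -> None:
        self.value = value

    def set(self, value: int) -> None:
        # Analogue of "/set?value=...": the client computes the new value.
        self.value = value

    def increment(self) -> None:
        # Analogue of "/increment": the server computes the new value.
        self.value += 1


def run_with_set() -> int:
    server = Counter(1)
    seen_by_a = server.value  # both clients read 1, then go offline
    seen_by_b = server.value
    server.set(seen_by_a + 1)  # client A reconnects and sends set(2)
    server.set(seen_by_b + 1)  # client B reconnects and also sends set(2)
    return server.value


def run_with_increment() -> int:
    server = Counter(1)
    server.increment()  # client A's queued increment
    server.increment()  # client B's queued increment
    return server.value
```

With `set`, the second write is computed from a possibly old value and one update is lost. With `increment`, both intents survive because the arithmetic happens against the server's current state.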

aboodman · on Jan 29, 2020

zffr · on Jan 30, 2020

From what I can tell from the website, Replicache retries HTTP requests periodically until they succeed. With a non-idempotent request like `/increment`, how does Replicache know when to stop retrying?

If the request succeeds but the response fails, could Replicache increment twice?

aboodman · on Jan 30, 2020

The APIs Replicache calls on the customer's service must be idempotent. Replicache passes a version vector to the customer's service, which the service uses to ensure this and to enforce causal consistency.

Customers must make several relatively small changes to their backends to support Replicache, including this one.
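One way a backend might make `/increment` safe to retry is sketched below. This is a simplification under loud assumptions: the version vector is reduced to a single per-client mutation counter, and `CounterService`, `client_id`, and `mutation_id` are hypothetical names, not Replicache's actual protocol.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CounterService:
    """Toy backend with a retry-safe increment (hypothetical)."""
    value: int = 0
    # Highest mutation ID already applied for each client (assumed bookkeeping).
    last_applied: Dict[str, int] = field(default_factory=dict)

    def increment(self, client_id: str, mutation_id: int) -> int:
        last = self.last_applied.get(client_id, 0)
        if mutation_id <= last:
            # Duplicate delivery (a retry after a lost response):
            # acknowledge it without applying the change again.
            return self.value
        if mutation_id > last + 1:
            # An earlier mutation from this client has not arrived yet.
            raise ValueError("mutation arrived out of order")
        self.value += 1
        self.last_applied[client_id] = mutation_id
        return self.value


service = CounterService()
first = service.increment("client-a", 1)   # applied
retry = service.increment("client-a", 1)   # same mutation retried, ignored
other = service.increment("client-b", 1)   # a different client, applied
```

In a real service the counter update and the `last_applied` update would need to happen in the same transaction, otherwise a crash between them reintroduces the double increment.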

oblib · on Jan 29, 2020

I think it's fair to offer a look at what CouchDB says about conflict resolution and what's provided to help manage it.

https://docs.couchdb.org/en/stable/replication/conflicts.htm...

After scanning that doc it looked to me like a developer could create a way to alert a user of conflicts and provide options to resolve them. On the client side it would seem like PouchDB might be used to create a friendly way to detect and help implement those decisions.

I just started looking into conflict resolution this month, and more out of curiosity than need. PouchDB's "Live Sync" feature for CouchDB works amazingly well, and that alone reduces the opportunities for conflicts substantially by ensuring that when you go offline you always have the latest data.

As a rule of thumb I'd say that when using an app offline the user should avoid editing living documents. PouchDB makes it easy enough to create a new document with that data for others to review and merge into the living document when you come back online.

But the reality is that conflicts will happen and we do need tools to deal with them, so it's great to see the Replicache team working on that.