The "Merkle tree" algorithm here isn't using a Merkle tree, it's just a binary p...

dan-robertson · on May 1, 2024

I think the author was using the search_after strategy (or something roughly equivalent). They were using ‘cursor’ in the generic sense as a place in the list of documents, rather than the specific durable cursors api offered by elasticsearch

ramsicandra · on May 3, 2024

I think this is a fair criticism. The Merkle Tree here isn't used. I was just inspired by the diagram and come up with binary partitioning solution.

In terms of performance, it's fair to say that this binary partitioning algo is slightly worse than a cursor / search after pagination since there is an overhead of checking count while the cursor pagination does not need to.

Hmm, It never cross my mind to change/correct the design of using ES as a primary data source. My guess now is it would take as much effort or higher to migrate between ES -> SQL compared to ES -> ES.

I think the snapshot approach is interesting. If I had to start over, I'll most likely explore that.