
This article does not mention Jumbostore (Kave Eshghi, Mark Lillibridge, Lawrence Wilcock, Guillaume Belrose, and Rycharde Hawkes), which in 2007 applied content-defined chunking recursively to the chunk list of a content-defined-chunked file. This is exactly what a Prolly Tree is.
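For anyone who hasn't seen the idea, here is a rough Python sketch of recursive content-defined chunking on a blob. It is not Jumbostore's actual code or parameters: the names (`chunk`, `put`, `get`), the window size, the mask, and the minimum chunk size are all assumptions picked for illustration, and SHA-256 over a trailing window stands in for a real rolling hash.

```python
import hashlib

HASH_LEN = 32        # SHA-256 digest size in bytes
WINDOW = 16          # trailing bytes that decide a boundary (assumed value)
MIN_CHUNK = 64       # assumed minimum; >= 2 hashes per upper-level chunk, so levels shrink
MASK = (1 << 6) - 1  # cut when the low 6 bits are zero (about 1 in 64 positions)


def _boundary(window: bytes) -> bool:
    # Stand-in for a rolling hash: hash the trailing window and test its low bits.
    digest = hashlib.sha256(window).digest()
    return int.from_bytes(digest[:4], "big") & MASK == 0


def chunk(data: bytes) -> list[bytes]:
    """Split data at positions chosen by its content, not by fixed offsets."""
    chunks, start = [], 0
    for end in range(1, len(data) + 1):
        if end - start >= MIN_CHUNK and _boundary(data[end - WINDOW:end]):
            chunks.append(data[start:end])
            start = end
    if start < len(data) or not chunks:
        chunks.append(data[start:])  # trailing remainder (or the empty blob)
    return chunks


def put(data: bytes, store: dict) -> tuple[bytes, int]:
    """Chunk the blob, then chunk the list of chunk hashes, until one chunk remains."""
    level, depth = data, 0
    while True:
        hashes = []
        for piece in chunk(level):
            key = hashlib.sha256(piece).digest()
            store[key] = piece
            hashes.append(key)
        if len(hashes) == 1:
            return hashes[0], depth
        level, depth = b"".join(hashes), depth + 1


def get(root: bytes, depth: int, store: dict) -> bytes:
    """Walk back down: each level is a concatenation of hashes of the level below."""
    level = store[root]
    for _ in range(depth):
        level = b"".join(
            store[level[i:i + HASH_LEN]] for i in range(0, len(level), HASH_LEN)
        )
    return level
```

The point of chunking the hash list the same way is that a small edit to the blob only changes a few chunks at each level, so most of the structure is shared between versions.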


I was aware of this kind of structure when I coined 'prolly tree'. It's the same thing bup was doing, which I referenced in our design docs:

https://github.com/attic-labs/noms/blob/master/doc/intro.md#...

The reason I thought a new name was warranted is that a prolly tree stores structured data (a sorted set of k/v pairs, like a b-tree), not blob data. And it has the same interface and utility as a b-tree.

Is it a huge difference? No. A pretty minor adaptation of an existing idea. But still different enough to warrant a different name IMO.


My use of "exactly" was an overstatement. The important difference is that internal nodes in a prolly tree contain not only the hashes of the child nodes but also the index keys, as in a B-tree. The divisions at each level, however, are decided in a similar way: by applying a content-defined chunking method to the entire level of the tree.
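To make that difference concrete, here is a minimal Python sketch of the keyed variant. It is not the noms implementation: `build`, `lookup`, the JSON encoding, the mask, and the two-entry minimum per node are all assumptions made for illustration. Each internal entry holds the last key of a child plus that child's hash, and every level is cut into nodes by hashing its entries.

```python
import hashlib
import json

MASK = (1 << 2) - 1  # cut after about 1 in 4 entries (tiny, for demonstration)
MIN_ENTRIES = 2      # assumed minimum so every level is smaller than the one below


def _digest(obj) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def _split(entries: list) -> list:
    """Cut one whole level into nodes at boundaries chosen by entry content."""
    nodes, current = [], []
    for entry in entries:
        current.append(entry)
        if len(current) >= MIN_ENTRIES and int(_digest(entry), 16) & MASK == 0:
            nodes.append(current)
            current = []
    if current:
        nodes.append(current)
    return nodes


def build(pairs: dict, store: dict) -> str:
    """Build the tree bottom-up from string-keyed pairs and return the root hash."""
    if not pairs:
        raise ValueError("this sketch assumes at least one key/value pair")
    level, leaf = [[k, v] for k, v in sorted(pairs.items())], True
    while True:
        parents = []
        for entries in _split(level):
            node = {"leaf": leaf, "entries": entries}
            node_hash = _digest(node)
            store[node_hash] = node
            # Internal entry: index key (last key in the child) plus child hash.
            parents.append([entries[-1][0], node_hash])
        if len(parents) == 1:
            return parents[0][1]
        level, leaf = parents, False


def lookup(root: str, key: str, store: dict):
    """Descend by index key like a B-tree, following hashes instead of pointers."""
    node = store[root]
    while not node["leaf"]:
        for last_key, child_hash in node["entries"]:
            if key <= last_key:
                node = store[child_hash]
                break
        else:
            return None  # key is greater than every key in the tree
    for k, v in node["entries"]:
        if k == key:
            return v
    return None
```

Because the node boundaries depend only on the sorted contents, building from the same set of pairs in any insertion order yields the same root hash.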


Amazing! All these people reinvented my SuperMegaTree!



