Camlistore: a new project from Brad Fitzpatrick

trotsky · on Jan 29, 2011

Looks like a pretty neat project, though I didn't have a very good sense of what they're trying to do until I read the use cases page: http://camlistore.org/docs/uses

Too bad they mostly seem to be focusing on Go (for local) and Google App Engine (for remote) implementations. That's probably a serious barrier to excitement for most people I know, even if they plan to encourage other implementations in the future.

bradfitz · on Jan 29, 2011

What Brett said. Also, as the project matures we will provide binaries and downloads but for now the barrier to entry is probably convenient.

Go isn't the project's choice. It's my personal choice as I continue to love the language. The project encourages all languages.

codemechanic · on Jan 30, 2011

It sounds more like Tonido (http://www.tonido.com).

At least, the broader principles:

    * Disk is cheap and getting cheaper

    * Put the user in control. Own your data.

    * Privacy and paranoia

    * Decentralization is important, but..

    * End users won't be dorks. Must also be possible to be easy, hosted.

    * Content-Addressability has so many awesome properties (validation, cachability, etc). Use it as much as possible.

    * Redundancy and over-explicitness is fine. Compression will help. Redundancy and over-explicitness will be convenient for future digital archeologists, too.
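
To make the content-addressability point concrete, here is a minimal sketch of a store that names each blob by a hash of its bytes. The `BlobStore` class and the `sha1-<hex>` ref format are assumptions for illustration, not Camlistore's actual API.

```python
import hashlib


class BlobStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}

    @staticmethod
    def blobref(data: bytes) -> str:
        # The name is derived from the content itself (assumed "sha1-<hex>" form).
        return "sha1-" + hashlib.sha1(data).hexdigest()

    def put(self, data: bytes) -> str:
        ref = self.blobref(data)
        # Storing the same bytes twice lands on the same key: free deduplication.
        self._blobs[ref] = data
        return ref

    def get(self, ref: str) -> bytes:
        data = self._blobs[ref]
        # Validation: re-hash what was stored and compare it with the requested ref.
        if self.blobref(data) != ref:
            raise ValueError("blob does not match its ref: " + ref)
        return data

    def __len__(self) -> int:
        return len(self._blobs)
```

Because a ref can only ever name one sequence of bytes, a blob never changes under its name, which is where the validation and cachability properties come from.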

bslatkin · on Jan 29, 2011

http://camlistore.org/docs/contributing

Write code. We have Go, Java, JavaScript, Python, Perl in the repo. Multiple implementations are the goal.

pangram · on Jan 29, 2011

I also found overview.txt (link below) in the documentation useful for getting a picture of the project, which if I had to summarize in three words would be "git-like content-addressable filesystem." Some of the things they're talking about for synchronization are neat and could open up some really interesting possibilities.

http://camlistore.org/code/?p=camlistore.git;f=doc;hb=master

vdm · on Jan 30, 2011

overview.txt:

http://camlistore.org/code/?p=camlistore.git;a=blob_plain;f=...

andrewcaito · on Jan 29, 2011

I was about to submit this myself - one of the most interesting Google 20% projects I've come across in a long time.

If you don't look through the whole website, at least skim their vision for use cases: http://camlistore.org/docs/uses

The single feature of being able to keep a private store of different web services could make this really take off for a lot of people.

joshfraser · on Jan 29, 2011

My three-word summary would be "application strength Dropbox"

jganetsk · on Jan 30, 2011

I was thinking about how proxying would work. You could get very cool proxying, similar to the way AptProxy works, where instead of accessing http://camlistore.org:3179/camli/hash-1, a client would access http://localproxy?url=http://camlistore.org:3179/camli/hash-.... The proxy would parse the URL, see whether it references a blob that a local camlistore has cached, and serve it out of that cache.
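
A rough sketch of that lookup, assuming the blob's name is the last path segment of the wrapped URL. The helper names (`split_proxy_url`, `serve`, `fetch_upstream`) are hypothetical, and the cache is a plain dict standing in for a local camlistore.

```python
from urllib.parse import urlparse, parse_qs


def split_proxy_url(proxy_url):
    """Return (upstream_url, blobref) taken from the proxy's ?url= parameter."""
    values = parse_qs(urlparse(proxy_url).query).get("url")
    if not values:
        raise ValueError("no url= parameter in " + proxy_url)
    upstream = values[0]
    # Assumption: the blob's name is the last path segment of the upstream URL.
    ref = urlparse(upstream).path.rstrip("/").rsplit("/", 1)[-1]
    return upstream, ref


def serve(proxy_url, local_cache, fetch_upstream):
    """Serve from the local cache when possible, otherwise fetch and remember."""
    upstream, ref = split_proxy_url(proxy_url)
    if ref in local_cache:
        return local_cache[ref]
    data = fetch_upstream(upstream)
    local_cache[ref] = data
    return data
```

A real client would percent-encode the inner URL, and a careful proxy would re-hash fetched data against the ref before caching it.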

Then I looked at the camlistore sharing model. The proposal involves storing private data in camlistore blobs, protecting it behind a 401, and requiring the client to append a ?via=hash-2, where hash-2 is a claim that says "ok, let this data through". I'm not a big fan of that, because proxies won't reproduce your security model, and private data is stored in the clear.

What one should do, really, is store private data encrypted. Then, you can reference a "phantom blob", which is not actually an object in storage but represents the cleartext of the private data. Your claim could now be a recipe on how to reconstruct this phantom blob, i.e. get chunk X of ciphertext and decrypt it with this key.

Now dumb proxies won't store sensitive data in the clear. They would still hold enough data that the plaintext could easily be reconstructed, but at least your security model is preserved.
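
The phantom-blob idea could be sketched roughly as follows. Everything here is an assumption for illustration: the recipe fields, the `sha1-<hex>` ref format, and especially `toy_xor`, which is an insecure stand-in for a real cipher and is used only so the example runs with the standard library.

```python
import hashlib


def blobref(data: bytes) -> str:
    # Assumed ref format: "sha1-<hex digest of the bytes>".
    return "sha1-" + hashlib.sha1(data).hexdigest()


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher. NOT secure; a placeholder for a real cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


def make_private(store: dict, key: bytes, plaintext: bytes) -> dict:
    """Store only ciphertext; return a recipe for rebuilding the phantom blob."""
    ciphertext = toy_xor(key, plaintext)
    cipher_ref = blobref(ciphertext)
    store[cipher_ref] = ciphertext
    return {
        "phantom": blobref(plaintext),  # names the cleartext, which is never stored
        "ciphertext": cipher_ref,
        "key": key.hex(),
    }


def reconstruct(store: dict, recipe: dict) -> bytes:
    """Follow the recipe, then check the result against the phantom blobref."""
    ciphertext = store[recipe["ciphertext"]]
    plaintext = toy_xor(bytes.fromhex(recipe["key"]), ciphertext)
    if blobref(plaintext) != recipe["phantom"]:
        raise ValueError("reconstructed data does not match the phantom blobref")
    return plaintext
```

In this sketch the recipe carries the key, so whoever can read the recipe can read the data. Access control therefore moves from the blob server to whoever holds the recipe.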

bslatkin · on Jan 31, 2011

Encryption should occur higher in the app layer. For cache privacy everything should be over https and avoid proxies. There are uses for fully public blobs too and those do not have these issues.

jganetsk · on Jan 30, 2011

Sounds like git for personal storage in the cloud.

adulau · on Jan 30, 2011

Right. At least, there is a blob server in the architecture and a blobref in the reference document. In Camlistore, they have something called a "schema blob" that looks more like a generalized version of the specific "tree", "commit" and "tag" objects in Git. I hope they could have a sample skeleton in their schema to support the Git Object Model. In that scope, they could host a standard git repository in their store and maybe benefit from existing git tools...

bradfitz · on Jan 31, 2011

We could (and kinda expected that somebody _would_) but I don't quite see the existing tools working. I'd love to be proven wrong, though. I suppose it could be done with some remapping front-end, but I think that front-end would also need to maintain maps between as-git-computed blobrefs and Camli blobrefs.
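
To illustrate why such a map would be needed: Git hashes a "blob <size>\0" header together with the content, so the same bytes get a different name than a plain hash of the content. The sketch below covers only Git blob objects; the `sha1-<hex>` Camli ref format and the `RefMap` class are assumptions for illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    # Git's object id for a blob: SHA-1 over "blob <size>\0" followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def camli_blobref(content: bytes) -> str:
    # Assumed Camli-style ref: SHA-1 over the raw content only.
    return "sha1-" + hashlib.sha1(content).hexdigest()


class RefMap:
    """Hypothetical two-way map a remapping front-end would have to maintain."""

    def __init__(self):
        self.git_to_camli = {}
        self.camli_to_git = {}

    def add(self, content: bytes):
        git_id = git_blob_id(content)
        camli_ref = camli_blobref(content)
        self.git_to_camli[git_id] = camli_ref
        self.camli_to_git[camli_ref] = git_id
        return git_id, camli_ref
```

For an empty file, `git_blob_id(b"")` is `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, while the plain SHA-1 of the same (empty) content is `da39a3ee5e6b4b0d3255bfef95601890afd80709`.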

zaphar · on Jan 30, 2011

If you look at the docs for the Signed Claims[1] perhaps the most interesting part of this is the safe way to share content. Being able to cryptographically verify a claim of access to or ownership of content sounds pretty awesome to me.

I spoke to Brad and Brett at OrdCamp[2] here in Chicago this weekend and it sounded pretty interesting. They plan to make the tooling around this hide the difficulties of dealing with public/private keys from the average user. I suppose if it's done right this could do for sharing content what SSL did for commerce on the web.

[1] http://camlistore.org/docs/json-signing

[2] http://ordcamp.com

6ren · on Jan 30, 2011

meta: their schema seems to be JSON instances plus comments. http://camlistore.org/code/?p=camlistore.git;a=tree;f=doc/sc...

This is probably just a temporary notation, but perhaps it's clear enough for ongoing use?

Optional/required is handled by comments. JSON "[ ]" syntax indicates lists, but it's unclear whether the contents represent a repeated group, like (abc)*, or alternatives, like (a|b|c)*. Alternatives don't seem to be handled in general, and they could also appear as the value of a field, not just in a list.

sanxiyn · on Jan 30, 2011

This reminded me of Venti. http://en.wikipedia.org/wiki/Venti

silentbicycle · on Jan 30, 2011

They mention it in their list of influences.

I've been working on a distributed, Venti-like archival filesystem, too. While Plan 9 never really caught on, it had many great systems within it that really deserve wider exposure.

escanda · on Jan 29, 2011

I wanted to give it a try, but you need billing enabled to use at least the blobserver on App Engine.

space-monkey · on Jan 30, 2011

I believe you can enable billing and leave your cap at $0.

bradfitz · on Jan 29, 2011

You could use the local version.

ericmsimons · on Jan 30, 2011

Ok, that is awesome. You had me hooked at "Apache License".

hachiya · on Jan 30, 2011

Very interesting project. Brad, are there any similarities with your previous work on brackup? Perhaps with how the files are tracked?