
> Most of the metadata activity is contained within a single shard:
>
> - File creation, same-directory renames, and deletion.
> - Listing directory contents.
> - Getting attributes of files or directories.

I guess this is a trade-off between a file system and an object store? In S3, ListObjects() is a heavy hitter and there can potentially be billions of objects under any prefix, so scanning on a single instance wouldn't be sufficient.





It's definitely a different use case, but given they haven't had to tap into their follower replicas for scale, it must be pretty efficient and lightweight. I suspect not having ACLs helps. They also cite a minimum 2MB size, so they're not expecting exabytes of tiny files.

I wonder if a major difference is listing a prefix in object storage vs performing recursive listings in a file system?
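
To make the question concrete, here is a rough Python sketch of the two listing styles. It is a toy model, not how S3 or the file system in the article actually works: `list_prefix`, `list_recursive`, and the sample `keys` and `tree` are all made up for illustration. The flat version does one range scan over sorted keys; the tree version does one listing per directory it visits.

```python
import bisect

def list_prefix(sorted_keys, prefix):
    """Object-store style: one range scan over a flat, sorted keyspace."""
    start = bisect.bisect_left(sorted_keys, prefix)
    out = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing after this can match
        out.append(key)
    return out

def list_recursive(tree, path=""):
    """File-system style: list one directory, then recurse into each subdirectory."""
    out = []
    for name, child in sorted(tree.items()):
        full = f"{path}{name}"
        if isinstance(child, dict):  # a dict stands in for a directory
            out.extend(list_recursive(child, full + "/"))
        else:
            out.append(full)
    return out

# Hypothetical sample data: the same four files in both representations.
keys = sorted(["a/1.bin", "a/b/2.bin", "a/b/3.bin", "c/4.bin"])
tree = {"a": {"1.bin": None, "b": {"2.bin": None, "3.bin": None}}, "c": {"4.bin": None}}

print(list_prefix(keys, "a/"))
print(list_recursive(tree["a"], "a/"))
```

Both calls return the same three paths here, but the cost model differs: the prefix scan is proportional to the number of matching keys, while the recursive walk pays for a separate directory listing at every level, and each of those could land on a different shard.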

Even in S3, performing very large lists over a prefix is slow, and small files will always be slow to work with, so regular compaction and caching file names is usually worthwhile.


2MB median to be fair, so half of our files are under 2MB.


