The idea is to limit each "shard" to some configurable number of objects, say, 1 million. As the container grows, its database can be split in two, and each of the two new pieces can continue to grow. The original container entity keeps an index of what each of its "child shards" holds, i.e. the start and end markers.
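To make the splitting idea concrete, here is a minimal in-memory sketch. It is not Swift's actual implementation: the names `Shard`, `ShardedContainer`, and `SHARD_LIMIT` are hypothetical, the limit is tiny so the split is easy to see, and real shards would be separate container databases rather than Python lists.

```python
import bisect

SHARD_LIMIT = 4  # hypothetical; the text suggests something like 1 million


class Shard:
    """A sorted run of object names covering the range (lower, upper]."""

    def __init__(self, lower, upper, names=None):
        self.lower = lower  # exclusive start marker; "" means "from the beginning"
        self.upper = upper  # inclusive end marker; None means "to the end"
        self.names = sorted(names or [])


class ShardedContainer:
    """The parent container: keeps only the ordered index of child shards."""

    def __init__(self, limit=SHARD_LIMIT):
        self.limit = limit
        self.shards = [Shard("", None)]  # start with one unbounded shard

    def _find(self, name):
        # The last shard is unbounded, so only the earlier end markers matter.
        uppers = [s.upper for s in self.shards[:-1]]
        return bisect.bisect_left(uppers, name)

    def put(self, name):
        i = self._find(name)
        shard = self.shards[i]
        pos = bisect.bisect_left(shard.names, name)
        if pos < len(shard.names) and shard.names[pos] == name:
            return  # already present
        shard.names.insert(pos, name)
        if len(shard.names) > self.limit:
            self._split(i)

    def _split(self, i):
        # Split the oversized shard at its midpoint into two children.
        shard = self.shards[i]
        mid = len(shard.names) // 2
        pivot = shard.names[mid - 1]  # last name kept in the left half
        left = Shard(shard.lower, pivot, shard.names[:mid])
        right = Shard(pivot, shard.upper, shard.names[mid:])
        self.shards[i:i + 1] = [left, right]

    def index(self):
        # What the parent records: (start marker, end marker, object count).
        return [(s.lower, s.upper, len(s.names)) for s in self.shards]

    def listing(self):
        # A full listing walks the shards in marker order.
        return [n for s in self.shards for n in s.names]


container = ShardedContainer()
for name in ["a", "b", "c", "d", "e"]:
    container.put(name)
print(container.index())  # [('', 'b', 2), ('b', None, 3)]
```

In this sketch a write only touches the one shard whose marker range covers the object name, which is the property that would relieve the write bottleneck. The harder questions (replication conflicts, collecting shards again) are not modeled here.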
There are tricky problems to solve, of course. How do listings work? Will shards ever be collected? What are the performance tradeoffs? How does replication handle shard conflicts?
These issues will be worked out, and sharding should then eliminate the write bottleneck in large containers. (Note that reads are not, and never were, affected by this issue.)
This implementation of container sharding is still being evaluated. It may or may not ever make it into Swift itself.