UUIDv7 is a nice idea, and should probably be what people use by default instead of UUIDv4 for internal-facing uses.
For the curious:
* UUIDv4 are 128 bits long, 122 bits of which are random, with 6 bits used for the version and variant (4 and 2 bits respectively). Traditionally displayed as 32 hex characters with 4 dashes, so 36 characters in total, and compatible with anything that expects a UUID.
* UUIDv7 are 128 bits long: 48 bits encode a Unix timestamp with millisecond precision, 6 bits are for the version and variant, and 74 bits are random. You're expected to display them the same as other UUIDs, and they should be compatible with basically anything that expects a UUID. (Would be a very odd system that parses a UUID and throws an error because it doesn't recognise v7, but I guess it could happen, in theory?)
* ULIDs (https://github.com/ulid/spec) are 128 bits long, 48 bits encode a unix timestamp with millisecond precision, 80 bits are random. You're expected to display them in Crockford's base32, so 26 alphanumeric characters. Compatible with almost everything that expects a UUID (since they're the right length). Spec has some dumb quirks if followed literally but thankfully they mostly don't hurt things.
* KSUIDs (https://github.com/segmentio/ksuid) are 160 bits long, 32 bits encode a timestamp with second precision and a custom epoch of May 13th, 2014, and 128 bits are random. You're expected to display them in base62, so 27 alphanumeric characters. Since they're a different length, they're not compatible with UUIDs.
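To make the UUIDv7 layout above concrete, here's a rough Python sketch that packs one by hand. `make_uuid7` is a made-up helper name, not a standard library function, and the split of the 74 random bits into a 12-bit and a 62-bit chunk around the variant bits is my reading of the layout, so treat it as an illustration rather than a reference implementation.

```python
import os
import time
import uuid


def make_uuid7(unix_ms=None, rand=None):
    """Hypothetical helper: pack a UUIDv7-style value by hand.

    Layout, most significant bits first:
    48-bit timestamp | 4-bit version | 12 random | 2-bit variant | 62 random
    """
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    if rand is None:
        rand = os.urandom(10)  # 80 bits drawn, only 74 are used

    rand_int = int.from_bytes(rand, "big")
    rand_a = (rand_int >> 62) & 0xFFF      # 12 random bits
    rand_b = rand_int & ((1 << 62) - 1)    # 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                         # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # variant bits = 10
    value |= rand_b
    return uuid.UUID(int=value)


if __name__ == "__main__":
    u = make_uuid7()
    print(u, u.version, len(str(u)))
```

The result goes through `uuid.UUID`, so it prints in the usual 36-character form and anything that only checks the shape of a UUID should accept it.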
I quite like KSUIDs; I think base62 is a smart choice. And while the timestamp portion is a trickier question, KSUIDs use 32 bits which, with second precision (more than good enough), means they won't overflow for well over a century. Whereas UUIDv7s use 48 bits, so even with millisecond precision (not needed) they won't overflow for something like 8000 years. We can argue whether 100 years is future proof enough (I'd argue it is), but 8000 years is just silly. Nobody will ever generate a compliant UUIDv7 where any of the first several bits isn't 0. The only downside to KSUIDs is that the length isn't UUID compatible (and arguably, that they don't devote 6 bits to a compliant UUID version).
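For anyone who wants to check the overflow numbers, here's a quick back-of-the-envelope sketch in Python. The KSUID epoch constant of 1400000000 seconds is what I understand the KSUID project to use for its May 13th, 2014 epoch, and `sample_ms` is just an arbitrary fixed timestamp from late 2023 picked for illustration.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # average Gregorian year
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumed KSUID epoch offset in Unix seconds (2014-05-13 UTC).
KSUID_EPOCH = 1_400_000_000

# 32 bits of seconds, counted from the KSUID epoch.
ksuid_span_years = 2**32 / SECONDS_PER_YEAR
ksuid_overflow = UNIX_EPOCH + timedelta(seconds=KSUID_EPOCH + 2**32)

# 48 bits of milliseconds, counted from the Unix epoch.
uuid7_span_years = 2**48 / 1000 / SECONDS_PER_YEAR

# How many of the 48 timestamp bits are still zero for a present-day value.
sample_ms = 1_700_000_000_000  # arbitrary fixed timestamp, late 2023
leading_zero_bits = 48 - sample_ms.bit_length()

if __name__ == "__main__":
    print(f"KSUID timestamp range: {ksuid_span_years:.1f} years")
    print(f"KSUID overflows in:    {ksuid_overflow.year}")
    print(f"UUIDv7 range:          {uuid7_span_years:.0f} years")
    print(f"Unused leading bits:   {leading_zero_bits}")
```

That works out to about 136 years for KSUID (running out in 2150) and roughly 8,900 years for UUIDv7, with the top 7 bits of the 48-bit timestamp still zero today.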
Still feels like there's room for improvement, but for now I think I'd always pick UUIDv7 over UUIDv4 unless there's a very specific reason not to. Which would be, mostly, if there's a concern over potentially leaking the time the UUID was generated. Although if you weren't worrying about leaking an integer sequence ID, you likely won't care here either.
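On the leaking point: anyone holding a UUIDv7 can read the creation time straight out of the top 48 bits. A minimal sketch, where `uuid7_timestamp` is a hypothetical helper name and the example ID is built by hand from a fixed timestamp:

```python
import uuid
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_timestamp(u):
    """Hypothetical helper: recover the creation time embedded in a UUIDv7."""
    if u.version != 7:
        raise ValueError("not a version 7 UUID")
    unix_ms = u.int >> 80  # the top 48 bits are the millisecond timestamp
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


if __name__ == "__main__":
    # Hand-built example: fixed timestamp, version 7, variant 10, zero "random" bits.
    example = uuid.UUID(int=(1_700_000_000_000 << 80) | (0x7 << 76) | (0b10 << 62))
    print(example, "->", uuid7_timestamp(example).isoformat())
```

So if the creation time of a record is sensitive, that's the case where v4 (or a separate public identifier) still makes sense.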
100 years sounds short-sighted for something that's supposed to be "universally" unique. We're already having problems with the 32-bit Unix timestamp not being large enough. If you're willing to use 160-bit (or longer) identifiers, you might as well give a few more bits to the timestamp. Round it up to an even number of base-62 characters, too. That part of KSUID has always struck me as a weird decision.
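To put numbers on the base-62 point, here's a small sketch of how many characters a given bit width needs and how many whole bits a given character count can hold. The helper names are made up for illustration; the arithmetic uses exact integers rather than logarithms to avoid rounding surprises.

```python
def chars_needed(bits, base=62):
    """Smallest number of base-N characters that can represent `bits` bits."""
    n = 0
    while base**n < 2**bits:
        n += 1
    return n


def bits_that_fit(chars, base=62):
    """Largest whole number of bits that fits in `chars` base-N characters."""
    return (base**chars).bit_length() - 1


if __name__ == "__main__":
    print(chars_needed(160))   # characters for a 160-bit KSUID
    print(bits_that_fit(27))   # whole bits available in 27 characters
    print(bits_that_fit(28))   # whole bits available in 28 characters
```

By this count 27 characters hold exactly 160 whole bits, and going to 28 characters would leave room for 166, so six extra bits that could have gone to the timestamp.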
I wish UUIDv7 pulled the version/variant bits up front, though, just to make sure that the identifiers don't all start with null bytes.
Apparently, humanity is damned to repeat its mistakes over and over again.
"100 years should be enough" is what led us to a mountain of Y2K issues, because when would a two-digit year ever be ambiguous?
But I guess it's a psychological issue. Unless you're a megalomaniac, it's just natural to assume that your decisions won't matter much outside of your life and lifetime. And in that case, 100 years totally is enough because I probably won't live that long.
And even more, in a lot of cases, it's also the correct assumption and the project won't live longer than a few years.
So, thinking about it, unless you are developing a novel standard or something that you want the world to adopt, 100 years probably IS fine.
Unfortunately, KSUID wants to be a novel standard, so there's an issue.
If the version bits were up front, then switching to a hypothetical UUIDv8 in several years would be guaranteed to break the sortability. So I see that decision as a bit of future proofing.
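A toy illustration of that argument, using two hypothetical byte layouts simplified to whole bytes (a 6-byte timestamp plus a 1-byte version, which is not the real UUID bit layout): with the timestamp first, byte order follows time even when versions are mixed; with the version first, IDs get grouped by version instead.

```python
def timestamp_first(unix_ms, version):
    """Simplified UUIDv7-like ordering: timestamp bytes, then version byte."""
    return unix_ms.to_bytes(6, "big") + bytes([version])


def version_first(unix_ms, version):
    """Hypothetical alternative: version byte up front, then timestamp."""
    return bytes([version]) + unix_ms.to_bytes(6, "big")


def time_order_after_sort(layout, events):
    """Sort events by their encoded bytes and return the timestamps in that order."""
    return [t for _, t in sorted((layout(t, v), t) for t, v in events)]


# (timestamp in ms, version) pairs from a made-up migration period where
# v7 and a hypothetical v8 are generated side by side.
events = [(1000, 8), (2000, 7), (3000, 8)]

if __name__ == "__main__":
    print(time_order_after_sort(timestamp_first, events))
    print(time_order_after_sort(version_first, events))
```

The first layout comes back in time order; the second puts the v7 ID from t=2000 ahead of the v8 ID from t=1000.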
How so? It seems like the only real use case for these timestamps is to get data from around the same time together. A second is fine for that. It's not about concurrency or avoiding collisions. A second can't handle that, but neither can a millisecond.
I’d think that the locality would only matter at the scale of your query. I’m sure someone has queries with a window less than a second and so much traffic, but it seems niche enough to not optimize the standard for it.
I could definitely be off. I work at a company that gets those levels of traffic but don’t deal with it directly.
For me the whole value prop for ULIDs is that they can be generated by any node in a distributed system without coordination, while roughly preserving time order. "Roughly" meaning: all IDs will be globally ordered at millisecond precision, subject to the accuracy of each node's system clock; and IDs from a specific node will be locally ordered, subject to the details of the monotonicity part of the ID generator. This is important for me, because most of the things I attach IDs to will happen many many many times per second.
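Roughly what I mean by the monotonicity part, as a Python sketch. `MonotonicUlid` and `encode_crockford` are names I made up, and reusing the last timestamp when the clock steps backwards is my own choice here rather than something I'm claiming the spec requires. The clock and entropy sources are injectable only so the behaviour can be tested.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford base32 alphabet


def encode_crockford(value):
    """Encode a 128-bit integer as 26 Crockford base32 characters."""
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 31])
        value >>= 5
    return "".join(reversed(chars))


class MonotonicUlid:
    """Hypothetical per-node generator: IDs from one instance always sort upward."""

    def __init__(self, clock_ms=None, entropy=None):
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._entropy = entropy or (lambda: int.from_bytes(os.urandom(10), "big"))
        self._last_ms = -1
        self._last_rand = 0

    def next(self):
        now = self._clock_ms()
        if now <= self._last_ms:
            # Same millisecond (or the clock went backwards): keep the old
            # timestamp and bump the 80-bit random part by one.
            now = self._last_ms
            self._last_rand += 1
            if self._last_rand >= 1 << 80:
                raise OverflowError("random component exhausted in this millisecond")
        else:
            self._last_rand = self._entropy()
        self._last_ms = now
        return encode_crockford((now << 80) | self._last_rand)


if __name__ == "__main__":
    gen = MonotonicUlid()
    ids = [gen.next() for _ in range(5)]
    print(ids, ids == sorted(ids))
```

Each node runs its own generator with no coordination; ordering across nodes is only as good as their clocks, but within a node the IDs never go backwards.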
If you need more than second precision then millisecond doesn't get you much further. The fact that the epoch ends in 120 years is a bit more worrying, but is also just about non-critical enough that it will be ignored for at least the next century.
Also, to all future historians of 2150, sorry about the mess, but yes we knew this was going to happen. Whatever it was.