Like with any other number. UUIDs are just 128-bit numbers, ranging from 00000000-0000-0000-0000-000000000000 to FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF.
I'll concede that iterating through the entirety of that range would take a very long time, but this still presumes that said iteration in its entirety is necessary in the first place.
> If you generate a billion UUIDs every second for 100 years you have a 50% chance to have 1 (one!) collision.
Maybe, if they're indeed randomly-generated. Are they indeed UUIDv4? Is your RNG up to snuff?
And the probability doesn't need to be as high as 50% to be a tangible danger. A 0.5% chance of an attacker guessing a UUID to exploit is still more than enough to consider that to be insufficient as a layer of security. Meanwhile, you're likely fragmenting your indices and journals, creating performance issues that could be exploited for a denial-of-service. Tradeoffs.
> Sessions can be hijacked anyway, so why not leave the user session db exposed to the web unprotected.
Hijacking a session is harder than finding a publicly-shared ID floating around somewhere.
> So your suggestion is to leave them all incrementing instead, do I understand you correctly?
My suggestion is to use what makes sense for your application. That might be a random number, if you've got a sharded database and you want something resistant to collisions. That might be a sequential number, if you need cache locality or simply want to know at a glance how many records you're dealing with. It might be a number with sequential high digits and random low digits, if you want the best of both worlds. It might be a text string if you just care about human readability and don't care about performance. It might even be multiple fields used together as a composite primary key if you're into that sort of thing.
My point, as was obvious in my original comment and pretty much every descendant thereof, is that of the myriad useful factors around picking a primary key type, security ain't really among them.
> There might not even be the concept of users and restrictions.
In which case everything served from the database should be treated as public - including the quantities thereof.
> You've been given plenty of examples in other comments, and I am sure you understand the points.
Correct, and if those examples were satisfactory for arguing a security rationale for avoiding sequential IDs, then I wouldn't still be here disagreeing with said arguments, now would I?
> One example could be recipes, and you want to protect your work while giving real visitors (not users, there are no users to hack) the ability to search by name or ingredients. With incremental IDs you can scrape them all no problem and steal all that work. With a UUID you have to guess either the UUID for all, or guess every possible search term.
You don't need to guess every possible search term. How many recipes don't use the most basic and common ingredients? Water, oil, sugar, salt, milk, eggs... I probably don't even have to count with my toes before I've come up with a list of search terms that would expose the vast majority of recipes for scraping. This is peak security theater.
> Another could be a chat app, and you don't want to expose the total number of users and messages sent. If the only way to start a chat is knowing a public key and all messages have a UUID as PK how would you enumerate this?
Dumbest way would be to DDoS the thing. At some number of concurrent sessions and/or messages per second, it'll start to choke, and that'll give you the upper bound on how many people are using it, at least at a time.
Smarter way would be to measure request times; as the user and message tables grow, querying them takes longer, as does inserting into them if there are any indices to update in the process - even more so when you're opting for random IDs instead of sequential IDs, because of that aforementioned index and journal fragmentation, and in general because they're random and therefore entail random access patterns.
> I don't even understand your aversion to random IDs, are they so difficult to implement? What's the real problem?
My aversion ain't to random IDs in and of themselves. They have their place, as I have repeatedly acknowledged in multiple comments in this thread (including to you specifically).
My aversion is to treating them as a layer of security, which they are not. Yes, I'm sure you can contrive all sorts of narrowly-tailored scenarios where they just so happen to provide some marginal benefit, but in the real world "sequential IDs are insecure" is the sort of cargo-cultish advice that's impossible to take seriously when there are countless far-bigger fish to fry.
My additional aversion is to treating them as a decision without cost, which they also are not. As touched upon above, they carry some rather significant performance implications (see e.g. https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...), and naïvely defaulting to random IDs everywhere without considering those implications is a recipe for disaster. There are of course ways to mitigate those impacts, like using UUIDs that mix sequential and random elements (as that article advocates), but this is at the expense of the supposed "security" benefits (if you know how old a record is you can often narrow the search range for its ID), and it still requires a lot more planning and design and debugging and such compared to the tried-and-true "just slap an autoincrementing integer on it and call it a day".
> I'll concede that iterating through the entirety of that range would take a very long time
You don't say?
> but this still presumes that said iteration in its entirety is necessary in the first place.
It is, because it's compared to sequential IDs where you know exactly the start and end. No way of knowing with UUIDs.
> Maybe, if they're indeed randomly-generated. Are they indeed UUIDv4? Is your RNG up to snuff?
Stop constantly moving the goalposts, and assume they're used correctly, Jesus Christ. Anytime someone talks about UUID it's most likely v4, unless you want non-random/deterministic v5 or time-sortable v7. But the most common is v4.
> Hijacking a session is harder than finding a publicly-shared ID floating around somewhere.
Even firebase stores refresh tokens accessible by javascript (localstorage as opposed to HTTPOnly). Any extension with sufficient users is more viable than finding a single collision of a UUID. Programs with access to the cookies on the filesystem, etc. It's much easier to hijack sessions than guessing UUIDs.
> Dumbest way would be to DDoS the thing. At some number of concurrent sessions and/or messages per second, it'll start to choke, and that'll give you the upper bound on how many people are using it, at least at a time.
Won't tell you a single thing. Might be a beefy server serving 3 users, or a slow server with lots of downtime for millions of users. They may all be using it at once, or few of them sporadically. No way of knowing which. It's a retarded guesstimate. But this only shows the mental gymnastics you're willing to use to be "right". Soon you'll tell me "I'll just ask my contact at the NSA".
> sequential IDs are insecure
Nobody claimed this. They can be enumerated which may or may not be a problem depending on the data. Which I said in my first comment, and you seem to agree. This entire thread was a complete waste of my time.
Like with any other number. UUIDs are just 128-bit numbers, ranging from 00000000-0000-0000-0000-000000000000 to FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF.
I'll concede that iterating through the entirety of that range would take a very long time, but this still presumes that said iteration in its entirety is necessary in the first place.
> If you generate a billion UUIDs every second for 100 years you have a 50% chance to have 1 (one!) collision.
Maybe, if they're indeed randomly-generated. Are they indeed UUIDv4? Is your RNG up to snuff?
And the probability doesn't need to be as high as 50% to be a tangible danger. A 0.5% chance of an attacker guessing a UUID to exploit is still more than enough to consider that to be insufficient as a layer of security. Meanwhile, you're likely fragmenting your indices and journals, creating performance issues that could be exploited for a denial-of-service. Tradeoffs.
> Sessions can be hijacked anyway, so why not leave the user session db exposed to the web unprotected.
Hijacking a session is harder than finding a publicly-shared ID floating around somewhere.
> So your suggestion is to leave them all incrementing instead, do I understand you correctly?
My suggestion is to use what makes sense for your application. That might be a random number, if you've got a sharded database and you want something resistant to collisions. That might be a sequential number, if you need cache locality or simply want to know at a glance how many records you're dealing with. It might be a number with sequential high digits and random low digits, if you want the best of both worlds. It might be a text string if you just care about human readability and don't care about performance. It might even be multiple fields used together as a composite primary key if you're into that sort of thing.
My point, as was obvious in my original comment and pretty much every descendant thereof, is that of the myriad useful factors around picking a primary key type, security ain't really among them.
> There might not even be the concept of users and restrictions.
In which case everything served from the database should be treated as public - including the quantities thereof.
> You've been given plenty of examples in other comments, and I am sure you understand the points.
Correct, and if those examples were satisfactory for arguing a security rationale for avoiding sequential IDs, then I wouldn't still be here disagreeing with said arguments, now would I?
> One example could be recipes, and you want to protect your work while giving real visitors (not users, there are no users to hack) the ability to search by name or ingredients. With incremental IDs you can scrape them all no problem and steal all that work. With a UUID you have to guess either the UUID for all, or guess every possible search term.
You don't need to guess every possible search term. How many recipes don't use the most basic and common ingredients? Water, oil, sugar, salt, milk, eggs... I probably don't even have to count with my toes before I've come up with a list of search terms that would expose the vast majority of recipes for scraping. This is peak security theater.
> Another could be a chat app, and you don't want to expose the total number of users and messages sent. If the only way to start a chat is knowing a public key and all messages have a UUID as PK how would you enumerate this?
Dumbest way would be to DDoS the thing. At some number of concurrent sessions and/or messages per second, it'll start to choke, and that'll give you the upper bound on how many people are using it, at least at a time.
Smarter way would be to measure request times; as the user and message tables grow, querying them takes longer, as does inserting into them if there are any indices to update in the process - even more so when you're opting for random IDs instead of sequential IDs, because of that aforementioned index and journal fragmentation, and in general because they're random and therefore entail random access patterns.
> I don't even understand your aversion to random IDs, are they so difficult to implement? What's the real problem?
My aversion ain't to random IDs in and of themselves. They have their place, as I have repeatedly acknowledged in multiple comments in this thread (including to you specifically).
My aversion is to treating them as a layer of security, which they are not. Yes, I'm sure you can contrive all sorts of narrowly-tailored scenarios where they just so happen to provide some marginal benefit, but in the real world "sequential IDs are insecure" is the sort of cargo-cultish advice that's impossible to take seriously when there are countless far-bigger fish to fry.
My additional aversion is to treating them as a decision without cost, which they also are not. As touched upon above, they carry some rather significant performance implications (see e.g. https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...), and naïvely defaulting to random IDs everywhere without considering those implications is a recipe for disaster. There are of course ways to mitigate those impacts, like using UUIDs that mix sequential and random elements (as that article advocates), but this is at the expense of the supposed "security" benefits (if you know how old a record is you can often narrow the search range for its ID), and it still requires a lot more planning and design and debugging and such compared to the tried-and-true "just slap an autoincrementing integer on it and call it a day".