> It is generally a good practice to not expose your primary keys to the externa...

tetha · on Feb 18, 2024

> I see this "best practice" advocated often, but to me it reeks of security theater. If an attacker is able to do anything useful with a guessed ID without being authenticated and authorized to do so, then something else has gone horribly, horribly, horribly wrong and that should be the focus of one's energy instead of adding needless complexity to the schema.

Yes, but the ability to guess IDs can make this security issue horrible, or much much worse.

If you had such a vulnerability and you are exposing the users to UUIDs, now people have to guess UUIDs. Even a determined attacker will have a hard time doing that or they would need secondary sources to get the IDs. You have a data breach, but you most likely have time to address it and then you can assess the amount of data lost.

If you can just <seq 0 10000 | xargs -I ID curl service/ticket/ID> the security issue is instantly elevated onto a whole new level. Suddenly all data is leaked without further effort and we're looking at mandatory report to data protection agencies with a massive loss of data.

To me, this is one of these defense in depth things that should be useless. And it has no effect in many, many cases.

But there is truely horrid software out there that has been popped in exactly the described way.

dijit · on Feb 18, 2024

Case in point, a recent security issue Gitlab experienced (CVE-2023-7028; arbitrary password reset by knowing one of the accounts associated mail addresses) was made worse by a feature of gitlab that few people know about; that the "userID" is associated with a meta/internal mail address.

This meant that people could send password resets for any user if they knew their userID. The mail format was like user-1@no-reply.gitlab.com or something.

Since it's a safe bet that "user ID 1" is an admin user, someone weaponised this.

plagiarist · on Feb 18, 2024

I've already resolved to never use Gitlab entirely on the basis of that CVE but that makes it worse.

Password resets should just never go to an email that hasn't been deliberately attached to an account by the account's owner, full stop. There should not be a code path where it is possible to send any such thing to arbitrary emails. And redirect emails should never be treated as account emails in any way.

yellowapple · on Feb 18, 2024

Even without that auto-incrementing ID, there are plenty of other options for guessing valid email addresses to use with that exploit. For example, if you're able to figure out the format an organization uses for their email addresses (e.g. first.last@company.com), and you're able to figure out who works at that org (via e.g. LinkedIn), then there's a very good chance you can reset passwords for, say, the company's CTO or other likely-highly-privileged users.

That is: this kind of proves my point. Removing autoincrementing IDs from the equation is of minimal benefit when things have already gone horribly horribly wrong like this. It's a little bit more work on the attacker's part, but not by anywhere near enough for such a "mitigation" to be of much practical benefit.

s4i · on Feb 18, 2024

It’s mentioned in the article. It’s more to do with business intelligence than security. A simple auto-incrementing ID will reveal how many total records you have in a table and/or their growth rate.

> If you expose the issues table primary key id then when you create an issue in your project it will not start with 1 and you can easily guess how many issues exist in the GitLab.

no-dr-onboard · on Feb 19, 2024

Business intelligence isn’t really applicable on the database level with guids..way too many abstraction layers down.

coldtea · on Feb 18, 2024

>I see this "best practice" advocated often, but to me it reeks of security theater.

The idea of "security theater" is overplayed. Security can be (and should be) multilayered, it doesn't have to be all or nothing. So that, when they break a layer (say the authentication), they shouldn't automatically gain easy access to the others

>If an attacker is able to do anything useful with a guessed ID without being authenticated and authorized to do so, then something else has gone horribly, horribly, horribly wrong and that should be the focus of one's energy instead of adding needless complexity to the schema.

Sure. But by that time, it's will be game over if you don't also have the other layers in place.

The thing is that you can't anticipate any contigency. Bugs tend to not preannounce themselves, especially tricky nuanced bugs.

But when they do appear, and a user can "do [something] useful with an ID without being authenticated and authorized to do so" you'd be thanking all available Gods that you at least made the IDs not guassable - which would also give them also access to every user account on the system.

yellowapple · on Feb 19, 2024

> Security can be (and should be) multilayered, it doesn't have to be all or nothing.

In this case the added layer is one of wet tissue paper, at best. Defense-in-depth is only effective when the different layers are actually somewhat secure in their own right.

It's like trying to argue that running encrypted data through ROT13 is worthwhile because "well it's another layer, right?".

> you'd be thanking all available Gods that you at least made the IDs not guassable - which would also give them also access to every user account on the system.

I wouldn't be thanking any gods, because no matter what those IDs look like, the only responsible thing in such a situation is to assume that an attacker does have access to every user account on the system. Moving from sequential IDs to something "hard" like UUIDs only delays the inevitable - and the extraordinarily narrow window in which that delay is actually relevant ain't worth considering in the grand scheme of things. Moving from sequential IDs to something like usernames ain't even really an improvement at all, but more of a tradeoff; yeah, you make life slightly harder for someone trying to target all users, but you also make life much easier for someone trying to target a specific user (since now the attacker can guess the username directly - say, based on other known accounts - instead of having to iterate through opaque IDs in the hopes of exposing said username).

coldtea · on Feb 19, 2024

>I wouldn't be thanking any gods, because no matter what those IDs look like, the only responsible thing in such a situation is to assume that an attacker does have access to every user account on the system. Moving from sequential IDs to something "hard" like UUIDs only delays the inevitable*

Well, there's nothing "inevitable". It's a computer system, not the fullfilment of some prophecy.

You can have an attack vector giving you access to a layer, without guaranteed magic access to other layers.

But even if it "just delays the inevitable", that's a very good thing, as it can be time used to patch the issue.

Not to mention, any kind of cryptography just "delays the inevitable" too. With enough time it can be broken with brute force - might not even take millions of years, as we could get better at quantum computing in the next 50 or 100.

yellowapple · on Feb 19, 2024

> But even if it "just delays the inevitable", that's a very good thing, as it can be time used to patch the issue.

My point is that in this case, the additional time is nowhere near sufficient to make much of a difference. This is especially true when you consider that an attacker could be probing URLs before finding an exploit, in which case that tiny delay between "exploit found" -> "all users compromised" shrinks to zero.

metafunctor · on Feb 18, 2024

Bugs happen also in access control. Unguessable IDs make it much harder to exploit some of those bugs. Of course the focus should be on ensuring correct access control in the first place, but unguessable IDs can make the difference between a horrible disaster and a close call.

It's also possible to use auto-incrementing database IDs and encrypt them, if using UUIDs doesn't work for you. With appropriate software layers in place, encrypted IDs work more or less automatically.

lordgrenville · on Feb 18, 2024

> The only case where this might be valuable is business intelligence

Nitpick: I would not call this "business intelligence" (which usually refers to internal use of the company's own data) but "competitive intelligence". https://en.wikipedia.org/wiki/Competitive_intelligence

EE84M3i · on Feb 18, 2024

See also "German Tank Problem" https://en.m.wikipedia.org/wiki/German_tank_problem

remus · on Feb 18, 2024

In general it's a defense-in-depth thing. You definitely shouldn't be relying on it, but as an attacker it just makes your life a bit harder if it's not straightforward to work out object IDs.

For example, imagine you're poking around a system that uses incrementing ints as public identifiers. Immediately, you can make a good guess that there's probably going to be some high privileged users with user_id=1..100 so you can start probing around those accounts. If you used UUIDs or similar then you're not leaking that info.

In gitlabs case this is much less relevant, and it's more fo a cosmetic thing.

serial_dev · on Feb 18, 2024

> In gitlabs case this is much less relevant (...)

Why, though? GitLab is often self hosted, so being able to iterate through objects, like users, can be useful for an attacker.

yellowapple · on Feb 18, 2024

In my experience self-hosted GitLabs are rarely publicly-accessible in the first place; they're usually behind some sort of VPN.

As for an attacker being able to iterate through users, if that information is supposed to be private, and yet an attacker is getting anything other than a 404, then that's a problem in and of itself and my energy would be better spent fixing that.

Anon1096 · on Feb 19, 2024

This is again a defense in depth thing. In the age of WFH, cracking a corporate VPN is really not that difficult. If you can make an attacker's life harder for low cost you should do it just in case.

yellowapple · on Feb 19, 2024

Except you ain't really putting up a meaningful obstacle against an attacker here. Compared to the typical effort of cracking a corporate VPN, brute-forcing IDs is downright trivial.

Like I said elsewhere: it's like calling ROT13 "defense in depth".

remus · on Feb 18, 2024

You're right, fair point.

worksonmine · on Feb 18, 2024

> What value would there be in preventing guessing?

It prevents enumeration, which may or may not be a problem depending on the data. If you want to build a database of user profiles it's much easier with incremental IDs than UUID.

It is at least a data leak but can be a security issue. Imagine a server doing wrong password correctly returning "invalid username OR password" to prevent enumeration. If you can still crawl all IDs and figure out if someone has an account that way it helps filter out what username and password combinations to try from previous leaks.

Hackers are creative and security is never about any single protection.

yellowapple · on Feb 18, 2024

> If you can still crawl all IDs and figure out if someone has an account that way it helps filter out what username and password combinations to try from previous leaks.

Right, but like I suggested above, if you're able to get any response other than a 404 for an ID other than one you're authorized to access, then that in and of itself is a severe issue. So is being able to log in with that ID instead of an actual username.

Hackers are indeed creative, but they ain't wizards. There are countless other things that would need to go horribly horribly wrong for an autoincrementing ID to be useful in an attack, and the lack of autoincrementing IDs doesn't really do much in practice to hinder an attacker once those things have gone horribly, horribly wrong.

I can think of maybe one exception to this, and that's with e-commerce sites providing guest users with URLs to their order/shipping information after checkout. Even this is straightforward to mitigate (e.g. by generating a random token for each order and requiring it as a URL parameter), and is entirely inapplicable to something like GitLab.

worksonmine · on Feb 18, 2024

> Right, but like I suggested above, if you're able to get any response other than a 404 for an ID other than one you're authorized to access, then that in and of itself is a severe issue. So is being able to log in with that ID instead of an actual username.

You're missing the point and you're not thinking like a hacker yet. It's not about the ID itself or even private profiles, but the fact that you can build a database of all users with a simple loop. For example your profile here is '/user?id=yellowapple' not '/user?id=1337'.

If it was the latter I could build a list of usernames by testing all IDs. Then I would cross-reference those usernames to previous leaks to know what passwords to test. And hacking an account is not the only use of such an exploit, just extracting all items from a competitors database is enough in some cases. It all depends on the type data and what business value it has. Sometimes an incrementing ID is perfectly fine, but it's more difficult to shard across services so I usually default to UUID anyway except when I really want an incrementing ID.

Most of the time things don't have to go "horribly horribly wrong" to get exploited. It's more common to be many simple unimportant holes cleverly combined.

The username can still always be checked for existence on the sign-up step, and there aren't many ways of protecting from that. But it's easier to rate-limit sign-ups (as one should anyway) than viewing public profiles.

Do you leave your windows open when you leave from home just because the burglar can kick the front door in instead? It's the same principle.

yellowapple · on Feb 19, 2024

> Do you leave your windows open when you leave from home just because the burglar can kick the front door in instead?

Yes (or at least: I ain't terribly worried if I do forget to close the windows before leaving), because the likelihood of a burglar climbing up multiple floors to target my apartment's windows is far lower than the likelihood of the burglar breaking through my front door.

But I digress...

> You're missing the point and you're not thinking like a hacker yet.

I'd say the same about you. To a hacker who's sufficiently motivated to go through all those steps you describe, a lack of publicly-known autoincremented IDs is hardly an obstacle. It might deter the average script kiddie or botnet, but only for as long as they're unwilling to rewrite their scripts to iterate over alphanumerics instead of just numerics, and they ain't the ones performing attacks like you describe in the first place.

> For example your profile here is '/user?id=yellowapple' not '/user?id=1337'.

In either case that's public info, and it's straightforward to build a list of at least some subset of Hacker News users by other means (e.g. scraping comments/posts - which, mind you, are sequential IDs AFAICT). Yes, it's slightly more difficult than looping over user IDs directly, but not by a significant enough factor to be a worthwhile security mitigation, even in depth.

Unless someone like @dang feels inclined to correct me on this, I'm reasonably sure the decision to use usernames instead of internal IDs in HN profile URLs has very little to do with this particular "mitigation" and everything to do with convenience (and I wouldn't be all that surprised if the username is the primary key in the first place). '/user?id=worksonmine' is much easier to remember than '/user?id=8675309', be it for HN or for any other social network / forum / etc.

> Sometimes an incrementing ID is perfectly fine, but it's more difficult to shard across services so I usually default to UUID anyway except when I really want an incrementing ID.

Sharding is indeed a much more valid reason to refrain from autoincrementing IDs, publicly-known or otherwise.

worksonmine · on Feb 19, 2024

> multiple floors

But not if you lived on the ground floor I assume?

> iterate over alphanumerics instead of just numerics

Why would you suggest another sequential ID as a solution to a sequential ID? I didn't, UUIDs have decent entropy and are not viable to brute force. Don't bastardize my point just to make a response.

> I'm reasonably sure the decision to use usernames instead of internal IDs in HN profile URLs has very little to do with this particular "mitigation" and everything to do with convenience

It wasn't intended as an audit of HN, I was holding your hand when walking you through an example and chose the site we're on. I missed my mark and I don't think a third attempt will do much difference when you're trying so hard not to get it. If you someday have public data you don't want a competitor to enumerate you'll know the solution exists.

yellowapple · on Feb 19, 2024

> But not if you lived on the ground floor I assume?

Still wouldn't make much of a difference when my windows are trivial to break into even when closed.

> Why would you suggest another sequential ID

I didn't.

> UUIDs have decent entropy and are not viable to brute force.

They're 128 bits (for UUIDv4 specifically; other varieties in common use are less random). That's a pretty good amount of entropy, but far from insurmountable.

And you don't even need to brute-force anything; if these IDs are public, then they're almost certainly being used elsewhere, wherein they can be captured. There's also nothing stopping an attacker from probing IDs ahead of time, prior to an exploit being found. The mitigations to these issues are applicable no matter the format or size of the IDs in question.

And this is assuming the attacker cares about the specific case of attacking all users. If the attacker is targeting a specific user, or just wants to attack the first user one finds, then none of that entropy matters in the slightest.

Put simply: there are enough holes in the "random IDs are a meaningful security measure" argument for it to work well as a strainer for my pasta.

> when you're trying so hard not to get it

Now ain't that the pot calling the kettle black.

> If you someday have public data you don't want a competitor to enumerate you'll know the solution exists.

If I don't want a competitor to enumerate it then I wouldn't make the data public in the first place. Kinda hard to enumerate things when the only result of attempting to access them is a 404.

worksonmine · on Feb 19, 2024

> I didn't

Then how would you iterate it? You said:

>> iterate over alphanumerics instead of just numerics

I assume you mean kind of like youtube IDs where 123xyz is converted to numerics. I wouldn't call brute-forcing iterating, at least not in the sense we're discussing here.

> They're 128 bits (for UUIDv4 specifically; other varieties in common use are less random). That's a pretty good amount of entropy, but far from insurmountable.

If you generate a billion UUIDs every second for 100 years you have a 50% chance to have 1 (one!) collision. It's absolutely useless to try to guess even a small subset.

> And you don't even need to brute-force anything; if these IDs are public, then they're almost certainly being used elsewhere, wherein they can be captured.

Sessions can be hijacked anyway, so why not leave the user session db exposed to the web unprotected. Right? There will always be holes left when you fix one. That doesn't mean you should just give up and make it too easy.

> And this is assuming the attacker cares about the specific case of attacking all users. If the attacker is targeting a specific user, or just wants to attack the first user one finds, then none of that entropy matters in the slightest.

So your suggestion is to leave them all incrementing instead, do I understand you correctly?

> Put simply: there are enough holes in the "random IDs are a meaningful security measure" argument for it to work well as a strainer for my pasta.

It's such a simple thing to solve though that it doesn't really matter.

> If I don't want a competitor to enumerate it then I wouldn't make the data public in the first place. Kinda hard to enumerate things when the only result of attempting to access them is a 404.

There are lots of things you may want to have public but not easily crawlable. There might not even be the concept of users and restrictions. A product returning 404 for everything to everyone might not be very useful to anyone. You've been given plenty of examples in other comments, and I am sure you understand the points.

One example could be recipes, and you want to protect your work while giving real visitors (not users, there are no users to hack) the ability to search by name or ingredients. With incremental IDs you can scrape them all no problem and steal all that work. With a UUID you have to guess either the UUID for all, or guess every possible search term.

Another could be a chat app, and you don't want to expose the total number of users and messages sent. If the only way to start a chat is knowing a public key and all messages have a UUID as PK how would you enumerate this? With incrementing IDs you know you are user 1337 and you just sent message 1,000,000. With random IDs this is impossible to know. Anyone should still be able to add any user if the public key is known, so 404 is no solution.

I'm sure you'll have something to say about that as well. The point is to make it difficult to abuse, while still being useful to real visitors. I don't even understand your aversion to random IDs, are they so difficult to implement? What's the real problem?

yellowapple · on Feb 19, 2024

> Then how would you iterate it?

Like with any other number. UUIDs are just 128-bit numbers, ranging from 00000000-0000-0000-0000-000000000000 to FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF.

I'll concede that iterating through the entirety of that range would take a very long time, but this still presumes that said iteration in its entirety is necessary in the first place.

> If you generate a billion UUIDs every second for 100 years you have a 50% chance to have 1 (one!) collision.

Maybe, if they're indeed randomly-generated. Are they indeed UUIDv4? Is your RNG up to snuff?

And the probability doesn't need to be as high as 50% to be a tangible danger. A 0.5% chance of an attacker guessing a UUID to exploit is still more than enough to consider that to be insufficient as a layer of security. Meanwhile, you're likely fragmenting your indices and journals, creating performance issues that could be exploited for a denial-of-service. Tradeoffs.

> Sessions can be hijacked anyway, so why not leave the user session db exposed to the web unprotected.

Hijacking a session is harder than finding a publicly-shared ID floating around somewhere.

> So your suggestion is to leave them all incrementing instead, do I understand you correctly?

My suggestion is to use what makes sense for your application. That might be a random number, if you've got a sharded database and you want something resistant to collisions. That might be a sequential number, if you need cache locality or simply want to know at a glance how many records you're dealing with. It might be a number with sequential high digits and random low digits, if you want the best of both worlds. It might be a text string if you just care about human readability and don't care about performance. It might even be multiple fields used together as a composite primary key if you're into that sort of thing.

My point, as was obvious in my original comment and pretty much every descendant thereof, is that of the myriad useful factors around picking a primary key type, security ain't really among them.

> There might not even be the concept of users and restrictions.

In which case everything served from the database should be treated as public - including the quantities thereof.

> You've been given plenty of examples in other comments, and I am sure you understand the points.

Correct, and if those examples were satisfactory for arguing a security rationale for avoiding sequential IDs, then I wouldn't still be here disagreeing with said arguments, now would I?

> One example could be recipes, and you want to protect your work while giving real visitors (not users, there are no users to hack) the ability to search by name or ingredients. With incremental IDs you can scrape them all no problem and steal all that work. With a UUID you have to guess either the UUID for all, or guess every possible search term.

You don't need to guess every possible search term. How many recipes don't use the most basic and common ingredients? Water, oil, sugar, salt, milk, eggs... I probably don't even have to count with my toes before I've come up with a list of search terms that would expose the vast majority of recipes for scraping. This is peak security theater.

> Another could be a chat app, and you don't want to expose the total number of users and messages sent. If the only way to start a chat is knowing a public key and all messages have a UUID as PK how would you enumerate this?

Dumbest way would be to DDoS the thing. At some number of concurrent sessions and/or messages per second, it'll start to choke, and that'll give you the upper bound on how many people are using it, at least at a time.

Smarter way would be to measure request times; as the user and message tables grow, querying them takes longer, as does inserting into them if there are any indices to update in the process - even more so when you're opting for random IDs instead of sequential IDs, because of that aforementioned index and journal fragmentation, and in general because they're random and therefore entail random access patterns.

> I don't even understand your aversion to random IDs, are they so difficult to implement? What's the real problem?

My aversion ain't to random IDs in and of themselves. They have their place, as I have repeatedly acknowledged in multiple comments in this thread (including to you specifically).

My aversion is to treating them as a layer of security, which they are not. Yes, I'm sure you can contrive all sorts of narrowly-tailored scenarios where they just so happen to provide some marginal benefit, but in the real world "sequential IDs are insecure" is the sort of cargo-cultish advice that's impossible to take seriously when there are countless far-bigger fish to fry.

My additional aversion is to treating them as a decision without cost, which they also are not. As touched upon above, they carry some rather significant performance implications (see e.g. https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...), and naïvely defaulting to random IDs everywhere without considering those implications is a recipe for disaster. There are of course ways to mitigate those impacts, like using UUIDs that mix sequential and random elements (as that article advocates), but this is at the expense of the supposed "security" benefits (if you know how old a record is you can often narrow the search range for its ID), and it still requires a lot more planning and design and debugging and such compared to the tried-and-true "just slap an autoincrementing integer on it and call it a day".

worksonmine · on Feb 19, 2024

> I'll concede that iterating through the entirety of that range would take a very long time

You don't say?

> but this still presumes that said iteration in its entirety is necessary in the first place.

It is, because it's compared to sequential IDs where you know exactly the start and end. No way of knowing with UUIDs.

> Maybe, if they're indeed randomly-generated. Are they indeed UUIDv4? Is your RNG up to snuff?

Stop constantly moving the goalposts, and assume they're used correctly, Jesus Christ. Anytime someone talks about UUID it's most likely v4, unless you want non-random/deterministic v5 or time-sortable v7. But the most common is v4.

> Hijacking a session is harder than finding a publicly-shared ID floating around somewhere.

Even firebase stores refresh tokens accessible by javascript (localstorage as opposed to HTTPOnly). Any extension with sufficient users is more viable than finding a single collision of a UUID. Programs with access to the cookies on the filesystem, etc. It's much easier to hijack sessions than guessing UUIDs.

> Dumbest way would be to DDoS the thing. At some number of concurrent sessions and/or messages per second, it'll start to choke, and that'll give you the upper bound on how many people are using it, at least at a time.

Won't tell you a single thing. Might be a beefy server serving 3 users, or a slow server with lots of downtime for millions of users. They may all be using it at once, or few of them sporadically. No way of knowing which. It's a retarded guesstimate. But this only shows the mental gymnastics you're willing to use to be "right". Soon you'll tell me "I'll just ask my contact at the NSA".

> sequential IDs are insecure

Nobody claimed this. They can be enumerated which may or may not be a problem depending on the data. Which I said in my first comment, and you seem to agree. This entire thread was a complete waste of my time.

yellowapple · on Feb 19, 2024

> This entire thread was a complete waste of my time.

The feeling is mutual. Have a nice day.

JimBlackwood · on Feb 18, 2024

I follow this best practice, there’s a few reasons why I do this. It doesn’t have to do with using a guessed primary ID for some sort of privilege escalation, though. It has more to do with not leaking any company information.

When I worked for an e-commerce company, one of our biggest competitors used an auto-incrementing integer as primary key on their “orders” table. Yeah… You can figure out how this was used. Not very smart by them, extremely useful for my employer. Neither of these will allow security holes or leak customer info/payment info, but you’d still rather not leak this.

tomnipotent · on Feb 18, 2024

> extremely useful for my employer.

I've been in these shoes before, and finding this information doesn't help you as an executive or leader make any better decisions than you could have before you had the data. No important decision is going to be swayed by something like this, and any decision that is probably wasn't important.

Knowing how many orders is placed isn't so useful without average order value or items per cart, and the same is true for many other kinds of data gleamed from this method.

JimBlackwood · on Feb 18, 2024

That’s not correct. Not every market is the same in it’s dynamics.

Yes, most of the time that information was purely insightful and was simply monitored. However, at some moments it definitely drove important decisions.

tomnipotent · on Feb 18, 2024

Like what?

What's going to change how a team develops (physical) products? What's a merchandiser or buyer going to learn that influences how they spend millions of dollars or deal with X weeks-on-hand of existing inventory? What's an operations manager going to learn that improves their ability to warehouse and ship product? How's marketing going to change their strategy around segmentation or channel distribution? What's a CEO going to learn that changes what departments or activities they want to invest in?

At best you get a few little tidbits of data you can include in presentations or board decks, but nothing that's going to influence critical decisions on how money is getting spent to get the job done or how time is getting allocated to projects. Worst case you have a inexperienced CEO that's chasing rather than leading, and just end up copying superficial aspects of your competitors without the context or understanding of why they did what they did.

I've called up execs at competitors and had friendly chats that revealed more about their business in 30 minutes than you could possibly find out through this "method".

kehers · on Feb 18, 2024

One good argument I found [^1] about not exposing primary keys is that primary keys may change (during system/db change) and you want to ensure users have a consistent way of accessing data.

[^1]: https://softwareengineering.stackexchange.com/questions/2183...

dalore · on Feb 19, 2024

It's also exposes your growth metrics. When using sequential id's one can tell how many users you have, how many users a month you are getting and all sorts of useful stuff that you probably don't want to expose.

It's how the British worked out how many tanks the German army had.

SkyMarshal · on Feb 18, 2024

> This is especially important when you use sequential auto-incrementing identifiers with type integer or bigint since they are guessable.

I thought we had long since moved past that to GUIDs or UUIDs for primary keys. Then if you still need some kind of sequential numbering that has meaning in relation to the other fields, make a separate column for that.

sgarland · on Feb 20, 2024

Except now people are coming back around, because they’re realizing (as the article mentions) that UUID PKs come with enormous performance costs. In fairness, any non-k-sortable ID will suffer the same fate, but UUIDs are the most common of the bunch.

strzibny · on Feb 19, 2024

Whether you think it's a real problem or not, if you want to solve it somehow, I compiled the current options one have in Rails apps:

https://nts.strzibny.name/alternative-bigint-id-identifiers-...

cyberfart · on Feb 19, 2024

There are also reasons outside infosec concerns. For example where such PKs would be directly related to your revenue, such as orders in an e-commerce platform. You wouldn't want competitors to have an estimate of your daily volume, that kind of thing.

mnahkies · on Feb 18, 2024

It can be really handy for scraping/archiving websites if they're kind enough to use a guessable id

heax · on Feb 18, 2024

It really depends but useful knowledege can be derived from this. If user accounts use sequential ids the id 1 is most likely the admin account that is created as first user.