When deciding recently whether to use CHECK ('a', 'b', 'c') vs ENUM, I believe a search/LLM-query stated that it was easier to change a CHECK's values later and not easy for ENUM, so that's what I went with.
As for a lookup table, truly curious, is it worth the complexity of the foreign reference and join?
Please read source docs instead of relying on LLMs, especially for RDBMS. I’ve found they quite often get something subtly wrong; for example, recommending that the PK be added to a secondary composite index in MySQL - this is entirely unnecessary, because all secondary indices in MySQL implicitly include the PK.
> lookup table worth it
Is not doing it worth the risk of referential integrity violations? How important is your data to you? You can say, “oh, the app will handle that” all you want, but humans are not perfect, but RDBMS is as close as you’re ever going to come to it. I have seen orphaned rows and referential violations at every company I’ve been at that didn’t enforce foreign key constraints.
There is a performance hit at scale to not doing it, also: imagine you have a status column with some ENUM-esque values, like CANCELED, APPROVED, etc. If stored as TEXT or VARCHAR, that’s N+(1-2 bytes) per string. At the hundreds of millions or billions of rows scale, this adds up. Storage is cheap, but memory isn’t, and if you’re wasting it on repeated text strings, that’s a lot fewer rows per page you can fit, and so more disk access is required. JSON objects are the same, since both MySQL and Postgres only shift large blob-type objects off-page after a certain threshold.
What you said does matter at scale. As an alternate perspective, for an initial, smaller app, I could see the CHECK method for e.g. processing/completed providing the same correct value constraints as an ENUM or a lookup table, with more flexibility, speed and less complexity. And can be refactored later if needed.
I legitimately don’t understand why a lookup table is ever seen as adding any meaningful amount of complexity.
SELECT a.foo, b.bar
FROM alpha a
JOIN bravo b ON
a.id = b.alpha_id
The speed hit from an INNER JOIN on indexed columns is negligible, and will be dwarfed by network overhead for most places. As to flexibility, if you need a new value, insert it.
for postgres ENUM should be just as easier to change as CHECK
* for adding, there is no problem you can just add entries to an ENUM
* for removing there is a problem because you can't easily remove entries from an ENUM. its only possible to create a new enum type and then change the column type but that is going to cause problems with big tables. however, now your ENUM solution decays to a CHECK+ENUM solution. so it is not really any worse than a CHECK solution.
also, it is possible to add new CHECK constraints to a big table by marking the constraint as 'NOT VALID'. existing rows will not be checked and only new rows will be checked.
As for a lookup table, truly curious, is it worth the complexity of the foreign reference and join?