Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I adore tagging systems and have worked on them in several different applications and implementations, but there are always pitfalls and trade offs, and it’s possible to bury yourself

Nowadays I nearly always store the assigned tags as an integer array column in Postgres, then use the intarray extension to handle the arbitrary boolean expression searches like “((1|2)&(3)&(!5))”. I still have a tags table that stores all the metadata / hierarchy / rules, but for performance I don’t use a join table. This has solved most of my problems. Supertags just expand to OR statements when I generate the expression. Performance has been excellent even with large tables thanks to pg indexing.



Do you index arrays? What index type is that? Any tips?

I’ve used array column in PG before, haven’t indexed arrays though.


AFAIK, postgres first got its reputation of high performance because of array indexes.

People usually go with GIN indexes, that can be used on the contains, overlaps or equals comparisons.


Yes, it’s explained in the intarray doc here. GiST is the one I use, but as it states GIN should be faster on reads. I haven’t really thought about that in many years, I should run some perf tests.

https://www.postgresql.org/docs/9.1/intarray.html


Read perf would be good in isolation, sure; but what about the lock contention over those rows due to the frequent writes to the tags column? Usually "taggings" have a much higher rate-of-change than the objects they tag. If you're not at least keeping a thing_taggings has-one table (thing bigserial, tags intarray) separate from your things table, I could see this degrading performance for any query that wants to touch the table.


I just made tags nosql, it's stored as json data in pg with a user set to each tag.

I think the relation was that each user could have mutlitple posts that each contain multiple tags.

Where each tag is global with an user ids stored with the user Id as the key. Or something approximatating that.

Mainly to not need insane queries for many to many relationships. O(1) baby.


Can you search by tag expressions like “((1|2)&(!3))” with your method? I’m not understanding it completely. I often use json_data but I didn’t consider tagging methods because of the overhead it has.


The tradeoff here is that you lose the foreign key constraint, correct? So if you delete a tag, there is no way for the database to automatically remove all references to it. Or is there some way to do this now?


> So if you delete a tag, there is no way for the database to automatically remove all references to it.

I'm not sure about implementation/support in Postgres specifically, but in the general case of a column of tag bitfields, the database could easily maintain a global popcount (ie, "number of rows with this tag") and soft-delete flag for each tag, and clear any soft-deleted tags on (possibly-only-write-)access. When a soft-deleted tag reaches popcount == zero, it counts as garbage collected and can be reused for a new tag.


But then again, the reasons for _deleting_ a tag are very low. What really happens is that you want your searches on that tag to just return nothing, and that's more of an application level responsibility. Your #absolutely_vile_tag_that_would_get_you_in_jail just enters a blacklist, and you're done.


Yes, but that’s easily handled with a trigger. My first implementation actually had a regular join table of items_tags which used a trigger to update the items.tags intarray. Wasn’t super performant but let us use our existing templates for 1-many to implement the UI.

Nowadays you can just use a tagging component with integrated search for the UI.


Right . More like nosql FKs.

How high is the business risk if you have a random tag with no name? Skip it’s display jn the UI


Would you mind sharing a simple example that demonstrates this? Sounds great!




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: