I adore tagging systems and have worked on them in several different applications and implementations, but there are always pitfalls and trade-offs, and it’s possible to bury yourself.
Nowadays I nearly always store the assigned tags as an integer array column in Postgres, then use the intarray extension to handle the arbitrary boolean expression searches like “((1|2)&(3)&(!5))”. I still have a tags table that stores all the metadata / hierarchy / rules, but for performance I don’t use a join table. This has solved most of my problems. Supertags just expand to OR statements when I generate the expression. Performance has been excellent even with large tables thanks to pg indexing.
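A minimal sketch of what this looks like with intarray (table and column names are hypothetical; the `@@` operator and `query_int` type come from the extension itself):

```sql
CREATE EXTENSION IF NOT EXISTS intarray;

-- Tag IDs live directly on the row as an int[] column; no join table.
CREATE TABLE items (
    id   bigserial PRIMARY KEY,
    name text NOT NULL,
    tags int[] NOT NULL DEFAULT '{}'
);

-- "(1 or 2) and 3 and not 5", evaluated against each row's tags array:
SELECT id, name
FROM items
WHERE tags @@ '(1|2) & 3 & !5'::query_int;
```

The expression string is exactly the kind of thing you can generate from a user-facing tag search, expanding supertags into `(child1|child2|...)` groups before casting to `query_int`.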
Yes, it’s explained in the intarray doc here. GiST is the one I use, but as it states GIN should be faster on reads. I haven’t really thought about that in many years, I should run some perf tests.
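For reference, the intarray extension ships both operator classes, so the choice is a one-line difference (assuming the hypothetical `items` table above):

```sql
-- GiST: smaller index, cheaper updates; gist__intbig_ops is the
-- lossy variant intended for larger arrays.
CREATE INDEX items_tags_gist ON items USING gist (tags gist__int_ops);

-- GIN: typically faster reads, more expensive writes.
CREATE INDEX items_tags_gin ON items USING gin (tags gin__int_ops);
```

Both support the `@@` query_int searches, so a perf test is mostly a matter of building each and comparing.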
Read perf would be good in isolation, sure, but what about lock contention on those rows from the frequent writes to the tags column? "Taggings" usually have a much higher rate of change than the objects they tag. If you're not at least keeping a separate has-one thing_taggings table (thing bigserial, tags intarray) apart from your things table, I could see this degrading performance for any query that wants to touch the table.
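The split being suggested might look like this (names hypothetical); the point is that hot tag writes rewrite only the narrow taggings row, not the wide object row:

```sql
CREATE TABLE things (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
    -- ...other, rarely-changing columns...
);

-- One-to-one side table holding only the volatile tag data.
CREATE TABLE thing_taggings (
    thing bigint PRIMARY KEY REFERENCES things (id) ON DELETE CASCADE,
    tags  int[] NOT NULL DEFAULT '{}'
);
```

Queries that filter on tags then join `thing_taggings` back to `things`, while queries that never touch tags avoid the churned rows entirely.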
Can you search by tag expressions like “((1|2)&(!3))” with your method? I’m not understanding it completely. I often use json_data but I didn’t consider tagging methods because of the overhead it has.
The tradeoff here is that you lose the foreign key constraint, correct? So if you delete a tag, there is no way for the database to automatically remove all references to it. Or is there some way to do this now?
> So if you delete a tag, there is no way for the database to automatically remove all references to it.
I'm not sure about implementation/support in Postgres specifically, but in the general case of a column of tag bitfields, the database could easily maintain a global popcount (i.e., "number of rows with this tag") and a soft-delete flag for each tag, and clear any soft-deleted tags on access (possibly only on write access). When a soft-deleted tag's popcount reaches zero, it counts as garbage collected and its ID can be reused for a new tag.
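A sketch of that bookkeeping in plain SQL (all names hypothetical; this assumes the intarray `-` operator for stripping elements from an int[]):

```sql
CREATE TABLE tags (
    id       serial  PRIMARY KEY,
    name     text    NOT NULL,
    popcount bigint  NOT NULL DEFAULT 0,      -- rows currently carrying this tag
    deleted  boolean NOT NULL DEFAULT false   -- soft-delete flag
);

-- On write access to a row, lazily strip any soft-deleted tags;
-- ARRAY(...) yields an empty array when nothing is soft-deleted.
UPDATE items
SET tags = tags - ARRAY(SELECT id FROM tags WHERE deleted)
WHERE id = 42;

-- A soft-deleted tag whose popcount has drained to zero is garbage
-- and its ID can be handed out again.
SELECT id FROM tags WHERE deleted AND popcount = 0;
```

Maintaining `popcount` itself would need triggers or application-side counting; this only shows the shape of the scheme.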
But then again, the need to actually _delete_ a tag is rare. What you really want is for searches on that tag to just return nothing, and that's more of an application-level responsibility. Your #absolutely_vile_tag_that_would_get_you_in_jail just enters a blacklist, and you're done.
Yes, but that’s easily handled with a trigger. My first implementation actually had a regular join table of items_tags which used a trigger to update the items.tags intarray. Wasn’t super performant but let us use our existing templates for 1-many to implement the UI.
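A sketch of that trigger setup (table and column names hypothetical): a conventional `items_tags` join table remains the source of truth, and an AFTER trigger rebuilds the denormalized `items.tags` array on every change.

```sql
CREATE OR REPLACE FUNCTION sync_item_tags() RETURNS trigger AS $$
BEGIN
    -- Rebuild the affected item's int[] from the join table.
    UPDATE items
    SET tags = ARRAY(SELECT tag_id
                     FROM items_tags
                     WHERE item_id = COALESCE(NEW.item_id, OLD.item_id)
                     ORDER BY tag_id)
    WHERE id = COALESCE(NEW.item_id, OLD.item_id);
    RETURN NULL;  -- result ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_tags_sync
AFTER INSERT OR UPDATE OR DELETE ON items_tags
FOR EACH ROW EXECUTE FUNCTION sync_item_tags();
```

Rebuilding the whole array per change is simple but not the fastest option, which matches the "wasn't super performant" caveat above; the upside is that ordinary 1-many CRUD against `items_tags` keeps the search column correct automatically, and foreign keys (with ON DELETE CASCADE) on the join table restore the referential integrity the bare int[] column loses.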
Nowadays you can just use a tagging component with integrated search for the UI.