Personally - I've come to the absolute opposite opinion. To be overly blunt:
"Tags fucking suck."
They are literally the worst possible way to store and organize your information, and they are only useful when you just want a random sampling of a category - not a specific document or piece of information. Ex: Great for social media or looking at old photos or just playing a song from a genre you like, bad (fucking terrible) for organization and structure.
---
Hierarchical structures have downsides, but the exact thing you complain about (artifacts of the physical world) is exactly their strength... You have a body that is adapted to the physical world - routing and navigation through a series of ordered steps is a VERY well developed human skill. We are primed to be able to remember things like:
- Go left at the tree,
- Straight until you hit road
- Right at the road
- continue until you hit a red house with a big garden
- etc...
That skill set maps directly into the hierarchical system of folders:
- Find the "documents" folder on the desktop
- scroll down to "my super sweet project"
- open that folder
- Find the "icons" folder
- open it and double click "exactly_the_thing_you_wanted.jpg"
------
You can absolutely still make horrible, unorganized messes - but if done well (ex: this article is actually a fairly good system) it's a much, much better system than tags.
Your example about navigating roads has nothing to do with hierarchy. And, in fact, most road networks are not hierarchical and the interconnectedness is their strength.
Your brain doesn't organize information hierarchically. Let's say I ask you:
1. Name a band that starts with "B".
2. Name a band from England.
3. Name a rock band.
If your brain stored bands in a hierarchy, you'd only be able to come up with "The Beatles" as an answer for one of those questions. You'd have to figure out whether to categorize the Beatles by name, location, or genre and it would be absent from the other categories.
Or you'd have to do an inefficient search in order to find something that matched, which would be slow, but not impossible.
Or you'd have to maintain several redundant hierarchies.
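Those three options can be sketched in a few lines of Python. This is only an illustration: the band list, the choice of `genre` as the single hierarchy key, and the helper names are all made up for the example.

```python
# Hypothetical data: each band has several independent attributes.
bands = [
    {"name": "The Beatles", "country": "England", "genre": "rock"},
    {"name": "Blondie", "country": "USA", "genre": "rock"},
    {"name": "Bauhaus", "country": "England", "genre": "goth"},
]

# Option 1: a single hierarchy. Each band is filed under its genre only.
by_genre = {}
for band in bands:
    by_genre.setdefault(band["genre"], []).append(band["name"])

def find_by_country_scan(country):
    # Option 2: the hierarchy has no "country" level, so scan everything.
    return [b["name"] for b in bands if b["country"] == country]

# Option 3: one redundant index per attribute (which is basically tagging).
indexes = {}
for band in bands:
    for attr in ("country", "genre"):
        indexes.setdefault((attr, band[attr]), set()).add(band["name"])

# With per-attribute indexes, combined questions become set intersections.
english_rock = indexes[("genre", "rock")] & indexes[("country", "England")]
```

The genre-only hierarchy answers "name a rock band" directly, but anything else needs either the scan or the extra indexes.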
(I agree with you that our subjective experience and speed in thinking of things is evidence that we probably don't mentally represent things this way.)
I strongly agree with the commentator who likened the hierarchal folder structure to the physical world, it’s a much more direct mapping of how human memory actually works.
Humans aren’t actually magical AI computers of energy floating in midair, they’re made of physical meat. Even if some abstract concepts (like tags) may make more theoretical sense (I agree with people who say that certain things can be classified in 2 different locations), it may not play to the actual structure and advantages of the human brain.
> I strongly agree with the commentator who likened the hierarchal folder structure to the physical world, it’s a much more direct mapping of how human memory actually works.
But the physical world isn't hierarchical at all. It's spatial. It's much more like a graph than a tree where there are usually multiple paths between any two points.
If you have to pick up your kid from school and stop at the grocery store for milk on the way home from work, you probably do not:
1. Drive to school and get kid.
2. Drive back to work.
3. Drive to grocery store to get milk.
4. Drive back to work.
5. Drive home from work.
Or:
1. Drive to school and get kid.
2. Drive to grocery store to get milk.
3. Drive back to school.
4. Drive back to work.
5. Drive home from work.
If the physical world was hierarchical, all navigation through multiple waypoints would look like this kind of stack pushing and popping.
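Here is a rough Python sketch of that push/pop behaviour, assuming a made-up tree where school, grocery store, and home all hang off work. The `parent` map and function names are hypothetical, chosen only to mirror the first list above.

```python
# Hypothetical tree: every place has exactly one parent, "work" is the root.
parent = {"work": None, "school": "work", "grocery": "work", "home": "work"}

def path_to_root(node):
    # Follow parent links upward until the root is reached.
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def tree_route(start, goal):
    # The only route in a tree: climb to the lowest common ancestor, then descend.
    up = path_to_root(start)
    down = path_to_root(goal)
    common = next(n for n in up if n in down)
    return up[: up.index(common) + 1] + down[: down.index(common)][::-1]

def trip(waypoints):
    # Chain the tree routes between consecutive waypoints.
    route = [waypoints[0]]
    for a, b in zip(waypoints, waypoints[1:]):
        route += tree_route(a, b)[1:]
    return route

full_trip = trip(["work", "school", "grocery", "home"])
```

In this toy tree every leg between two leaves is forced back through `work`, which is the stack-like pattern described above.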
I'm telling you that all navigation through multiple waypoints DOES usually look like this kind of pushing and popping (just on a massive scale).
So here's a possible day for me:
I work at corporate office A, it's near the highway entrance. I have to pick up my kid - they are at school down the local street heading west. I travel west and pick up my child.
Now I need milk. The closest grocery is back east, just past my office, so I drive back by my office and pull into the grocery.
Then I load up and set off for home. To get there, I need to take the highway to the north, so I head back past my office on that same street and get on the highway using the closest entrance.
I take the highway until I'm home.
---
That sure seems like a normal day to me. It's exactly what you said folks would never do, but it's super common. And it's hardly something the modern world introduced with cars - there's a cost function to travelling anywhere in the world, and people like to connect using low cost paths - which tends to model a folder hierarchy.
Sure, some routes end up being tree-like, because trees are a subset of graphs. But just as often you see waypoints like:
1. Leave the office.
2. Drive to the grocery store.
3. Drive to school.
4. Drive home.
Where there is no backtracking between them.
> And it's hardly something the modern world introduced with cars - there's a cost function to travelling anywhere in the world, and people like to connect using low cost paths - which tends to model a folder hierarchy.
A tree doesn't minimize the cost for any given trip or for the aggregate cost of all trips between pairs of points. Because a tree has only a single path between any two points, it has the highest possible aggregate trip cost for all possible trips while still being connected.
What it does minimize is the cost of building and maintaining the paths. Since there is only a single path between any pair of points, it has the fewest redundant edges. If you were tasked with building a road network for a country and your sole goal was to minimize the amount of concrete used, you'd build a tree.
If your only goal was to minimize the aggregate distance all travellers took, you'd build a fully-connected graph where every pair of destinations has a dedicated road.
In practice, road networks are designed to minimize both road maintenance costs and drive time and balance those opposing forces. The result is more connected than a tree but less connected than a complete graph, something like a semilattice.
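To put rough numbers on that trade-off, here is a small Python sketch comparing one tree against a complete graph on five made-up towns. It assumes every road has the same length (so cost is counted in hops), and it uses a simple chain as the tree, which is just one of many possible trees.

```python
from collections import deque
from itertools import combinations

def total_hops(nodes, edges):
    """Sum of shortest-path hop counts over all unordered pairs of nodes."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    total = 0
    for src in nodes:
        # Breadth-first search gives shortest hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice

nodes = ["A", "B", "C", "D", "E"]             # hypothetical towns
chain_tree = list(zip(nodes, nodes[1:]))      # a tree: 4 roads
complete = list(combinations(nodes, 2))       # every pair linked: 10 roads

tree_cost = total_hops(nodes, chain_tree)
complete_cost = total_hops(nodes, complete)
```

The tree needs the fewest roads but the trips are longer in aggregate; the complete graph is the reverse. Real networks sit somewhere in between.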
It seems that navigation memory theory should imply not a hierarchical structure, but a wiki-like structure with many links. In a tree, there’s only one path to a given element, which is not the case in the physical world.
> Navigation memory is the most core type of memory- most other forms of memory evolved later. There’s a reason why GPS usage is correlated with dementia. Human memory actually evolved out of a sense of navigation.
That seems very possible, and probably important, but it's hard for me to relate that to the experience (as an "anatomically modern human") of having other kinds of associative memory that are very effective and don't have a discernible spatial or other hierarchical component.
I agree. But are there any better solutions than manually ln -s? I'm in a band, and also manage booking for a venue. I have $venue/poster/$date\ $bands/$posterfile. I also have $band/poster/$date\ $venue
I don't know of any system that lets a single poster be in multiple places at the same time.
If you want to model this using your filesystem, that's exactly why symlinks (and their rough equivalents, shortcuts on Windows and aliases on Mac) were invented.
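For what it's worth, the manual `ln -s` step is easy to script. Below is a minimal Python sketch using `os.symlink`; the directory names, date, and file name are placeholders invented for the demo, not the layout from the comment above, and it runs in a throwaway temp directory.

```python
import os
import tempfile

root = tempfile.mkdtemp()  # throwaway directory for the demo

# Hypothetical layout: one tree per venue, one tree per band.
venue_dir = os.path.join(root, "venue", "poster", "2024-01-01 some-band")
band_dir = os.path.join(root, "band", "poster")
os.makedirs(venue_dir)
os.makedirs(band_dir)

# The one real copy of the poster lives under the venue.
real_file = os.path.join(venue_dir, "poster.pdf")
with open(real_file, "w") as f:
    f.write("poster data")

# The band's tree gets a symlink that points at the same file.
link = os.path.join(band_dir, "2024-01-01 some-venue.pdf")
os.symlink(real_file, link)
```

Note that on Windows creating symlinks may require extra privileges, so treat this as a sketch for Unix-like systems.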
On Mac, you can write tags on files and then use Spotlight to search for them. Pick one (more or less arbitrary) primary category to use as the directory for the file, then write tags for the other ways you want to be able to search for it.
Tags are superior because tags can model hierarchies, but hierarchies cannot model tags. There are far too many times when a single document crosses multiple categories that are served by tags. I used Outlook for 15+ years and thought tags were a joke, then moved to GSuite for 13 years and learned to use tags, now I'm back on Outlook and I feel like I'm suffocating without them. That's two decades of experience with both systems. Not to make a fallacy / whizzing contest out of this, but how long have you tried both systems? I'm guessing not as long.
> Tags are superior because tags can model hierarchies
Tags are inferior because tags must be coerced into hierarchies.
Tags are inferior because they do not properly link hierarchies that they model without extensive software support (which is present for file directories by design, and absent for tags). I have yet to see a hierarchical tagging scheme work well when you need to do something like change a mid-level directory name (you end up having to rewrite many tags, often without good software support for what you're trying to do).
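A small Python sketch of the rename problem, assuming hierarchical tags are stored as slash-separated strings on each file. The file names, tags, and the `rename_tag_prefix` helper are all hypothetical.

```python
# Hypothetical files carrying slash-separated hierarchical tags.
file_tags = {
    "a.jpg": {"art/commissions/2023", "status/done"},
    "b.jpg": {"art/commissions/2024"},
    "c.jpg": {"art/personal"},
}

def rename_tag_prefix(file_tags, old, new):
    """Rename a mid-level tag by rewriting every tag at or below it.

    Returns how many individual tags had to be rewritten.
    """
    rewritten = 0
    for name, tags in file_tags.items():
        updated = set()
        for tag in tags:
            if tag == old or tag.startswith(old + "/"):
                tag = new + tag[len(old):]
                rewritten += 1
            updated.add(tag)
        file_tags[name] = updated  # same keys, so safe while iterating
    return rewritten

count = rename_tag_prefix(file_tags, "art/commissions", "art/client-work")
```

With real directories the equivalent change is a single rename of one folder; with string tags like these, every affected tag on every file has to be touched.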
Tags themselves are fine. It's a perfectly valid way to label data. It is not a good way to organize that data for human recall and reference.
And here I am, using Johnny Decimal for over five years and I can find everything all the time. As Johnny himself said below, if it doesn't work for you - that's cool - use something else. But your assertion that this can't work is not correct. It's just that it can't work for YOU.
Hierarchies are better because they form a natural hypertext.
I'm in my documents folder. I see a list of all the categories of stuff I have. Whatever I'm looking for, it's in one of them. I go into a folder, and I see all the categories in that folder and none of the stuff outside of it. I've narrowed my focus and increased my depth. I can browse.
Sure, tags are more flexible, but (1) I find I almost never actually need them, because in most cases a hierarchy is good enough, and (2) tags don't function as a hypertext and won't let me explore. A big list of tags is much harder to dig through than nested folders.
Granted, it doesn't stop at tags or hierarchy. You can use both—on top of which, there are hierarchical tags, soft links, hard links, and even textual hyperlinks. But out of all of these, I find hierarchy to be the most important one. Given the choice among all of them, I always start with hierarchy and I typically find I don't need anything else.
I’ve thought a bit about tags++, that is adding some logical and not-so-logical features to them.
For instance, there are ideas from OWL where you could define a category in terms of other categories and their attributes: tag D could be the union of tag A and tag B, intersected with the complement of tag C.
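With plain Python sets, that kind of derived tag is a one-liner. The document names and tag contents here are invented, and this reads "union of A and B and the complement of C" as (A ∪ B) minus C.

```python
# Hypothetical tag -> set of tagged documents.
tagged = {
    "A": {"doc1", "doc2"},
    "B": {"doc2", "doc3"},
    "C": {"doc3", "doc4"},
}

# Derived tag D: everything in A or B that is not in C.
tag_d = (tagged["A"] | tagged["B"]) - tagged["C"]
```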
Implication is also useful both as a way to implement subclassing but also containment relationships. For instance on Danbooru a character that has several forms would have the various forms of the character imply that character and the character would imply the media property that the character comes from.
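A minimal sketch of tag implication in Python, not how Danbooru actually implements it. The tag names and the `implies` table are placeholders; the idea is just to keep following implications until nothing new turns up.

```python
# Hypothetical implication table: tag -> tags it implies.
implies = {
    "hero (armored form)": {"hero"},
    "hero (casual form)": {"hero"},
    "hero": {"example series"},
}

def expand(tags):
    """Return the given tags plus everything they transitively imply."""
    result = set(tags)
    stack = list(tags)
    while stack:
        for implied in implies.get(stack.pop(), ()):
            if implied not in result:
                result.add(implied)
                stack.append(implied)
    return result

expanded = expand({"hero (armored form)"})
```

Tagging only the specific form is enough; the character and the series come along for free.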
I am looking at what a tagging system looks like in the transformer age, and one key idea is a kind of three-valued logic around tags, which can be in a “positive”, “indeterminate”, or “negative” state. If you are training a machine learning system to auto-tag, you will need (1) a number of examples where a tag does not apply (the tag not being applied is not evidence that the tag doesn’t apply; poor coverage of negative examples is one reason why YouTube recommendation is worse than TikTok) and (2) a way to deal with cases where the ML model tags something incorrectly. If the model tagging something puts it in an indeterminate polarity, and that result can later be switched to negative or positive, that is a great way to manage the situation.
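One way this could look in code, as a rough Python sketch. The `TagStore` class and its method names are made up for illustration; the only point is that "no decision yet" is a distinct state from "confirmed not applicable".

```python
from enum import Enum

class Polarity(Enum):
    POSITIVE = "positive"
    INDETERMINATE = "indeterminate"
    NEGATIVE = "negative"

class TagStore:
    def __init__(self):
        self._state = {}  # (item, tag) -> Polarity

    def get(self, item, tag):
        # A missing record means "unknown", not "does not apply".
        return self._state.get((item, tag), Polarity.INDETERMINATE)

    def model_suggests(self, item, tag):
        # A model guess is recorded as indeterminate and never
        # overrides an earlier human decision.
        self._state.setdefault((item, tag), Polarity.INDETERMINATE)

    def human_confirms(self, item, tag, applies):
        # A human review settles the polarity either way.
        self._state[(item, tag)] = (
            Polarity.POSITIVE if applies else Polarity.NEGATIVE
        )

    def training_examples(self, tag):
        # Only settled decisions are usable as positive/negative examples.
        pos = [i for (i, t), p in self._state.items()
               if t == tag and p is Polarity.POSITIVE]
        neg = [i for (i, t), p in self._state.items()
               if t == tag and p is Polarity.NEGATIVE]
        return pos, neg
```

Explicit negatives are what give the auto-tagger something to learn from besides "nobody tagged this yet".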
They used to call the semantic web that OWL is a part of “Web 3.0” which failed to make an impression or was overwritten with the “Web3” moniker for NFT grifts by exceptionally ignorant people.
I learned OWL the hard way. I had been involved with the semantic web for 10+ years on and off and didn’t meet anyone who knew how to do meaningful modeling with OWL until last year, and that even includes famous academics who’ve written books on it.
OWL and RDF interest me immensely, intellectually. I've never been positioned to use either one professionally, but it looks fascinating. Is there a shorter path to successful modeling than the hard way? Is there a good source on this?
If you are willing to eat the up-front cost of coordinating global resource identification— a daunting task make no mistake, you get non-trivial dataset integration almost for free. Imagine if concatenating two ginormous JSON documents describing different aspects of the same entity would amount to a useful merge into a single combined JSON. If you Need this with a big N, RDF has no alternative.
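The "merge by concatenation" idea can be illustrated without any RDF library, using plain Python sets of (subject, predicate, object) tuples. The `example.org` identifiers and both datasets are invented; a real triplestore does far more than this.

```python
BAND = "http://example.org/band/1"  # hypothetical shared global identifier

# Two hypothetical datasets describing different aspects of the same entity.
catalog = {
    (BAND, "http://example.org/prop/name", "Example Band"),
    (BAND, "http://example.org/prop/genre", "rock"),
}
bookings = {
    (BAND, "http://example.org/prop/playedAt", "http://example.org/venue/7"),
    (BAND, "http://example.org/prop/name", "Example Band"),
}

# Because both sides agree on identifiers, merging is just set union;
# the duplicated "name" statement collapses automatically.
merged = catalog | bookings

def describe(graph, subject):
    """Collect every (predicate, object) pair known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)

band_facts = describe(merged, BAND)
```

The hard part is the up-front agreement on identifiers, not the merge itself.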
The rise of SSDs has also more or less obviated the need for clustered indexes as a practical performance consideration. For the small price of trebling your storage footprint, commodity RDF triplestores will index _all_ your attributes/columns without a schema (usually red/black or equiv). Will it scan an integer PK over 100b records as fast as postgres? No. Is that use case in your hot path? Also no (most likely).
Edit: as for OWL, just take the plunge into rule-based inference directly: anything from forward-chaining inference (if you want performance and decidability guarantees) all the way up to full-blown Prolog or [miniKanren](http://minikanren.org/) (if you want it as a library in your runtime of choice).
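If forward chaining is unfamiliar, here is a deliberately tiny Python sketch of the idea: keep applying rules to the known facts until nothing new can be derived. The facts and rules are made up, and this is plain propositional chaining, not OWL semantics.

```python
# Hypothetical rules: (set of premises, conclusion).
rules = [
    ({"rock_band"}, "band"),
    ({"band", "from_england"}, "english_band"),
]

def forward_chain(facts, rules):
    """Apply rules repeatedly until a fixed point is reached."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"rock_band", "from_england"}, rules)
```

Since each pass either adds a fact or stops, and there are finitely many conclusions, the loop always terminates.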
Everywhere where you have a lot of stuff to manage (photos, music, videos, documents, links) hierarchies don't work and only tags can tame all the chaos.
The analogy to "path finding" doesn't hold, imho. That's not how our brains organize information! We organize memories by association and not by some hierarchical structures.
There have been many, MANY historical attempts to organize the world's knowledge hierarchically. They have all failed spectacularly to achieve their goals.
Some of the most common reasons:
- things exist in multiple categories that aren't in the same branch of the tree
- different state of mind during data retrieval means you expect the same item to be in different categories.
- different humans think the same thing belongs in different hierarchical locations
There's also been a LOT of scientific research around informational organization. It all came to the same conclusion. Hierarchies have interesting promises but fail when they meet the practical reality of the human brain.
In the end, hierarchical organization of knowledge is a terrible solution except in VERY restricted cases.
Do you have any suggestions of where to start reading on this? A seminal paper or cluster of papers? I want to deep dive on this not just to map out where it doesn't work but also to get a map of the restrictive cases where it does work.
edit: never mind, I just put your quote into GPT-4 and it passed me on to Eleanor Rosch, prototype theory and some other interesting works. I feel like this is my own modern lmgtfy moment.
Tags are great as an adjunct to a thoughtful folder hierarchy, IMHO.
Links are great as part of that too, they can provide shortcuts.
Real-world use: I am an artist, and I have found that the best way to organize my work is with a series of yearly directories. If I begin a large, multi-year project, it goes in a directory within the year I start it; I'll make a link to it that lives next to all the yearly directories.
I also use OSX's tags a ton. Files get marked as 'in progress', 'complete', 'paid for', 'commission', and 'experiment' (and a few other things). When I want to decide what to work on in any particular day it's super easy to open up the saved search for "everything in progress" that I keep on my desktop; this shows me everything in those yearly directories that's marked as 'in progress', whether it's personal work, client work, whether it's part of a large multi-file project with its own folder hierarchy or just a single file in the yearly directory. I also have a saved search for 'commission'+'in progress' for those days when I know I want to work on clearing the commission queue. And whenever I spend some time just fooling around with different effects to create interesting looks, I'll save my scribblings with the 'experiment' tag; when I decide to use it later I can easily tell Illustrator to open a file, and look through the 'experiment' tag to find the file full of some crazy procedural explorations, regardless of how long ago I did it. This habit has saved me hours of digging for that one file where I did that cool trick once.
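For anyone who wants the same "saved search" behaviour outside of Finder, the underlying query is simple. This Python sketch uses an in-memory dict instead of real file metadata; the paths and the `saved_search` helper are hypothetical.

```python
# Hypothetical files: path inside the yearly hierarchy -> set of tags.
files = {
    "2022/graphic-novel/books/3/cover.ai": {"in progress"},
    "2023/client-logo.ai": {"commission", "in progress"},
    "2023/sketch.ai": {"experiment"},
    "2024/portrait.ai": {"commission", "complete", "paid for"},
}

def saved_search(*required):
    """Return every path that carries all of the required tags."""
    wanted = set(required)
    return sorted(path for path, tags in files.items() if wanted <= tags)

in_progress = saved_search("in progress")
commission_queue = saved_search("commission", "in progress")
```

The search cuts across the folder tree: it doesn't care how deep a file sits or which year it lives in.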
Trying to organize all the files in my artwork directory with just tags would be a total fucking nightmare, the subdirectory for a multi-year graphic novel has its own folder hierarchy that's several levels deep, and when I know that what I want to work on today is "getting the prepress files together for book 3 of the graphic novel" it's definitely great to be able to just hit the top-level link to the graphic novel directory, then go into "books", then "3", and have its own little file hierarchy in there.
Tags by themselves are not very good for serious organization, but they can be very good for pulling things out of a hierarchical structure. They take work - I have to remember to mark a new file as 'in progress' and possibly a 'commission', though that's become routine, and changing something from 'in progress' to 'complete' is a pleasure. But it's work well worth doing to create a nice little network of shortcuts and secret passages through the terrain of your thoughtfully-laid-out tree of folders.
> You have a body that is adapted to the physical world - routing and navigation through a series of ordered steps is a VERY well developed human skill.
I find that this skill is better utilized with a system that has hyperlinks like Obsidian.
> To my surprise, we tend to think in hierarchical categories all the time. As I have written in my article on Logical Disjunct Categories Don't Work, the real world does not fit into disjunct categories.
> Therefore, we should embrace multi-classification more often. If you do want to learn more about the rationale, you may as well read the first chapters of my PhD thesis or the book "Everything is Miscellaneous" by David Weinberger, just to give you two resources of many.
> Long story short: tagging does take away the burden of finding one single spot in a strict hierarchy of entities which is actually a heavily intertwined network of concepts we do find in the real world. It's far from being a neat hierarchy. Everybody who tries to put "the world" into a strict hierarchy will fail.
The only reason we're even discussing the topic is because search is so poorly implemented in client operating systems. Tags suck, hierarchical structures suck, everything that isn't search sucks. Search still kind of sucks, but it sucks much more because the search available on your own computer for your own files is about thirty years behind the state of the art.
I've done both as well: tagging everything and then assigning the tags into exclusive hierarchical relationships (for discovery purposes and grouping). But it only works to subdivide within an existing noun like "talent" or "wood panels". Without a seed noun, tags start becoming too abstract and the objects with those tags start to lose all semantic cohesion.
I think once you start talking about unbounded universal tagging with hierarchies, they are not compatible, you need search and weighting or intelligent interfaces.
Search and LLMs really are major organizational improvements in our lifetime imo.
"Tags fucking suck."