As a person who uses LLMs daily, I do in fact do this. Couple problems with this approach:
- there are billions of people who are not accustomed to using software this way, who are in the expected target market for this software. Most people cannot tell you the major version number of their mobile OS.
- this approach requires each individual to routinely perform experiments with the expanding firmament of models and versions. This is obviously user-hostile.
Anyway, my hot take here is that making things easier for users is better. I understand that is controversial on this site.
Imagine if this is what people suggested when I asked what kind of screwdriver I should use for a given screw, because they're all labelled, like, "Phillips. Phillips 2.0. Phillips.2.second. Phillips.2.second.version 2.0. Phillips Head Screwdriver. Phillips.2.The.Second.Version. Phillips.2.the.second.Version 2.0"
To their credit, they did get this part correct. "ChatGPT" is the user-facing apps. The models have terrible names that do not include "ChatGPT".
Anthropic, by contrast, uses the same name for the user-facing app and the models. This is confusing, because the user-facing apps have capabilities not native to the models themselves.
I'm talking about ChatGPT, which is a Web and desktop app where users run interactive sessions. What does "production" mean in this sense?