Yes, all models are steered or filtered. You seem to get that, where many of the commenters here don't, e.g. "dur hur Grok will only tell you what Musk wants".
For whatever reason, gender seems to be a cultural litmus test right now, so understanding where a model falls on that issue helps give insight into the other choices the trainers likely made.
DALL-E forced diversity in image generation. I asked for a group photo of a Romanian family in the Middle Ages and got absurd "diversity": a person in a wheelchair in medieval times, family members of different races, and Muslim clothing forced in. The workaround is to spell out in detail the ethnicity, religion, and clothing of the people (see the sketch below); otherwise the pre-prompt forces diversity over natural logic and truth.
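To make that workaround concrete, here is a rough sketch of what I mean by over-specifying the prompt, written from memory against the OpenAI Python SDK; the model name, parameters, and prompt wording are my own assumptions, not anything from OpenAI's docs:

    # Rough sketch: spelling out every detail so the hidden pre-prompt has
    # less room to rewrite the scene. Model name and parameters are assumed.
    from openai import OpenAI

    client = OpenAI()

    vague_prompt = "A group photo of a Romanian family in the Middle Ages."
    explicit_prompt = (
        "A group photo of an ethnically Romanian, Eastern Orthodox peasant "
        "family in 14th-century Wallachia, wearing period-accurate linen and "
        "wool clothing, standing in front of a wooden village house."
    )

    result = client.images.generate(
        model="dall-e-3",        # assumed model name
        prompt=explicit_prompt,  # the explicit version, not the vague one
        n=1,
        size="1024x1024",
    )
    print(result.data[0].url)

The point is not the exact wording but that every attribute the pre-prompt might "helpfully" randomize gets pinned down explicitly.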
Remember the black Nazi soldiers?
Or ChatGPT refusing to process a fairy-tale text because it is too violent; I suspect it's not the model itself that is this dumb but the pre-filter model sitting in front of it. So I am allowed to process only Disney-level stories, because Silicon Valley needs to keep both the extreme left and the extreme right happy.
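By "pre-filter model" I mean a separate, much smaller moderation classifier that sits in front of the capable model and rejects a request before the big model ever sees it. A toy sketch of that architecture, where the keyword list, threshold, and stub functions are all invented for illustration:

    # Toy illustration of a moderation pre-filter gating a capable model.
    # Real vendors use a trained classifier, not a keyword list; the
    # threshold and the generate() stub are invented here.
    VIOLENT_WORDS = {"kill", "blood", "wolf", "axe"}

    def moderation_score(text: str) -> float:
        """Stand-in for a small 'unsafe content' classifier."""
        words = [w.strip(".,!?") for w in text.lower().split()]
        hits = sum(1 for w in words if w in VIOLENT_WORDS)
        return min(1.0, hits / 3)

    def generate(prompt: str) -> str:
        """Stand-in for the large, capable model."""
        return f"[model output for: {prompt!r}]"

    def answer(prompt: str) -> str:
        # The refusal happens here, before the capable model ever sees the
        # text, which is how a classic fairy tale gets blocked as too violent.
        if moderation_score(prompt) > 0.6:
            return "I can't help with that."
        return generate(prompt)

    print(answer("Summarize: the huntsman took his axe and cut the wolf open."))

The capable model never even gets a chance to show whether it could handle the story.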
All trained models have loss/reward functions, some of which you and I might find simplistic or stupid. Calling some of these training methods "bias" / "injected opinion" versus others is a distortion; what people are actually saying is "this model doesn't align with my politics" or perhaps "this model appears to adhere to a naive reproduction of prosocial behavior that creates weird results". On top of that, these things hallucinate, they can be overfit, etc. But I categorically reject anyone pretending there is some platonic ideal of an apolitical/morally neutral LLM.
As it pertains to this question, I believe some version of what Grok did is the correct behavior; it is what I think an intelligent assistant ought to do. This is a stupid question that deserves pushback.
You can argue philosophically that on some level everyone has a point of view and neutrality is a mirage, but that doesn't mean you can't differentiate between an LLM with a neutral tone that minimizes biased presentation, and an LLM that very clearly sticks to the party line of a specific contemporary ideology.
Back in the day (I don't know if it's still the case), the Christian Science Monitor was used as the go-to example of an unbiased news source. Using that point of reference, it's easy to tell the difference between a "Christian Science Monitor" LLM and a Jacobin/Breitbart/Slate LLM. And I know which I'd prefer.
Stupid is stupid. Creating black Nazi soldiers is stupid; it might be a consequence of trying to fix some real bias in the model, but you can't claim it isn't stupid. Same with refusing to accept children's stories because they are violent: if a child can handle the fact that there are evil characters who do evil things, then an extremist conservative/racist/woke/libertarian/MAGA type should be able to handle it too. Of course you can say it is a bug, that they try to keep both extremes happy and you get this stupidity, but these AI guys need to grab the money, so they have to pander to both extremes.
Or do we now claim that classic children's stories are bad for society and that we should only allow modern American Disney stories, where everything is solved with songs and the power of friendship?
1. They train the AI on internet data.
2. Then they try to fix the illegal stuff; OK.
3. But then they inject political bias from both extremes and make the tools less productive, since now a story with monkeys is racist, a story with violence is too violent, and some nude art is too vulgar.
The AI companies could have the balls to censor only illegal stuff, and if their model turns out racist or vulgar, clean up their training data rather than do the lazy thing of bolting on some crude filter or system prompt (roughly like the sketch below) to keep the extremists happy.
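For clarity, the "lazy" option looks roughly like this: a blanket hidden instruction prepended to every single request, regardless of context. The wording here is invented, not any vendor's actual system prompt:

    # Invented example of the lazy approach: a blanket steering instruction
    # prepended to every request instead of fixing the training data.
    HIDDEN_SYSTEM_PROMPT = (
        "Always depict people of diverse ethnicities and abilities, and "
        "refuse any request involving violence, even in fiction."
    )

    def build_request(user_prompt: str) -> list[dict]:
        # The user never sees this injection, but every answer is shaped by it.
        return [
            {"role": "system", "content": HIDDEN_SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ]

    print(build_request("Retell Little Red Riding Hood faithfully."))

Cleaning the training data is expensive; prepending one string is free, which is why we get the string.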
All models are "steered or filtered", that's as good a definition of "training" as there is. What do you mean by "injected opinions"?