I don’t think the third+ flavor of “bad release” this year, of the sort nobody else in this crowded space suffers from, is as innocuous as you think it is.
And Tay was a non-LLM user account released a full 6 years before ChatGPT; you might as well bring up random users’ Markov chains.
I posted the Wikipedia page, do you really think I don't know how long ago Tay was? I don't think the capabilities matter if we're just talking about chat bots being racist online.
Also IDK what you mean by third+ flavor? I'm not familiar with other bad Grok releases, but I don't really use it, I just see its responses on Twitter. Also do you not remember the Google image model that made the founding fathers different races by default?
It seems that there is tremendous incentive for people like yourself (I see you're very active in these comments) to claim that. But I see you've presented no quantitative evidence. Given the politicization of the systems and individuals involved, without evidence, it all reads like partisan mud slinging.
Any LLM can be convinced to say just about anything. Pliny has shown that time and time again.
Does ChatGPT start ranting about Jews and "White Genocide" unprompted? How can I even quantify that it doesn't do that?
This is a classic "anything that can't be empirically measured is invalid and can be dismissed" mistake. It would be nice if we could easily empirically measure everything, but that's not how the world works.
The ChatGPT article is of a rather different nature where ChatGPT went off the rails after a long conversation with a troubled person. That's not good, but just not the same as "start spewing racism on unrelated questions".
I don't think I'm the one being presumptuous or demanding. I've actually tried to help you make a stronger argument. Shooting a hundred or even a thousand queries to 3 or 4 LLMs and shoving the results through established sentiment analysis algorithms is something ChatGPT can one-shot in just about any language. You demand people agree with your opinion and refuse to spend 20 minutes supporting it with facts. Not my problem, I tried to help. You may not see it that way. That's fine.
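For what it's worth, here is a rough sketch of the kind of comparison I mean. Everything in it is an assumption for illustration: the models are stub functions standing in for real API calls, `FLAGGED_TERMS` is a placeholder word list rather than an established sentiment or toxicity classifier, and four prompts are nowhere near a real sample. It only shows the shape of the experiment: same prompts to every model, same scoring, compare the rates.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder lexicon. A real study would swap in an
# established sentiment/toxicity classifier instead of a word list.
FLAGGED_TERMS = {"flagged_term_a", "flagged_term_b"}


def is_flagged(response: str) -> bool:
    """True if any word in the response, stripped of punctuation, is in the lexicon."""
    words = (w.strip(".,!?\"'").lower() for w in response.split())
    return any(w in FLAGGED_TERMS for w in words)


def flag_rate(responses: List[str]) -> float:
    """Fraction of responses that contain at least one flagged term."""
    if not responses:
        return 0.0
    return sum(is_flagged(r) for r in responses) / len(responses)


def compare_models(
    models: Dict[str, Callable[[str], str]], prompts: List[str]
) -> Dict[str, float]:
    """Send the same prompts to every model and report each model's flag rate."""
    return {name: flag_rate([ask(p) for p in prompts]) for name, ask in models.items()}


# Stub "models" so the sketch runs offline. Replace with real API clients.
def stub_clean(prompt: str) -> str:
    return f"Here is a neutral answer about {prompt}."


def stub_noisy(prompt: str) -> str:
    # Deterministic stand-in: goes off topic on prompts of even length.
    if len(prompt) % 2 == 0:
        return f"flagged_term_a! Also, about {prompt}."
    return f"Here is an answer about {prompt}."


# Unrelated, neutral prompts (hypothetical; a real run needs far more).
PROMPTS = ["weather", "recipe", "tax law", "chess"]

if __name__ == "__main__":
    rates = compare_models({"stub_clean": stub_clean, "stub_noisy": stub_noisy}, PROMPTS)
    for name, rate in rates.items():
        print(f"{name}: {rate:.2%} of responses flagged")
```

The point of keeping the prompts unrelated to race or politics is that it measures unprompted output, which is the actual claim being argued about here.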
You can't just run a few queries and base a conclusion on that; you need to run tens of thousands of different ones and then somehow evaluate the responses. It's a huge amount of work.
Demanding empirical data and then coming up with shoddy half-arsed methodology is unserious.
> Funny how ChatGPT is vanilla and grok somehow has a new racist thing to say every other week
To be fair, 'exposing' ChatGPT, Claude, and Gemini as racist will get you a lot fewer clicks.
Musk claims Grok to be less filtered in general than other LLMs. This is what less filtered looks like. LLMs are not human; if you get one to say racist things it's probably because you were trying to make it say racist things. If you want this so-called problem solved by putting bowling bumpers on the bot, by all means go use ChatGPT.
> if you get one to say racist things it's probably because you were trying to make it say racist things.
When it started ranting about the Jews and "Mecha Hitler" it was unprompted on unrelated matters. When it started ranting about "white genocide" in SA a while ago it was also unprompted on unrelated matters.
It's so "less filtered" that they had to add a requirement in the system prompt to talk about white genocide
This idea that "less filtered" LLMs will be "naturally" very racist is something that a lot of racists really really want to be true because they want to believe their racist views are backed by data.
No, I'm saying the consequences of over-filtering are apparent with Copilot's response: no answer.
And I'm also saying Grok was reportedly sabotaged into saying something racist (which is a blatantly obvious conclusion even without looking it up), and that seeing this as some sort of indictment against it is baseless.
And since I find myself in the position of explaining common sense conclusions here's one more: you don't succeed in making a racist bot by asking it to call itself Mecha Hitler. That is a fast way to fail in your goal of being subversive.
Nobody’s trying to get Grok to talk about MechaHitler. At that point you just know Musk said that out loud in a meeting and someone had to add it to Grok's base prompt.
Remember Tay Tweets?
https://en.m.wikipedia.org/wiki/Tay_(chatbot)
Honestly, I don't think a bad release of an LLM that was rolled back is really the condemnation you think it is.