Glad to see more models made available. Sadly it has the same kind of licensing problems as Facebook's LLaMa [1]:
* "If you are commercially using the Materials, and your product or service has more than 100 million monthly active users, You shall request a license from Us."
* "You can not use the Materials or any output therefrom to improve any other large language model (excluding Tongyi Qianwen or derivative works thereof)."
So, thanks Facebook for setting a truly awful precedent that other corporate lawyers will now copy. We may not be entitled to your work, but your awful behaviour seems to have led Alibaba, too, to claim that their model is open rather than what it really is: available. I need to come up with a t-shirt design mocking the appropriation going on (well, outright violation when Facebook says that LLaMa is compatible with open science) so that I can wear it at upcoming conferences.
There's a new 7B version that was trained on more tokens, with longer context, and there's now a 14B version that competes with Llama 34B in some benchmarks.
We have such a weird fascination with Tiananmen, like it’s some sort of big gotcha. You can ask actual Chinese citizens and they know about it, and care about it about as much as an American citizen cares about the Kent State massacre.
The only reason we are fascinated with it is because it is a banned topic in chat apps, on public Chinese forums, and in language models. Yes, you can ask a Chinese person about it in a private setting; no, you can't talk about it with a large group. And no, the whataboutist Kent State comparison doesn't really hold, because the US government hasn't taken down the Wikipedia page on it.
The Chinese government makes the topic fascinating by banning it. Just like banned books in China get a sales boost at street side book sellers, or kids wear something their parents specifically tell them not to wear.
What’s the whataboutism re: Kent State? My point is the average Chinese citizen cares about Tiananmen about as much as the average American cares about Kent State, which is to say not at all.
I guess we can continue to be fascinated by it and Chinese citizens can continue to look at us weird when we speak about it like it’s the secret to immortality or something.
The average Chinese citizen doesn't care about Kent State because no one is trying to ban talking about it. People here aren't interested in 6/4/89 itself; they are interested in the CCP's banning of it. Anything you try to ban will stand out.
Having lived in Beijing for 9 years: Chinese people are mostly the same. They are interested in whatever is controversial in the States (if they are interested in the States at all, since we don't ban much) and ignore everything else. Yeah, you aren't going to score any points bringing up 6/4, even at PKU (but you can't go to PKU on 6/4, so don't worry about that), but you'll hear plenty of dirty laundry about your own country from them, so don't worry, the conversations will still be good.
Massively different, from the size of the protests (Beijing was not the only city experiencing student protests) to the scale of the carnage, to the all-out attempts to erase the memory of the event.
So for an LLM there's an interesting question about what comprises its training data. How massaged/edited/censored is the input, and in whose service is the cleansing being performed?
That's because Kent State massacre is not a politically incorrect topic in the West. What happens when you ask ChatGPT about a genuinely politically incorrect topic in the West?
So at best the complaint boils down to this: China has the wrong politically incorrect topics and we have the right politically incorrect topics. And of course, once the complaint is reduced to this form, its irrelevance to the topic at hand (LLMs) is evident.
GPT won't let you be a bigot, this model won't discuss the historical fact that thousands of people were murdered by the Chinese state. That's a complete false equivalence.
in the West, "politically incorrect" means that a topic is liable to offend people. examples include HIV rates (LGBT people), racial crime statistics (racial minorities), shari'a (Muslims), etc. in contrast, Tiananmen square is just embarrassing for the CCP. so it seems like you're making a false equivalence.
that seems unlikely. the Streisand effect suggests that censoring a topic sparks interest in it. like, the American public was so skeptical of Jeffrey Epstein's apparent suicide that "Epstein didn't kill himself" became a widespread meme, and that's without your Internet connection getting reset if you Google his name. does the Streisand effect just not function in mainland China?
To be fair, it's a stupid distraction from discussing the model. If every thread just turns into politics, it doesn't make for good discussion. People can start threads about the specific ideologies that language models have, and they can be discussed there (and have been). Bringing it up every time a model is discussed feels off topic and legit to flag (I didn't flag it).
Edit: but now I see the thread has basically been ruined and it's going to be about politics instead of anything new and interesting about the model, congrats everyone.
Alignment is also a distraction; it's OpenAI marketing and something people who don't understand ML talk about, not a serious topic.
Like I said, discussing model politics has a place, but bringing it up every time a model is mentioned is distracting and prevents adult discussion. It would be like if, every time a company came up, the thread got spammed with discussion of the worst thing that company has ever done instead of discussing it in context.
The condescension is unfounded and unnecessary. Discussion of the usefulness of a model or its interface also includes these topics. If the refusal to discuss the topic came from anything other than it simply being absent from training data, that’s highly interesting from multiple perspectives.
For example, ChatGPT’s practical usability is regularly hobbled by alignment concerns, notably more so in 3.5 than 4. It’s a worthy topic, not a distraction and characterizing it as something other than ‘adult discussion’ is nothing more than an egocentric encoding of your specific interests into a ranking of importance you impose on others. A little humility goes a long way.
We’re here to be curious and that includes addressing misconceptions, oversights and incorrect assumptions. That all still counts as adult discussion.
Imagine if a Chinese company releases a model that kicks the state of the art's ass and everyone starts using it because it works so well. Now the censorship has leaked into all the systems that use it.
Any model produced by a North American or European company, even X, may be trained in a way that is politically correct to the taste of that company (some lean left and some lean right), but the topics censored by such a model will still be far fewer than in a model created by a Chinese or Russian company. This is because, for a company to survive under a totalitarian government, it must bend to and satisfy every filtering request from that government.
China could learn from Western examples. If you ask "what happened in the Chagos islands?" of Google Bard, you get a decent summary, even though it's an ongoing situation.
Didn't the Chinese also have a model with >1 trillion parameters when GPT-3 came out? Didn't the Chinese also tweak the lmsys leaderboard to make their model appear on top? It's sad, but as with so many other things coming from China, one has to take their claims with a pinch of salt.
That would be something completely reasonable to hear in China. Your point? In countries far far away, people tend to abstract entities involved (the Americans did X, not Tesla did X).
I like how we stopped using Chinaman (like Frenchman and Englishman) to dehumanize them. The Geary Act of 1892 refers to “Chinese person or person of Chinese descent.” Now it’s in fashion again.
"Chinaman" strikes me as a term that's only offensive to non-Chinese.
Although 中国人 Zhōng guó rén = China person, Chinaman is close to literal.
The whole "person" instead of "man" is also a Western fad, since 他/她 tā = he/she is naturally ambiguous spoken, and 人 rén = person/people occurs naturally when gender is less relevant.
[1]: https://github.com/QwenLM/Qwen/blob/main/LICENSE