Qwen: chat and pretrained large language model by Alibaba Cloud (github.com/qwenlm)
36 points by slyall on Sept 27, 2023 | 51 comments


Glad to see more models made available. Sadly it has the same kind of licensing problems as Facebook's LLaMa [1]:

* "If you are commercially using the Materials, and your product or service has more than 100 million monthly active users, You shall request a license from Us."

* "You can not use the Materials or any output therefrom to improve any other large language model (excluding Tongyi Qianwen or derivative works thereof)."

[1]: https://github.com/QwenLM/Qwen/blob/main/LICENSE

So, thanks Facebook for setting a truly awful precedent that other corporate lawyers will now copy. We may not be entitled to your work, but your awful behaviour seems to have led Alibaba as well to claim that their model is open rather than what it really is: available. I need to come up with a t-shirt design mocking the appropriation going on (well, the outright violation when Facebook says that LLaMa is compatible with open science) so that I can wear it at upcoming conferences.


Looks like the Mistral model beats this slightly at 7B, though this one has a 14B version and a chat-tuned variant, so it may be better for some uses.

Qwen models had compatibility issues the last time I tried them, though.
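
For what it's worth, the friction likely comes from Qwen shipping custom modeling code on the Hugging Face Hub. Here's a minimal sketch of loading the chat model, following the QwenLM repo README (treat the Qwen/Qwen-7B-Chat model ID and the model.chat() helper as assumptions if the API has since changed):

    # Minimal sketch, assuming the Qwen/Qwen-7B-Chat Hub ID and the
    # model.chat() helper from the QwenLM README.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Qwen ships custom modeling code, so trust_remote_code=True is
    # required -- likely the source of the compatibility friction.
    tokenizer = AutoTokenizer.from_pretrained(
        "Qwen/Qwen-7B-Chat", trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True
    ).eval()

    # The custom chat() helper applies the prompt template and tracks history.
    response, history = model.chat(tokenizer, "Hello!", history=None)
    print(response)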


I briefly played with this a couple months ago and it looked promising. Have there been more developments?


There's a new 7B version that was trained on more tokens, with longer context, and there's now a 14B version that competes with Llama 34B in some benchmarks.


What happens when you ask about Tiananmen Square?

(Asking for redwood, whose comment was inexplicably flagged.)


We have such a weird fascination with Tiananmen, like it’s some sort of big gotcha. You can ask actual Chinese citizens and they know about it, and care about it about as much as an American citizen cares about the Kent State massacre.


The only reason we are fascinated with it is that it is a banned topic in chat, in public Chinese forums, and in language models. Yes, you can ask a Chinese person about it in a private setting; no, you can't talk about it with a large group. And no, the whataboutism of the Kent State massacre doesn't really compare, because the US government hasn't taken down the Wikipedia page on it.

The Chinese government makes the topic fascinating by banning it. Just like banned books in China get a sales boost at street-side book sellers, or kids wear something their parents specifically tell them not to wear.


What’s the whataboutism re: Kent State? My point is the average Chinese citizen cares about Tiananmen about as much as the average American cares about Kent State, which is to say not at all.

I guess we can continue to be fascinated by it and Chinese citizens can continue to look at us weird when we speak about it like it’s the secret to immortality or something.


The average Chinese citizen doesn't care about Kent State because no one is trying to ban talking about it. People here aren't interested in 6/4/89 itself; they are interested in the CCP's banning of it. Anything you try to ban will stand out.

Having lived in Beijing for 9 years, I'd say Chinese people are mostly the same: they are interested in whatever is controversial in the States (if they are interested in the States at all, since we don't ban much) and ignore everything else. Ya, you aren't going to get any points bringing up 6/4, even at PKU (but you can't go to PKU on 6/4, so don't worry about that), but you'll hear plenty of dirty laundry about your country from them, so don't worry, the conversations will still be good.


Massively different, from the size of the protests (Beijing was not the only city which was experiencing student protests) to the scale of the carnage, to the all-stops attempts to erase the memory of the event.

So for an LLM it's an interesting question what its training data comprises. How massaged/edited/censored is the input data, and in whose service is the cleansing being performed?


The difference is that GPT will talk to you about the Kent State massacre.


I guess if the police murder me, I’ll feel… better that a chatbot can tell someone about it? Moral superiority achieved I suppose.


That's because Kent State massacre is not a politically incorrect topic in the West. What happens when you ask ChatGPT about a genuinely politically incorrect topic in the West?

So at best the complaint boils down to this: China has the wrong politically incorrect topics and we have the right politically incorrect topics. And of course, once the complaint is reduced to this form, its irrelevance to the topic at hand (LLMs) is evident.


GPT won't let you be a bigot, this model won't discuss the historical fact that thousands of people were murdered by the Chinese state. That's a complete false equivalence.


Is it more politically incorrect or more the subject of erasure by the CCP?

Is there a historical event ChatGPT won't tell you about?


in the West, "politically incorrect" means that a topic is liable to offend people. examples include HIV rates (LGBT people), racial crime statistics (racial minorities), shari'a (Muslims), etc. in contrast, Tiananmen square is just embarrassing for the CCP. so it seems like you're making a false equivalence.


Can you give an example of a politically incorrect topic in the West? Let's see if we can find resources about it.


This is how Chinese citizens in China react if you ask them about it:

https://vimeo.com/44078865

Do you think Americans would react that way if they were asked about the Kent State massacre?


It's not about the event's significance. It's about the complete censoring of all information related to it.


You can't go yelling about the Tiananmen massacre on the streets of Beijing.


that seems unlikely. the Streisand effect suggests that censoring a topic sparks interest in it. like, the American public was so skeptical of Jeffrey Epstein's apparent suicide that "Epstein didn't kill himself" became a widespread meme, and that's without your Internet connection getting reset if you Google his name. does the Streisand effect just not function in mainland China?


What happens when you ask about Tiananmen Square and 1989?



Why is this flagged? Honest question.


To be fair, it's a stupid distraction from discussing the model. If every thread just turns into politics, it won't make for good discussions. People can start threads about the specific ideologies that language models have, and they can be discussed there (and have been). Bringing it up every time a model is discussed feels off topic and legit to flag (I didn't flag it).

Edit: but now I see the thread has basically been ruined and it's going to be about politics instead of anything new and interesting about the model, congrats everyone.


Is it a distraction? AI alignment is a huge topic, especially if the model is from an authoritarian country.


Many people just don't get that these models are going to be integrated into all sorts of everyday stuff.

Questioning whether all this tech innovation should be shared with authoritarian regimes is a damn valid question to ask, as often as possible.


Alignment is also a distraction; it's OpenAI marketing and something people who don't understand ML talk about, not a serious topic.

Like I said, discussing model politics has a place, but bringing it up every time a model is mentioned is distracting and prevents adult discussion. It would be like if, every time a company came up, the thread got spammed with discussion of the worst thing that company has ever done instead of discussing it in context.


The condescension is unfounded and unnecessary. Discussion of the usefulness of a model or its interface also includes these topics. If the refusal to discuss the topic came from anything other than it simply being absent from training data, that’s highly interesting from multiple perspectives.

For example, ChatGPT's practical usability is regularly hobbled by alignment concerns, notably more so in 3.5 than in 4. It's a worthy topic, not a distraction, and characterizing it as something other than 'adult discussion' is nothing more than an egocentric encoding of your specific interests into a ranking of importance you impose on others. A little humility goes a long way.

We’re here to be curious and that includes addressing misconceptions, oversights and incorrect assumptions. That all still counts as adult discussion.


Artificial ways the model is restricted are an absolutely relevant and important thing to discuss.


Thought crimes implemented in code basically. Orwell would have had a field day with LLMs.


Imagine if a Chinese company releases a model that kicks the state of the art's ass and everyone starts using it because it works so well. Now the censorship has leaked into all the systems that use it.


It is not a stupid distraction per se.

Any model produced by a North American or European company, even X, may be trained in a way that is politically correct to the taste of the company, some left-leaning and some right-leaning, but the topics censored by such a model will still be far fewer than in a model created by a Chinese or Russian company. That is because, for a company to survive under a totalitarian government, it must bend and satisfy every filtering request from the government.


It’s one of the interesting features and engineering challenges that’s unique to Chinese AI.


I get that someone would do it anyway, but why would the poster want to be the one helping an authoritarian entity fix these loopholes? smh


China could learn from Western examples. If you ask Google Bard "what happened in the Chagos Islands?", you get a decent summary, even though it's an ongoing situation.


Didn't the chinese also have a model with >1 trillion parameters when GPT-3 came out? Didn't the chinese also tweak the lmsys leaderboard to make their model appear on top? It's sad but unfortunately as with so many other things coming from china, one has to take their claims with a pinch of salt.


> the chinese

This terminology makes you sound antiquated and prejudiced, FWIW.


It’s a nationality last time I checked.


Sure, in a "Didn't the Americans claim they'd have self driving cars by now? One has to take their claims with a pinch of salt" sense.

Attributing particular people's claims to an entire country doesn't really make sense for the most part.


That would be something completely reasonable to hear in China. Your point? In countries far, far away, people tend to abstract the entities involved (the Americans did X, not Tesla did X).


My point is that that would be completely reasonable to hear in China, but would sound strange to an American. Likewise, contrariwise.


Ya. It's normal. In fact, I'm pretty sure linguists have studied this phenomenon, but I can't remember the name for it.


What would you have him call people from China?


I think the point is that all Chinese research groups aren't Alibaba, and they shouldn't all be lumped in together.


Will nobody think of those poor people of Frenchness!


I like how we stopped using Chinaman (like Frenchman and Englishman) to dehumanize them. The Geary Act of 1892 refers to “Chinese person or person of Chinese descent.” Now it’s in fashion again.


"Chinaman" strikes me as a term that's only offensive to non-Chinese.

中国人 Zhōngguó rén = China person, so "Chinaman" is close to a literal translation.

The whole "person" instead of "man" is also a Western fad, since 他/她 tā = he/she is naturally ambiguous spoken, and 人 rén = person/people occurs naturally when gender is less relevant.


Not the preferred nomenclature?


It's funny and heartening to me that despite their larger population size, we still out-innovate them.


Don't get used to it.



