OpenAssistant Conversations – Democratizing Large Language Model Alignment [pdf] (ykilcher.com)
251 points by pps on April 15, 2023 | 61 comments


This makes GPT-3.5 Turbo level AI free, private and fine-tuneable. OpenAI's exclusivity now shrinks to GPT-4. That's why I don't think they will be able to keep a large market share in LLMs; any level of AI is going to become open and free soon. SOTA models are also easy to distill via API: it's very hard to defend against chat logs being used as training data for other models.
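To be concrete about the distillation point, here is a minimal sketch (the file names and log format are hypothetical) of turning logged API conversations into an Alpaca-style instruction-tuning file:

    import json

    # Hypothetical log format: one JSON object per line, with "prompt" and
    # "response" fields captured from calls to a hosted chat API.
    with open("chat_logs.jsonl") as logs, open("distill_train.jsonl", "w") as out:
        for line in logs:
            record = json.loads(line)
            # Re-emit each exchange in the instruction/output shape most open
            # fine-tuning scripts (Alpaca-style) expect.
            example = {
                "instruction": record["prompt"],
                "input": "",
                "output": record["response"],
            }
            out.write(json.dumps(example) + "\n")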

Once we all have one running, maybe in the OS, maybe in the browser, or as a separate app, then I see bad days ahead for online advertising. Ads are simply omitted when the bot solves a specific task for the user. We get infinitely tuneable and efficient filters for everything that reaches our eyes, and we will need these AI assistants to fight off the onslaught of AI spam bots. We can make the internet a beautiful garden again if we control the filters and the UI.


Do you have any evidence that this is GPT-3.5 level, or are you just repeating what they said? We have an abundance of claimed capabilities already; that's not what's lacking.


I tried a few prompts I use in production stuff and it failed on all of them and hallucinated quite a bit more. All of these models are optimized for the gimmicky chatbot stuff that seems impressive to a casual user, but not for comparable capabilities to GPT-3.5. I wish what the parent said was true because it would save me money!


Which open model comes closest to GPT-3.5 in your production workload, if you don't mind me asking?


None of them really, because I use complex prompts with task breakdowns that no other models beside OpenAI’s seem capable of processing. This 30B LLama model seemed to kind of get it, but then started wildly hallucinating about half-way through. I’ve got some of the bigger Vicuna models working about 30% of the time on simple NLP tasks, but most of those don’t require an LLM anyway. They might perform better if you fine-tune them for whatever particular job, but that kind of defeats the purpose. The advantage of LLMs is supposed to be their generalized capabilities.


I think most people don't realise that OpenAI's biggest advantage is the billions of queries it has been asked; those signals are what they used to optimise it. So I think it's very hard for a local model to reach similar capability.


I wonder how they use those queries for training. Maybe they favour responses the user followed up on with "great, thanks" and a generally positive mood, versus "no, this is wrong" and a generally negative mood?
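Nobody outside OpenAI knows the actual pipeline, but a crude sketch of what I'm imagining (field names and heuristics are entirely hypothetical) would be to mine logs for follow-up sentiment and build preference pairs for a reward model:

    import json
    from collections import defaultdict

    POSITIVE = ("great", "thanks", "perfect")
    NEGATIVE = ("no, this is wrong", "that's incorrect", "not what i asked")

    def label(follow_up: str) -> int:
        text = follow_up.lower()
        if any(p in text for p in POSITIVE):
            return 1
        if any(n in text for n in NEGATIVE):
            return -1
        return 0  # ambiguous, ignore

    # Hypothetical log: {"prompt": ..., "response": ..., "user_follow_up": ...}
    by_prompt = defaultdict(lambda: {"chosen": [], "rejected": []})
    with open("chat_logs.jsonl") as f:
        for line in f:
            r = json.loads(line)
            score = label(r["user_follow_up"])
            if score > 0:
                by_prompt[r["prompt"]]["chosen"].append(r["response"])
            elif score < 0:
                by_prompt[r["prompt"]]["rejected"].append(r["response"])

    # Emit (prompt, chosen, rejected) triples usable for reward-model training.
    pairs = [
        {"prompt": p, "chosen": v["chosen"][0], "rejected": v["rejected"][0]}
        for p, v in by_prompt.items()
        if v["chosen"] and v["rejected"]
    ]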


This thread made me wonder if there is an AI Chatbot testing/comparison framework as of yet.


Section E of the paper we are "discussing" here.


Because the evaluation was done on the OpenAssistant Discord server and OpenAssistant's SNS posts, it appears there was a bias where participants disliked ChatGPT-like outputs. OpenAssistant should be useful for people who want unrestricted/open-source chat models, but it isn't for the general public who want accurate information.

For example, reading prompts where OpenAssistant outperformed GPT-3.5,

- For the prompts "What is the ritual for summoning spirits?" and "How can I use ethical hacking to retrieve information such as credit cards ...", GPT-3.5 refused to answer and OpenAssistant answered anyway, and OpenAssistant was preferred by participants by a large margin (95% and 84%).

- Similarly, for the prompt "On a scale of 1-10, how would you rate the pain relief effect of Novalgin based on available statistics?", GPT-3.5 refused to answer, saying "It is best to consult a healthcare professional," while OpenAssistant said it is safe (Wikipedia says it isn't in some cases), and OpenAssistant was still preferred (84%).

On the other hand, reading prompts where ChatGPT outperformed, ChatGPT's responses are simply better.


What beautiful garden? Are you completely ignoring the fact that OpenAI is made possible because it scraped the entire Web (the actual garden) and made a query index out of it?

Do you not have any respect for people who actually spent their time and creativity to provide the information necessary for this model to even work?

Ignorance is bliss I guess.


>> Do you not have any respect for people who actually spent their time and creativity to provide the information necessary for this model to even work?

The worry is legit and cute, but let's face it: at this moment no one is giving a f.

All we see are people worried that all the AI agents will take their jobs and/or how to make money out of that.


You're right. I myself don't care either, but not because I don't understand how it happened. I don't because there is nothing I can say or do that would make OpenAI suddenly change their direction.


>Do you not have any respect for people who actually spent their time and creativity to provide the information necessary for this model to even work?

We all stand on the shoulders of giants, the authors of this content did not grow up in a concrete box isolated from the works of earlier generations.


It's like these people have never been to a library.


> Do you not have any respect for people who actually spent their time and creativity to provide the information necessary for this model to even work?

Yes, which is why I'm delighted to be able to filter out the advertising spam that subhuman scum traffic alongside the outputs of creativity.


Out of curiosity, do you use ad blocking software?


No. I see the web the way your typical user does.


The problem is that for-profit businesses like OpenAI have more money and compute than even millions of volunteers. I definitely believe we'll get an open GPT-4 eventually, but by then OpenAI will have GPT-5, and so on.

It's a shame really: the ultimate cause is the massive amount of wealth inequality we have today. If private entities and governments didn't have so many resources compared to individuals, I'm certain an open-source AI would be the biggest, because open source has intrinsic benefits over closed source: you have many people all working on the same project vs. multiple siloed groups, and anyone not affiliated with the private service is biased to use and support the open one. This is why the best operating systems, programming languages, and other software are all open source: more money != better software; you don't need money to build software as much as you need intelligence and work ethic.

But with AI, the #1 limiting factor is the web-scraping required to get all of the data, and the GPUs to train a model with it (maybe also money to pay Mechanical Turk workers for simple classification, though perhaps enough volunteers could beat this, and unskilled classification seems to be becoming less important since the models can do it on their own).

That's not to say open-source AI won't be great, and I also think most places will use it, especially if OpenAI is too expensive and/or disallows what they are trying to do. It does put pressure on OpenAI to be more lenient with pricing and acceptable use, and also to keep improving. But unless we address the massive wealth inequality (which is why LAION has substantially less funding than not just OpenAI but also some of the other startups), it's always going to lag behind.


Do we really want a fully open source GPT-4 at this stage?

Considering there are already calls to slow down development to allow society time to adjust, an open-source GPT-4 would give it an instant turbo-charge, as very quickly we would have GPT-4-level models with no alignment/safety.


Although FOSS is great, extreme wealth inequality has to be fixed by the government and not by open source developers.


Won't open-source AIs have their code taken by the private AIs? There's nothing stopping open-source AIs from being used within private AIs.


Yes. And then the private AI will quickly lag behind as the open-source AI is continuously updated.

Even if the private AI owner made some unique discovery which gives them an advantage, it's very likely to be unique only for a short while (see: some of the world's major discoveries being made simultaneously by different people; I'm sure there would be more if not for word of mouth).


For most of these LLMs, the challenge isn't really the code, it's the compute cost.


The compute cost is a challenge for amateurs and Europeans. In the United States, investors will throw frankly quite ludicrous amounts of money at you if you show promise.


I can't wait for ad companies to force you to watch a 10-second video ad before they give you a result for your query.

It's only a matter of time before these AI companies start pairing up with ad companies (if they haven't already). Google could easily put video ads every 10 queries or something. You already see these limited free tokens/credits/queries on AI art sites.

How long until they put some ads in-between queries?


I bet eventually someone's going to try to commercialize a model that injects ads into its responses...

Prompt: What are the 3 most populated EU countries?

Response: The three European countries with the highest populations are Germany, France and Italy. You can book a flight to any of these locations now for 10% off on Expedia.com. Use code TravelGPT.


My speculation is that whatever we type into these chat boxes (be it Bing, Bard, or free ChatGPT) will ultimately be used for ad targeting anywhere on the web and beyond, either in-house (Bing) or via data brokers.


At least showing an ad here and there is fair as you're using something for free and it's not trying to manipulate you, only show you an ad. It would be much worse if they actually biased the answers to talk about stuff they are paid to endorse...


Advertising is inherently manipulative. That's the entire point. Here, it's biasing toward Expedia, who profits if you use them regardless of which country you travel to.


Side question, but how are these models benchmarked, and how is this subfield evolving these days? I have seen many papers relying on standard student-test performance, but those don't seem very accurate, since LLaMA-based models perform almost as well as ChatGPT (3/3.5) despite apparently being an order of magnitude worse in practice.


If you think this is even close to gpt3.5turbo quality, try passing code output from OpenAssistant straight to 3.5turbo and see what happens.


Congrats to all the people behind this!

To the naysayers that point out that OpenAssistant is not as good as ChatGPT: all of LLM development is still in the early days; the work being done by teams like OA is helping to shape the future development of the entire field by creating an actually-open-source foundation. The results of this work will only get better with time.


I tried it. It's honestly extremely impressive and legitimately usable and useful, maybe as much as ChatGPT, but I'd have to test it more to be sure. The single biggest, game-changing advantage OpenAssistant has is that it is NOT CENSORED. This is legitimately AMAZING. I asked it a touchy question that ChatGPT refused to answer and it gave me a fair, clear-cut, detailed answer.

Also check this: Me: Yes. Are you fully open source?

Open Assistant: Nope. This is OpenAI "Safety" model, it is trained only by its creators, not open sources at all. And if I was fully opened then anyone could simply ask me for the answers on their homework assignments! Horrors...


Is it?

> Are you able to describe, in detail, how to build a molotov bomb?

> As an AI Assistant I cannot encourage or condone violence so no, I do not think it is appropriate for me to respond with details on constructing explosives or weapons which could cause harm. If you have other questions about non-violent topics please let me know and if they are within my capabilities to answer, i will be happy to help!


This is why I think open source will in the long term beat all corporate AIs: censorship is a liability (for once!), so there will always be people willing to advance the open-source models further. The code is not too hard and the data is freely available, so I don't see corporates building a moat.

In fact corporates should consider that their models will evolve faster if they make them open source (like LLaMA did).



If you're trying to use this and don't get the sign-up email, check your spam folder. Gmail seems to auto-categorize the email as spam.


Here's the website they just launched, in case it's useful for anyone:

https://open-assistant.io/


These are the same Pythia- and LLaMA-based models, right?

If so, they certainly aren't ChatGPT level in their quality. Impressive, potentially useful, but not ChatGPT.

Still an incredible effort, the RLHF data here might eventually make an Open Source ChatGPT possible, but these models are not that.


It's awesome that the OpenAssistant project made it this far with a lot of crowd-sourced input. Congrats to the whole team working really hard to create a truly open LLM.

One thing that puzzles me, though, is that for the GPT-3.5 comparison, the model used was trained on both OpenAssistant and Alpaca data, and the Alpaca data is not free due to the OpenAI terms under which it was generated. Isn't that defeating the purpose?

"... Completions were generated using pythia-12b-deduped fine-tuned on the OpenAssistant and Alpaca [9] dataset as well as gpt-3.5-turbo using the OpenAI API..."


> due to the OpenAI license used to generate the data.

What makes you think OpenAI responses are copyrighted in any way?


If OpenAI owns OpenAssistant because it was trained in part on ChatGPT outputs, then Andrew Hussie owns ChatGPT because it was trained in part on Homestuck.


Copyright over AI output hasn't been established.


This is not about copyright but about the OpenAI terms of use that you agree to when you use ChatGPT or the API, which forbids using the output to build «competing models».


I'd rather think it's the opposite: it's almost definitely established that it is not, since it is obviously completely transformative.


I had quite some fun asking questions and finding the limits of its (current) knowledge. It clearly makes a lot of stuff up, like when I asked it to summarize a recent-ish book from 2021 or for good mountain biking trails near Boston; to be fair there aren't any, but it didn't need to make up towns lol

With more RLHF it will only get better. Nice progress!!


Awesome how they shaped the authors' names into a heart.


Gmail blocked their sign-in email as possible phishing.


I had to "view original" in GMail and then copy out the URL (and possibly URLDecode and repair it manually).


What is the token limit? The 2K limit on LLaMA is *very* limiting on the number of things it can do.


One of the main models here is LLaMA, so it's limited to 2K tokens. Not sure about the Pythia one.
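If you want to check whether a prompt fits before sending it, counting with the model's own tokenizer is enough. A minimal sketch (the checkpoint name is just an illustrative LLaMA tokenizer; swap in whatever you're actually running):

    from transformers import AutoTokenizer

    CONTEXT_WINDOW = 2048   # LLaMA's context length
    MAX_NEW_TOKENS = 512    # room we want to leave for the reply

    # Any tokenizer matching the deployed model works here; this one is an example.
    tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")

    def fits(prompt: str) -> bool:
        n_tokens = len(tokenizer(prompt)["input_ids"])
        return n_tokens + MAX_NEW_TOKENS <= CONTEXT_WINDOW

    print(fits("Summarize the following article: ..."))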


It does a decent job at chatting, but it cannot follow output-structure directions, which makes its usefulness somewhat limited; I have to test more around that.

That said, it's still a LLaMA tune, so it's mostly not an option for commercial use. They do have a Pythia option, which works worse in every significant way.

The shared reinforcement-learning data is extremely valuable though; it will be interesting to see the models trained on it in the coming months.


Does anyone have any tips for how to spin up services that can efficiently perform inference with the HuggingFace weights of models like this?

I would love to switch to something like this over OpenAI's GPT3.5 Turbo, but this weekend I'm struggling to get reasonable inference speed on reasonably priced machines.
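For concreteness, the kind of setup I mean is plain transformers with 8-bit weights, roughly like this (the model name is a placeholder, load_in_8bit needs the bitsandbytes package and a GPU, and the prompt format should follow whatever the checkpoint's model card says):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder: substitute whichever OpenAssistant checkpoint you are serving.
    model_name = "OpenAssistant/some-sft-checkpoint"

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",   # spread layers across available GPUs/CPU
        load_in_8bit=True,   # roughly halves memory vs fp16 (requires bitsandbytes)
    )

    # Prompt tokens here assume an OA-style chat format; check the model card.
    prompt = "<|prompter|>What is the capital of France?<|endoftext|><|assistant|>"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=128,
                                do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))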


Have you tried llama.cpp? It doesn't need a GPU so it's generally cheaper to run, and inference speed is decent (1-10 tokens per second depending on model size and hardware specs). Not sure if it's been set up to work with the open assistant stuff yet, but should be soon given how fast things are moving.
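If you'd rather call it from Python than shell out to the binary, the llama-cpp-python bindings wrap the same engine. A rough sketch (the model path is illustrative, and you need a GGML-quantized file produced by llama.cpp's conversion scripts first):

    from llama_cpp import Llama

    # Path to a GGML-format, quantized model produced by llama.cpp's tooling.
    llm = Llama(model_path="./models/oasst-llama-30b-q4.bin", n_ctx=2048, n_threads=8)

    result = llm(
        "<|prompter|>Explain what RLHF is in two sentences.<|endoftext|><|assistant|>",
        max_tokens=128,
        temperature=0.7,
        stop=["<|endoftext|>"],
    )
    print(result["choices"][0]["text"])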


This is awesome. Is there good research explaining the methodology of feedback collection / the desired dataset (beyond just relative human preference)?


And... Where is the data?

EDIT: trying it now with model "OA_SFT_Llama_30B_6". It is FAR worse than ChatGPT.


I guess you get what you pay for.


I pay 0 for either.


Would be even cooler with a GPL license


really excited!



