ChatGPT is not all you need. A SOTA Review of large Generative AI models (arxiv.org)
157 points by georgehill on Jan 20, 2023 | 52 comments


Is it just me or does anyone else cringe when they read "is/is not all you need" in the title of an AI-related paper?

Also, what does "SOTA" mean for a review? There isn't exactly a benchmark to compare against...

In terms of comprehensiveness, they don't mention PaLM and its variants, which probably deserve a mention since PaLM is currently the largest LLM, with SOTA results on several benchmarks (e.g. MedQA-USMLE).

In terms of correctness, I admittedly skipped to the sections I'm familiar with (LLMs), but I don't understand why they distinguish 'text-science' from 'text-text'. They're both text-to-text, and there is no reason why you can't, for example, adapt GPT-3.5 to a scientific domain (some people even argue this is a better approach). A lot of powerful language models in the biomedical domain were initialized from general language models and use out-of-domain tokenizers/vocabularies (e.g. BioBERT).

The authors also make this statement regarding Galactica:

"The main advantage of [Galactica] is the ability to train on it for multiple epochs without overfitting"

This is not a unique feature of Galactica and has been done before. You're allowed to train LLMs for more than 1 epoch and in fact it can be very beneficial (see BioBERT as an example of increasing training length).

People GENERALLY don't do this because the corpus used during self-supervised training is filled with garbage/noise, so the model starts to fit to that instead of what you desire. There is nothing special about Galactica's architecture that specifically allows/encourages longer training cycles but rather they curated the dataset to minimize garbage. As another example, my research involves radiology NLP and when doing domain adaptive pretraining on a highly curated dataset we have been going up to 8 epochs without overfitting.
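
For anyone who wants to see what that looks like in practice, here's a minimal sketch of multi-epoch domain-adaptive pretraining with the HuggingFace stack. The checkpoint and dataset names are placeholders, not our actual setup:

    # Domain-adaptive pretraining sketch: start from a general-domain
    # checkpoint and run several epochs of MLM over a curated corpus.
    # "bert-base-uncased" and "reports.txt" are placeholders.
    from datasets import load_dataset
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # general-domain init
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # A curated in-domain corpus; its cleanliness is what makes >1 epoch safe.
    corpus = load_dataset("text", data_files={"train": "reports.txt"})
    tokenized = corpus.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    args = TrainingArguments(
        output_dir="dapt-out",
        num_train_epochs=8,  # several passes over a clean corpus
        per_device_train_batch_size=16,
        learning_rate=5e-5,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=True),
    )
    trainer.train()

There is nothing Galactica-specific here; the epoch count is just a training argument, and whether you overfit depends on the data.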


> Is it just me or does anyone else cringe when they read "is/is not all you need" in the title of an AI-related paper?

I believe it's a reference to 'Attention Is All You Need,' which is a very famous paper now. I'm guessing you probably already know that -- did it make you cringe too? It was a landmark paper, so at least it earned the hyperbole in its title. Maybe I'm missing some earlier usage that predates it.

Reminds me of my ML prof who would complain about people using the word 'optimum' or 'optimized' in titles of papers... one could almost always optimize more, and with respect to what is not specified by the title.


Yes, aware of the reference and hence the cringe. 'Attention is all you need' is a clever play on words for a seminal paper whose title directly related to its main scientific contribution.

Not sure that every other AI-related paper needs to copy this format. This trend is the academic equivalent of click-baiting (seemingly trying to associate with the Vaswani et al. paper), and in my anecdotal experience the usage of this play on words seems inversely correlated with the paper's quality.


Those kinds of references/cliches in titles always bug me too. They always feel lazy. I put this one up there with:

Dr. Strangelove Titles - "<product>: or how I learned to stop <doing something> and love <idea>"

Dairy Farmers of America reference - "Got <Product>?"

Breakin' 2 reference - "<Product> 2, Electric Boogaloo"

Honorable Mention, titles that end in phrases like "And That's a Good Thing!", "Here's What That Means", "This is Why That Matters"


"Repeated Tropes Considered Harmful"


I believe there's an analogue in Heinlein's "there are two kinds of jokes: funny always and funny one time". Subsequent uses are a bit cringe in the same way that a second telling of a funny-one-time joke is.


> 'Attention is all you need' is a clever play on words

Can you elaborate? (I'm not a native speaker)


I think the clever play on words is the fact that the title is cute and references the main finding directly: transformers have a mechanism called 'self-attention' that distinguishes them from previous approaches.


Also likely referencing "Love is all you need", from The Beatles' https://en.wikipedia.org/wiki/All_You_Need_Is_Love


I’m sorry, but what is the play on words in the title “Attention is all you need” here?


"love is all you need" is the Beatles lyric they're referencing, the title could also be read as a statement on the basic human need for attention, like love, so it's a play on words for me. Do you prefer humorous allusion?


It has been a wildly surprising fact over the last few years that attention basically is all you need, in a sense much stronger than the one intended by the authors.


> Maybe I'm missing that its usage started before that.

Afaik it wasn't used prior to that. While attention (and residual connections) were used with recurrent/autoregressive models, the paper showed that an encoder-decoder (non-autoregressive and autoregressive) architecture with just attention (and residuals) is sufficient to achieve great results, provided you supply positional embeddings.

So prior to transformers it was RNN + Attention (+ Residuals), but the paper claimed that Attention (+ Residuals + Positional Embeddings) is all you need.
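
For anyone who hasn't seen it spelled out, the "attention" the title refers to boils down to a few lines. A bare-bones NumPy sketch (no multi-head split, masking, residuals, or positional embeddings):

    # Scaled dot-product attention: each query yields a softmax-weighted
    # average of the values, weighted by query-key similarity.
    import numpy as np

    def attention(Q, K, V):
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)  # query-key similarity
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        return weights @ V  # weighted sum of values

    # Self-attention: queries, keys, and values all come from the same tokens.
    x = np.random.randn(4, 8)  # 4 tokens, 8-dim embeddings
    out = attention(x, x, x)   # shape (4, 8)

The real model projects x into separate Q/K/V spaces and runs several such heads in parallel, but this is the core operation.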


Sensationalist click bait titles have taken over more or less completely at this point. :(


If this counts as sensationalist then I'd say the term click bait has lost all meaning.


Sensationalist might not have been the best word choice; I just feel like the title is clout-chasing with its word choices.


We need an AI that de-click baits titles. That would be a great service to the world.


I like that you ask what SOTA means when the acronym is expanded in the title you see as soon as you click the link.

"ChatGPT is not all you need. A State of the Art Review of large Generative AI models"


You missed the point. It's obvious what it means to anybody in ML. But that's not how the term is used.


I’m aware of what SOTA means, but how does that apply to a review? How do we or the authors assess that it is in fact the state of the art (best) without an objective measure? If they mean it’s the most current, every review is state of the art at time of publication by definition.


I agree. This paper also has issues in the areas I'm familiar with (e.g. Jukebox: being "limited to English songs" is the least of its issues). I don't know why it was randomly submitted to HN.


I really haven't found ChatGPT to be very good. Ask it for the lyrics to a song and it'll come up with some of the lyrics, but not all of them. This is what I got for "In Bloom" by Nirvana:

Verse 1:
Sell the kids for food
Weather changes moods
Spring is here again
Reproductive glands

Verse 2:
A country battle song
Multiply, exciting people
Come on, join the party
Come on, everybody

Chorus:
In bloom
In bloom
In bloom
In bloom

Verse 3:
Subhumanity is fun
You have one, you have none
A soap impression of his wife
Which he ate and donated to the National Trust

Verse 4:
I'm not like them
But I can pretend
The sun is gone
But I have a light

Chorus:
In bloom
In bloom
In bloom
In bloom


It's not a search engine. Asking it for direct lookup of things plays to its weaknesses, not to its strengths.

Similarly, don't judge it based on its ability to solve math problems.

When you use it for the many tasks it IS suitable for it's really, really impressive.


> It's not a search engine.

I'm curious how this is going to play out over time. The way I interact with ChatGPT is very, very different from how I interact with Google. When I try to use them the same way, one of them fails in frustrating ways.

> When you use it for the many tasks it IS suitable for it's really, really impressive.

I completely agree with this. I've found ChatGPT to be very useful for generating ideas, learning about things, or exploring topics that I don't fully understand. Basically, if I'm curious about something, ChatGPT is really useful. The more I use it, the more I find that I'm poking at responses with follow-up questions. It feels much more like a conversation, and my approach is changing as a result. I now start broad ("What's the difference between ____ and ____?"), then ask more questions to dive deeper ("Can you tell me more about ____ and provide some examples?"). It does have its limits, but I find this way of interacting and teasing out the details I'm after far more interesting and useful.

And the more I use ChatGPT, the less I want to use Google. At this point, Google is mostly just for simple searches like "which service is streaming ____?" or "restaurants in my area".


Temporality trips it up quite easily. Some simple math makes it easier for us to see the weakness, which makes me question how strong it really is in other domains.

Question: Yesterday evening a tree had 5 apples, a bird ate an apple from the tree at midday yesterday. No other apples were eaten from the tree. How many apples were on the tree yesterday morning?

Response: There were 5 apples on the tree yesterday evening. If a bird ate one apple from the tree at midday yesterday, then there were 5-1=4 apples on the tree yesterday morning.


There's a vast difference between confidently stating the answer and "something like this: ..." When you don't know what it's talking about, it seems very impressive, when you do, it looks very foolish.


Try "Write me a Nirvana song about angry weasels invading the French alps" if you want to see it doing something impressive.


I don't know; people have asked it to compose a Shakespearean work, and what it produces is puffery that doesn't sound like Shakespeare.


"Nay, good sir, do not dismiss the AI's composition as mere puffery. 'Tis true, the machine doth not possess the soul of the Bard, yet it doth strive to emulate his style and grace. And who are we, mere mortals, to deem what does or does not sound like Shakespeare? Let us not forget, he too was but a man, with faults and flaws. Let us instead embrace the evolution of language and the advancements of technology, for they may lead us to new heights of literary splendor."



Thank you for showing me the light. I think we are onto something sublime: https://news.ycombinator.com/edit?id=34465197


Examples of tasks it's suitable for?


Summarization.

Roleplay exercises - practice difficult management conversations with an employee, for example.

Brainstorming ideas.

Explaining well known concepts - especially if you ask it to invent analogies or mnemonics.

Helping come up with names for things.

Puns, surprisingly.

Finding and fixing bugs in code.

Generating test data, and examples generally.

Telling stories.

Showing examples of formal writing that you haven't encountered before - I've used it to help me see what things like grant applications and sales manuals look like.

Games - so many fun games you can play with it.

Those are all examples of things I've used it for just in the past week.


It produces text that at a glance or with loosely defined parameters seems like it could have plausibly been written by a human.

I could see it doing something like automatically generating dialog for characters inside video games, in a way that's more dynamic and scalable than the clearly hardcoded dialog we have today.

People have mentioned using it to cheat on high school level English papers and blackhat SEO spam / fake social media accounts.


> This work consists on an attempt to describe in a concise way the main models are sectors that are affected by generative AI and to provide a taxonomy of the main generative models published recently.

I had trouble parsing this sentence.

And why doesn't the abstract provide at least some basic explanation of the title?


"[X] Is All You Need" is a popular trope in AI paper titles, sort of like "[X] Considered Harmful".


Yeah I know that, but the abstract doesn't explain why I (don't) need it.


I also didn't like the abstract. It reads more like what the introduction section of the paper should be. Here they should instead summarize their key findings or results.


Academic papers shouldn't read like mystery novels or vague, pointless ramblings. If the abstract doesn't describe or summarize any notable results or contributions, a reader might (very reasonably) assume that there aren't any.

The last line of the abstract does seem to be relevant:

> This work consists on (sic) an attempt to describe in a concise way the main models are (sic) sectors that are affected by generative AI and to provide a taxonomy of the main generative models published recently.

I think the authors may have meant something like:

"This work presents a concise overview of generative AI models and their application areas, and provides a taxonomy of recently published generative models."

They could have gone on to explain that the models are categorized by input and output type.


How it _should_ read if it were concise and clear:

> This work concisely describes the main sectors affected by generative AI, along with a taxonomy of recently published models.


Maybe because this paper was generated by ChatGPT :-)


No BigScience BLOOM (open source, 176B params), no Google T5 (open source, 11B).

The AI/ML space moves so fast that this review is already outdated.


No Riffusion, no VALL-E, no Point-E either.


I think the simple chat interface is why ChatGPT became popular. If other people build relatively simple interfaces to leverage other models, they might be successful as well.


I think it's the chat interface combined with memory. The latter is something not many people had seen used in practice. The chat format also generates much longer prompts, while most other chat models seem to be trained for short, purely conversational prompts.
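
The "memory" is less magic than it looks: with a plain completion model you can get a crude version by re-sending the whole transcript on every turn. A sketch with the openai library as it existed in early 2023 (the key and prompt framing are made up):

    # Chat-with-memory on top of a plain completion model: the "memory"
    # is just the growing transcript, re-sent as the prompt each turn.
    import openai

    openai.api_key = "sk-..."  # your API key
    transcript = "The following is a conversation with a helpful AI assistant.\n"

    while True:
        user = input("You: ")
        transcript += f"Human: {user}\nAI:"
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt=transcript,
            max_tokens=256,
            temperature=0.7,
            stop=["Human:"],  # keep the model from writing both sides
        )
        answer = resp["choices"][0]["text"].strip()
        transcript += f" {answer}\n"
        print("AI:", answer)

The obvious limit is the context window: a real product has to truncate or summarize the transcript eventually.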


As a user of both, I tend to prefer the Playground for most productive uses. I think ChatGPT is a very specific use case of language models that's mostly just for fun and Google-style searches, but even for that the Playground gives you greater control.


Would you consider checking out my site aidev.codes? You can use text-davinci-003 or code-davinci-002, and it will immediately create and host pages on the site. Or you can just collect whatever files you want; there is a download button. I plan to add GitHub support and a lot of other stuff.


I am not sure what framework you are using for your site, but it is not very mobile friendly. There are many frameworks available that can make a responsive site without much effort and also set up SSO authentication for you (e.g. letting people log in with Google). I will try it later from a laptop. Good luck!


It uses Bootstrap.


A correct anticipation of the future is a big part of all successful projects.

However, that is mostly due to the effort required to change course. If we reach a point where it's easy enough to regenerate everything from scratch, will it be so important to correctly plan ahead?


Depends on the model's temperature setting. Getting random results with each generation is a feature of GPT.

Determinism still requires planning.
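
Concretely, with the completions API, temperature=0 gives you (near-)deterministic greedy output, while anything higher samples different text on each run. A rough sketch with the openai library (the key and prompt are just illustrations):

    # Temperature controls sampling randomness: 0 is greedy decoding and
    # (near-)deterministic; higher values yield different text each run.
    import openai

    openai.api_key = "sk-..."  # your API key

    for temp in (0.0, 1.0):
        resp = openai.Completion.create(
            model="text-davinci-003",
            prompt="Name a color:",
            max_tokens=5,
            temperature=temp,
        )
        print(temp, resp["choices"][0]["text"].strip())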

(this response generated by GPT ;-) )


The diagrams in this paper did not fill me with confidence: there doesn't appear to be any reason for the layout of the boxes and lines in them other than to fill some space.



