Has Llama-3 just killed proprietary AI models? (kadoa.com)
103 points by hubraumhugo on April 21, 2024 | hide | past | favorite | 70 comments


> Meta released Llama-3 only three days ago, and it already feels like the inflection point when open source models finally closed the gap with proprietary models. The benchmarks show that Llama-3 70B matches GPT-4 and Claude Opus in most tasks, and the even more powerful Llama-3 400B+ model is still training.

I'm all for open models, but where do the benchmarks show that?

The official page https://llama.meta.com/llama3/ does not show any comparisons with GPT-4 or Claude Opus

Looking at https://arena.lmsys.org/, Llama-3-70b-Instruct is ranked #5 while current GPT-4 models and Claude Opus are still tied at #1. Meanwhile, Llama-3-8b-Instruct is ranked #14

Would love to be corrected, but either way an article should include sources for these types of claims.


Some Llama3 400B numbers for the April 15th checkpoint are listed at https://ai.meta.com/blog/meta-llama-3/

Direct image link: https://scontent-atl3-2.xx.fbcdn.net/v/t39.2365-6/439015366_...


It's above Opus in second position in the "English" only category. It probably suffers in the overall score due to poor multilingual ability (afaik 95% of its training data was English only).

Though the usual caveat about small sample sizes applies; as of now the CI is fairly wide. It's also not at the level of those two in the "Code" category; I hope Meta will give the CodeLlama variant an update again.


How significant is the difference in Elo scores for the end user? Should I use Claude Opus if I already have GPT4, despite its similar score?

Additionally, how much better is Claude for coding?
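For a rough sense of scale: under the standard Elo model, a rating gap of D points maps to an expected win rate of 1/(1 + 10^(-D/400)), so the small gaps between the top arena models translate to near-coin-flip preferences. A quick sketch (the ratings here are made up for illustration, not actual arena numbers):

```python
def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Expected probability that model A is preferred over model B
    in a head-to-head vote, under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A 20-point gap is barely distinguishable from a coin flip;
# a 100-point gap means a clear but far from unanimous preference.
print(round(elo_win_prob(1260, 1240), 3))  # 0.529
print(round(elo_win_prob(1300, 1200), 3))  # 0.64
```

So for an end user, a few Elo points between GPT-4 and Opus says very little; trying both on your own tasks is more informative than the leaderboard gap.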


I use Poe.com, which gives me access to every model available for $20. When a new model comes out it is quickly added to the available list. So, I've been doing some comparison against Claude & GPT4. I still think GPT4 is better at following instructions and giving me the result I'm after.


Came here to say this. Thought I missed something!

400B does look set to meet GPT-4, which will be exciting, but it's not finished yet.


OpenAI is also a moving target and Llama3 is probably due to get a few months of parity before GPT5 wallops it.

To me, it looks like OpenAI is ahead of the competition by 12-18 months, which isn't nothing, but the competition is certainly nipping at their heels.


At the risk of this comment aging poorly, I am not convinced there is a magic GPT-5 around the corner, waiting to wallop anybody. I have a suspicion that we’re into diminishing returns with model scaling. All of the recent flagship releases (Gemini Ultra, Claude Opus, GPT-4 updates) have only advanced the frontier a little. Could OpenAI train something another 1-2 orders of magnitude bigger? Perhaps, but then it wouldn’t be economical to deploy it.


Sora is a thing. It literally walloped everything else in the field, to the point that the second-closest video generation model looked like a broken, nonfunctional mess compared to it.

GPT5 might not be the thing we are all assuming it would be. It might not just be a chatbot but an actual AI that can "see" and "talk". By that point, I don't care if performance-wise it isn't a big improvement over GPT4 (though it should be, as we know these models improve significantly when trained with another modality of data, e.g. text+image was much better than text only).


Not ahead of Anthropic. Anthropic is trading blows with the best GPT-4 models these days.

For the open weight/source community I'd put it at 3-6 months behind and catching up. Llama 3 405B might catch the community up unless OpenAI has another LLM up their sleeve that makes a significant jump forward, but we haven't seen evidence of that yet.


Anthropic's Claude 3 came out 12 months after GPT4.


GPT-4 has been updated multiple times since then. Anthropic has been trading blows with those updates to GPT-4.


I thought OpenAI said they weren't training GPT-5 and were doing more, smaller multimodal models


I wonder if that was a case of being literal as in, technically speaking gpt-5 isn’t being _trained_, while omitting that they are working on collecting and preparing the data.


They have recently confirmed that they were working on GPT5, without confirming or denying that they were training it.

I guess it isn't unthinkable that GPT5 is more than a year out given the gap between GPT3 and GPT4 was 3 years. At the same time, the rumourmongers at businessinsider said their sources are expecting a Summer release, which seems plausible. Nobody really knows when it will release besides Sam Altman though.


> Nobody really knows when it will release besides Sam Altman though.

The day after some model truly beats GPT-4, plus or minus 1-2 weeks, is my guess.


what they say and what they do are different things


Yeah I came to post the same thing. It's nothing like gpt4, just great for the size.


If they did or didn't is up for debate, but I'm pretty sure scorched earth was their goal from the start (specifically targeting the giants, Google and OpenAI). They don't want to be #3 in this race.


Every big new tech category in recent years has quickly matured into a duopoly of a closed, proprietary, premium option and a mass market, public, "open" one. OpenAI, Anthropic, Google, Amazon, Microsoft and a ton of other players are battling for the #1 option, but Meta has cleverly theorized that #2 is going to be easier and a better fit for them.


I think it's not just big categories of software, although I'm sure there are niches with no open-source option.


There was a clear leader 18 months ago. Then there was a lot of catchup by everyone.

It’s no longer “hard” to build a product like ChatGPT + GPT 3.5, and that includes creating the model. There are a few trade secrets, but beyond that it seems like not much is protecting an ~8 month moat.


It's the smartest possible play.

The leader always wants to be a monopoly. The distant runners up can catch up the most ground by playing nice, being open source, and working with other companies to erode the market share that the leader wanted to lock away.

I'm so glad to see this happening. I'm terrified of a single company winning all of AI. It's starting to look like this won't happen and that OpenAI is simply stretched too thin.

If this pattern holds, OpenAI may wind up as a footnote. Their inability to open up and work with others means that competitors and would-be collaborators will choose the open alternatives.

Meta can win that mindshare and be the friendly facilitator and rails that an entire ecosystem of business is built upon. OpenAI will never be that. They're not "open" enough.


Open models weren’t Meta’s original plan. The LLaMA 1 model was only available to Meta-approved researchers until someone leaked the model weights on 4chan in 2023. Meta issued DMCA takedown requests to HuggingFace and GitHub.


I was hoping for more detail in the post, but it really just seems like a quick and easy way to attract eyeballs. Never heard of the platform, and I can't help but think that the logo is a rip of the old Okta brand.


Perhaps for now, but I wouldn't count on there always being a Meta spending $$$ to train enormous models and then giving them away for free. What's the long-term game plan for open-source models when the corporate charity inevitably dries up?


If your product is an AI model (OpenAI, Anthropic, etc) you can't give it away for free.

If your product is a social graph w/ ads (Meta), you can.

It's hardly corporate charity:

* Meta releasing these models creates an improvement and tuning ecosystem around it, giving them access to tons of free developer time.

* It's also a strong recruiting tool, for engineers and researchers frustrated by, e.g., Google and OpenAI becoming increasingly closed. They know they can publish at Meta.

* The cost is insignificant. Meta had 30B in revenue just in Q2 2023.


It's great PR. I hear people refer to Mark Zuckerberg as a "real engineer" and other such platitudes and it's doing wonders to reduce the stigma around Meta amongst the tech savvy minority.

Despite Meta being very similar to Google in terms of incentives, and Google nowadays being decidedly uncool. Doesn't hurt that Mark shipped something worth a damn whereas Google has been floundering for ages.


With credit to Google, they were basically neck-and-neck with Facebook AI Research in a lot of ways. Both were publishing influential text models (BERT vs fastText), both were maintaining SOTA deep learning frameworks (TensorFlow vs PyTorch), and both were investing heavily into researching the field further. I'd even argue that Google was the largest contributor to making open-source AI more like Linux and less like a shitty proprietary product.

There's a whole history of recent machine-learning development where both Google and Facebook have worked together and against each other to push things forward. I think it's entirely mistaken to characterize Google as the understudy when in many ways it's the other way around.


I mean I have been watching what Google has been doing in AI with wonder for ages and do agree that they seem to be rather underrated merely because they've recently been caught on the back foot.

At the end of the day though I use Llama and I use GPT4 and I don't use Bard. Google has an amazing legacy around AI but it really hasn't been performing in the last couple of years. I can imagine they'll have a comeback, but one does wonder if Google has lost their mojo.


How is it that Zuckerberg is suddenly an "engineer"? He's a CEO, but just because you run a business, does not mean you actually do any of the underlying technical work. Are they blind to the emperor's clothing?


He's obviously not closing Jira tickets at Meta, but that doesn't mean he's not an engineer. As an example of the positive impact that the Llama releases have had, this post from him has been doing the rounds lately in response to criticism like yours:

https://www.facebook.com/notes/775294156352065/


Commoditize your complement and all that.


Building open models is a very strong approach to cornering the market on top tier AI researchers. And as other commenters have mentioned, the raw models are not the product - the vast majority of the value is in how they are integrated into useful products.


Exactly, a free standalone AI is an unsustainable product.

You can't offer it for free, unless you make money in another way.


The corporate charity will not dry up. AI makes it easier to generate content, and Meta's in the business of facilitating the sharing of that content. Content is surface area for ads. AI will also make the virtual realities of the "metaverse", as defined by Mark, easier to reify. It's also a giant marketing and recruiting strategy.


AI does make it easier to generate content, but the type of content it lends itself towards the most is spam. Whether a Facebook where the majority of the content is neural net slurry is something that people will want to engage with once they realise what's going on is an open question I think.

Anecdotally it seems like older demographics are the prime target of the current wave of AI engagement farming on Facebook, because they just don't understand that this technology exists now and assume that all of the "photos" they're seeing are real.


> What’s the long-term game plan for open-source models when the corporate charity inevitably dries up?

Once open models reach and stay at near parity for a while, it’ll make sense for commercial downstream users to support open source community efforts rather than building their own, same as has happened in many other categories of key infrastructure software.


Unless Meta’s bet is that, going forward, models themselves won’t be the competitive differentiator, and that it will be about integration. They can give away Llama 3, 4, 5 for free, because no one else can put them in WhatsApp or whatever.

idk


I guess many here can think up a lot of 5D chess business strategies. For me it's just Zuck trying to reach greatness; his name will probably pop up in history when people think about who brought LLMs/AI/AGI(?) to the masses.


Maybe they are commoditizing their complement here.


Maybe, but at some point vague strategic moves like that will need to actually justify themselves at the earnings call. Throwing endless millions of dollars down the drain to do what, depreciate the value of another company's product, one that doesn't even directly compete with Meta's main breadwinners? What do they actually get out of this?

I suppose if there's any consolation for the open-source AI community, it's that Meta has demonstrated a willingness to burn a lot of money for unclear benefit: they're still single-handedly keeping the VR industry on life support at the expense of about $4 billion per quarter. A decade on from acquiring Oculus, they're no closer to making it profitable; if Meta's AI efforts get the same treatment, then the free models will probably keep coming for a while.


Zuckerberg's reasoning on AI/ML seemed very justifiable to me in the Dwarkesh podcast

IIRC,

- Having a better model is a competitive advantage in fighting spam

- Better models enable Facebook itself to understand their code vulnerabilities, improve employee productivity, etc.

- Being at the frontier of open source keeps them at an advantage in terms of updates from the community


Also Meta is in the business of sharing content. Instagram, Facebook, Threads, and WhatsApp are all about sharing content.

They’re highly incentivized to expand the options for generating content. OpenAI and Anthropic have to make money from the generation, which raises the barrier to creation. Meta can say fuck it, let everyone create, and we'll sell ads against what they create.


They spent much more on the metaverse.


8k context is tiny compared to what’s out there. They promise much larger contexts, but until then Llama-3 can’t even reliably summarize every web page out there.


There are techniques to extend the context window via fine tuning.

The authors claim this method was used to extend Llama 2 to 128k: https://github.com/jquesnelle/yarn
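For intuition, the simplest of these tricks (which methods like YaRN refine) is position interpolation: rescale token positions so a longer sequence fits inside the positional range the model saw during training, then fine-tune briefly. A toy sketch of the idea, not the actual YaRN implementation:

```python
def interpolated_positions(seq_len: int, trained_len: int) -> list[float]:
    """Linearly squeeze positions 0..seq_len-1 into the trained range,
    so RoPE never sees a position beyond what it was trained on."""
    scale = max(1.0, seq_len / trained_len)
    return [p / scale for p in range(seq_len)]

# An 8k-trained model addressing a 32k sequence: every position is
# divided by 4, landing back inside the familiar 0..8191 window.
positions = interpolated_positions(32768, 8192)
assert max(positions) < 8192
```

YaRN improves on plain interpolation by treating different RoPE frequency bands differently, which is why it reportedly needs far less fine-tuning to work well.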


It hasn't. My guess is that 90% of the folks using proprietary models through a chat interface have no need or desire to run their own models locally, nor do they have the hardware. For those of us who do wish to run our own models and have the hardware, I would reckon maybe 20% of us using proprietary models will drop them. The latest models released are very good, but I'm not convinced they are GPT/Claude quality yet.


After Apple’s blocking of ad tracking, Meta was worth $500B. After they announced they are all in on AI and want to win there too, they are now worth $1.2T.

So even if they are losing a bit on infra and training investment, Meta is relevant again. They are cool again.

Meta made a huge comeback.


Llama-3 is semi-proprietary though. It's definitely not open source!!

Has anything changed in the last 9 months: https://news.ycombinator.com/item?id=36815255 ? Is there better access to anything more than weights? Can we now train new models using llama? Are we no longer as restricted in use?

Haven't seen any comments questioning the premise here. It seems pretty constrained what we can do and how built-atoppable Llama is. I like the idea of it as a safeguard against control by some giant, but I'm not sure if it's a big enough grant of rights to be something we can build atop.


Llama 3 is just as restricted as Llama 2. The licenses and acceptable use policies are almost identical; the only real change consists of added attribution requirements. Products that use Llama 3 have to "prominently display 'Built with Meta Llama 3' on a related website, user interface, blogpost, about page, or product documentation", and models based on Llama 3 have to "include 'Llama 3' at the beginning of any such AI model name."


Commoditization of the AGI models is inevitable. OpenAI is building the next generation of compound AI systems — an LLM OS, if you will. It does more than just predict the next token. It figures out the right function and service to call when needed. That’s where the arms race has moved to. Meta isn’t going to build a free LLM OS. The comparison is apples to oranges.


GPT4 was released a year ago, and trained two years ago. I wouldn’t sleep on OpenAI, especially given how confident Sam Altman sounded on the Lex Fridman podcast.

The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.


> I wouldn’t sleep on OpenAI, especially given how confident Sam Altman sounded on the Lex Fridman podcast.

How confident Sam Altman sounds doesn’t figure much into my assessment of reality, other than the reality of Sam Altman’s promotional skills.

> The real race is to AGI anyways, as whoever gets there first will immediately capture 100% of the market.

AGI has no actual objective definition, and nothing supports this beyond naked conjecture and quasi-religious dogma.


I'm just reluctant to jump to conclusions until I see OpenAI's next hand.

True on AGI, but I guess I meant a “self-improving” AI? Whether that exists or is a pipe dream remains to be seen, but it seems like the goal of most of these companies.


You would be wrong to accept this hype anyways.

Llama 3 is not beating GPT4 in benchmarks I've seen, and it's not beating it on LLM Arena. That's all that really matters. It needs to beat GPT4 in the benchmarks and leader boards, or it's a nothingburger, as far as OpenAI's dominance goes.


This article will age poorly when GPT-5 is released. /s There will always be a market for whichever proprietary model is best, though it’s unclear whether that will be OpenAI’s.


GPT-4 and Opus are better for complex/precision tasks, and Haiku is cheaper for everything else.

Good 7B/8B models are still really useful but let’s not be hyperbolic.


Wouldn't say killed, but for certain use cases it has definitely provided a strong alternative.

The 8B model seems particularly good at summarization tasks.


The brilliance of Meta's strategy here is: if they offer a (F?)OSS model at near-parity with the leaders, they commoditize the product of their would-be competitors. Meta doesn't have to make money on API calls, but they could face an existential risk if someone else built the everyday AI companion of the future (e.g. users on the ChatGPT UI, Microsoft's much-advertised "copilot for everyday").

So -- a defensive play with some positive externalities (e.g. developer ecosystem mindshare + roadmap control, ability to use within their own products at cost, without giving up margin to suppliers).


It can also power features in Instagram, etc.


As long as businesses have proprietary processes, there will be proprietary models. An open-source AI cannot know everything.


I tested Llama-3-70b-8192 on Groq against ChatGPT 4, and while Groq ran it super fast, it hallucinated one answer, and didn’t get the logic correct on another question.

So, ChatGPT 4 is still more reliable for my use case. But if I were to want an LLM to process data, summarize, and so forth, Llama-3 on Groq is very fast.
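Comparisons like this are easy to script, since Groq serves Llama-3 behind an OpenAI-compatible chat completions API. A minimal sketch; the endpoint URL and model name reflect Groq's docs at the time and should be verified before use:

```python
import json
import urllib.request

# Assumed from Groq's documentation; confirm before depending on it.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(prompt: str, api_key: str,
                  model: str = "llama3-70b-8192") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for Groq."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # keep outputs stable for side-by-side comparison
    }
    return urllib.request.Request(
        GROQ_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

req = build_request("Do you know anything about Intel Hala Point?",
                    api_key="YOUR_KEY")
# resp = json.load(urllib.request.urlopen(req))  # uncomment with a real key
```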

Questions:

Do you know anything about Intel Hala Point?

Groq: bullshit, but admitted it when I called it out. ChatGPT: did a Bing search (it knew what it didn’t know).

Question 2a (separate chat): If you’re in Canada, what’s the best way to use a TFSA?

2b: Okay, if your portfolio has some tech stocks, some cash cows, and some government bonds, which should be allocated to the TFSA?

The reason I chose Question 2 is that most banks are happy to recommend bad products if it benefits them. Llama-3’s answer reflects the bank bullshit. ChatGPT 4 gives the advice your trustworthy and financially savvy friend would give you.

Follow-on questions for Llama-3:

2c: You have it backwards.

2d: Why did you get it backwards? Were you influenced by the glut of “advice” proffered by banks?


So they are doing the "bait and switch" like Google did for market share and in the end Facebook controls the "best" model? This sounds horrible.


«Any headline that ends in a question mark can be answered by the word no.»

— Betteridge's law of headlines (https://w.wiki/3b$V)


Betteridge's corollary:

As an online discussion about a headline that ends in a question mark grows longer, the probability of a citation of Betteridge's law approaches 1.


In the same domain as Godwin's law

"As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches 1."


No


Something something law of headlines... No. As much as I'd like for this to be the case, it's not.

I think proprietary models will gradually become less popular, though, due to their lack of consistency and control.


Not yet, but I imagine at some point it will. Open source, much like the oft maligned Thanos, is inevitable.



