Hacker News

I wonder what about this one gets the +0.5 to the name. IIRC the 2.0 model isn’t particularly old yet. Is it purely marketing, does it represent new model structure, iteratively more training data over the base 2.0, new serving infrastructure, etc?

I’ve always found the use of the *.5 naming kinda silly ever since it became a thing. When OpenAI released 3.5, they said they already had 4 underway at the time; they were just tweaking 3 to be better for ChatGPT. It felt like a scrappy startup name, and now it’s spread across the industry. Anthropic naming their models Sonnet 3, 3.5, 3.5 (new), 3.7 felt like the worst offender of this naming scheme.

I’m a much bigger fan of semver (not skipping to .5 though), date based (“Gemini Pro 2025”), or number + meaningful letter (eg 4o - “Omni”) for model names.



I would consider this a case of "expectation management"-based versioning. This is a release designed to keep Gemini in the news cycle, but it isn't a significant enough improvement to justify calling it Gemini 3.0.


I think it's reasonable. The development process is just not really comparable to other software engineering: It's fairly clear that currently nobody really has a good grasp on what a model will be while they are being trained. But they do have expectations. So you do the training, and then you assign the increment to align the two.


I figured you don't update the major unless you significantly change the... algorithm, for lack of a better word. At least I assume something major changed between how they trained ChatGPT 3 vs GPT 4, other than amount of data. But maybe I'm wrong.


The number is purely for marketing.

If you could get much better performance without changing the algorithm (eg just by scaling), you'd still bump the number.


Funnily enough, from early indications (user feedback) this new model would've been worthy of the 3.0 moniker, despite what the benchmarks say.


I think it's because of the big jump in coding benchmarks. 74% on aider is just much, much better than before and worthy of a .5 upgrade.


At least for OpenAI, a .5 increment indicates a 10x increase in training compute. This so far seems to track for 3.5, 4, 4.5.
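The comment above implies a simple scaling rule (a community guess, not anything OpenAI has stated officially): each +0.5 in the version number corresponds to roughly 10x training compute. A quick sketch of the arithmetic:

```python
# Hypothetical rule from the parent comment: each +0.5 version
# increment ~ 10x training compute. Purely illustrative.

def relative_compute(v_from: float, v_to: float) -> float:
    # Number of half-point increments between the two versions
    increments = (v_to - v_from) / 0.5
    # Each increment multiplies compute by ~10
    return 10 ** increments

print(relative_compute(3.5, 4.0))  # 10.0  (one .5 step ~ 10x)
print(relative_compute(3.5, 4.5))  # 100.0 (two steps ~ 100x)
```

Under that reading, 3.5 -> 4 -> 4.5 would each be an order-of-magnitude jump in compute, which is why the commenter says it "tracks."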


It may indicate a Tick-Tock [1] process.

[1] https://en.wikipedia.org/wiki/Tick%E2%80%93tock_model


The Elo jump and big benchmark gains could be justification.


Agreed, can't everyone just use semantic versioning, with 0.1 increments for regular updates?


Regarding semantic versioning: what would constitute a breaking change?

I think it makes sense to increase the major / minor numbers based on the importance of the release, but this is not semver.


As I see it, if it uses a similar training approach and is expected to be better in every regard, then it's a minor release. Whereas when they have a new approach and there might be some tradeoffs (e.g. longer runtime), it should be a major change. Or if it is very significantly different, it should be considered an entirely differently named model.


Or drop the pretense of version numbers entirely, since they're meaningless here, and go back to classics like Gemini Experience, Gemini: Millennium Edition, or Gemini New Technology.


Would be confusing for non-tech people once you did x.9 -> x.10
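The x.9 -> x.10 confusion above is the classic pitfall of reading version strings as decimals: "2.10" looks smaller than "2.9" numerically, while component-wise (semver-style) comparison orders it correctly. A minimal sketch, using hypothetical version strings:

```python
# Naive reading: "2.10" is the decimal 2.1, which sorts *below* 2.9
def numeric_order(version: str) -> float:
    return float(version)

# Semver-style reading: split on dots, compare integer components, so 10 > 9
def semver_order(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

versions = ["2.9", "2.10"]

print(sorted(versions, key=numeric_order))  # ['2.10', '2.9'] -- misordered
print(sorted(versions, key=semver_order))   # ['2.9', '2.10'] -- correct
```

This is exactly why non-tech users (and plenty of sorting code) trip over the .9 -> .10 transition.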


What would a major version bump look like for an llm?


Going from English to Chinese, I guess? Because that would not be a compatible version for most previous users.



