What they really destroyed was the idea that OpenAI would be able to charge $200/month for their ChatGPT Pro subscription, which includes o1. That was always ridiculous IMO. The Free tier and $20/month Plus tier, along with their API business (minus any future plan to charge a ridiculous amount for API access to o1), will be fine.
> The Free tier and $20/month Plus tier along with their API business (minus any future plan to charge a ridiculous amount for API access to o1) will be fine.
Actually, no! If we take their paper at face value, the crucial innovations that make a strong model efficient are their much-reduced KV cache (multi-head latent attention) and their MoE approach:
- Where a standard model needs to store two large vectors (key and value) for each token at inference time, and load/store those over and over from memory, DeepSeek V3/R1 stores only one smaller vector c, a "compression" from which the large k, v vectors can be decoded on the fly (see the first sketch after this list).
- They use a fairly standard Mixture-of-Experts (MoE) approach, which works well in training thanks to their tricks, but its inference-time advantage is the same one every MoE technique gets immediately: of the ~85% of the 600B+ parameters that sit inside the MoE layers, each token's inference step uses only a small fraction. This cuts FLOPs and memory I/O by a large factor compared with a so-called dense model, where all weights are used for every token (cf. Llama 3 405B). The second sketch after this list illustrates the routing.
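For intuition, here is a minimal sketch of the compressed-KV idea: cache one small latent per token and rebuild K/V on demand. The dimensions, weight names, and the absence of RoPE handling (the real MLA carries a separate decoupled rotary key) are all simplifying assumptions, not DeepSeek's exact layout.

```python
# Minimal sketch of latent-KV caching (not DeepSeek's exact MLA; shapes and
# the missing RoPE/decoupled-key path are simplifying assumptions).
import torch

d_model, n_heads, d_head, d_latent = 4096, 32, 128, 512

# Down-projection applied once per token: one small latent c is cached.
W_down_kv = torch.randn(d_model, d_latent) / d_model**0.5
# Up-projections applied at attention time to rebuild per-head K and V from c.
W_up_k = torch.randn(d_latent, n_heads * d_head) / d_latent**0.5
W_up_v = torch.randn(d_latent, n_heads * d_head) / d_latent**0.5

def cache_token(h: torch.Tensor) -> torch.Tensor:
    """Standard attention would cache k and v (2 * n_heads * d_head = 8192 floats
    per token here); this caches only the compressed latent c (512 floats)."""
    return h @ W_down_kv                                 # (d_latent,)

def decode_kv(c_cache: torch.Tensor):
    """Rebuild full K and V for all cached tokens on the fly from the latents."""
    k = (c_cache @ W_up_k).view(-1, n_heads, d_head)     # (seq, heads, d_head)
    v = (c_cache @ W_up_v).view(-1, n_heads, d_head)
    return k, v

# One decoding step: the cache holds only latents; K/V exist transiently.
c_cache = torch.stack([cache_token(torch.randn(d_model)) for _ in range(16)])
k, v = decode_kv(c_cache)
print(c_cache.shape, k.shape, v.shape)  # [16, 512]  [16, 32, 128]  [16, 32, 128]
```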
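And a minimal sketch of top-k expert routing, showing why only a small fraction of the MoE parameters do work per token. This is a generic softmax-gated MoE with made-up sizes, not DeepSeek's exact router with shared experts and auxiliary-loss-free load balancing.

```python
# Generic top-k MoE routing sketch (illustrative sizes; not DeepSeek's router).
import torch
import torch.nn.functional as F

d_model, n_experts, top_k, d_ff = 1024, 64, 4, 2048

gate = torch.randn(d_model, n_experts) / d_model**0.5
# Each expert is a small FFN; only top_k of the n_experts run for a given token.
experts_in  = torch.randn(n_experts, d_model, d_ff) / d_model**0.5
experts_out = torch.randn(n_experts, d_ff, d_model) / d_ff**0.5

def moe_layer(h: torch.Tensor) -> torch.Tensor:
    """h: (d_model,) hidden state for a single token."""
    scores = F.softmax(h @ gate, dim=-1)      # affinity for each expert
    weights, idx = scores.topk(top_k)         # keep only the k best experts
    weights = weights / weights.sum()         # renormalise over the chosen experts
    out = torch.zeros_like(h)
    for w, e in zip(weights, idx):            # only k experts compute anything
        out += w * (F.silu(h @ experts_in[e]) @ experts_out[e])
    return out

h = torch.randn(d_model)
print(moe_layer(h).shape)  # [1024]; FLOPs ~ top_k/n_experts of an equivalent dense FFN
```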