I don't understand why it's bad for Nvidia either.
The fact that DeepSeek-R1 is so much better than DeepSeek-V3 at various important tasks suggests that chain-of-thought / thinking-before-answering models are better. But they are also more compute-intensive at inference time than their non-thinking instruct counterparts.
So even if the DeepSeek-V3 pretraining + GRPO CoT post-training procedure was cheaper than anticipated to reach o1-grade performance, inference is still costly, even if you use a distilled model.
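To make that concrete, here's a toy back-of-the-envelope sketch. The token counts are made-up illustrative numbers, not measurements of R1 or V3; the point is just that a thinking model pays for its reasoning trace in extra output tokens on every single query:

    def inference_cost(output_tokens, price_per_1m_tokens):
        # Dollar cost of generating one response at a given per-token rate.
        return output_tokens / 1_000_000 * price_per_1m_tokens

    PRICE = 2.19  # DeepSeek-R1's published $/1M output tokens

    plain = 300             # hypothetical: short direct answer
    thinking = 300 + 4_000  # hypothetical: same answer plus a CoT trace

    print(f"plain:    ${inference_cost(plain, PRICE):.6f}")     # $0.000657
    print(f"thinking: ${inference_cost(thinking, PRICE):.6f}")  # $0.009417
    print(f"ratio:    {thinking / plain:.1f}x")                 # 14.3x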
DeepSeek publishes API pricing directly on their website, so it's pretty easy to compare inference costs: o1 is $60.00 per 1M output tokens vs. $2.19 for R1. OpenAI is ~27x as expensive.
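A quick sanity check on that multiple, using the two published per-token rates:

    openai_o1 = 60.00    # $/1M output tokens, o1 API
    deepseek_r1 = 2.19   # $/1M output tokens, R1 API
    print(f"{openai_o1 / deepseek_r1:.1f}x")  # -> 27.4x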