
donsupreme 3 months ago | parent | context | favorite | on: Nvidia’s $589B DeepSeek rout

> I've tried their 7b model

Everything other than their 671b model is just a distilled model built on top of Qwen or Llama, trained on reasoning output from the 671b model, right?

KiwiJohnno 3 months ago [–]

Correct. It's the best model I've been able to run locally, by a long shot.
