afro88 7 months ago | on: Run DeepSeek R1 Dynamic 1.58-bit

The size reduction while keeping the model coherent is incredible. But I'm skeptical of how much effectiveness was retained. Flappy Bird is well known and the kind of thing a non-reasoning model could get right. A better test would be something off the beaten path that R1 and o1 get right and other models don't.

whimsicalism 7 months ago

yeah it is pretty unclear how lobotomized it is without benchmarks.

i’ve gotten full fp8 running on 8xH100, probably going to keep doing that
