yorwba
6 months ago
on: Run DeepSeek R1 Dynamic 1.58-bit
The performance advantage comes from doing 1/32 of the floating-point operations of a dense layer with the same number of parameters.
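As a rough illustration of the 1/32 figure, here is a minimal sketch that counts per-token FLOPs for a dense layer and for a mixture-of-experts layer of the same total size. The parameter count and the expert counts (8 active out of 256) are assumptions picked so the ratio comes out to 1/32; router overhead and any always-on shared experts are ignored.

```python
def matmul_flops(params: int) -> int:
    """FLOPs to push one token through `params` weights (one multiply + one add each)."""
    return 2 * params


def moe_flops(total_params: int, num_experts: int, active_experts: int) -> int:
    """FLOPs per token when only `active_experts` of `num_experts` equal-sized experts run."""
    active_params = total_params * active_experts // num_experts
    return matmul_flops(active_params)


# Hypothetical layer size and expert counts (assumptions, not from the thread).
total_params = 32_000_000
dense = matmul_flops(total_params)
sparse = moe_flops(total_params, num_experts=256, active_experts=8)

ratio = sparse / dense
print(f"dense: {dense:,} FLOPs, sparse: {sparse:,} FLOPs, ratio: {ratio}")
```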
iamnotagenius
6 months ago
The performance comes mostly from the smaller fraction of memory bandwidth needed, as LLMs are mostly memory-bound. Compute matters too, but usually far less than memory.
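A back-of-envelope way to see this is to compare the time spent on arithmetic with the time spent streaming weights from memory for a single generated token. The sketch below assumes batch size 1, every active weight read once per token, and made-up hardware numbers (100 TFLOP/s of compute, 1 TB/s of memory bandwidth); none of these figures come from the thread.

```python
def per_token_times(active_params: float, bytes_per_param: float,
                    flops_per_s: float, bytes_per_s: float) -> tuple[float, float]:
    """Return (compute_seconds, memory_seconds) for one token at batch size 1."""
    compute_s = 2 * active_params / flops_per_s             # multiply + add per weight
    memory_s = active_params * bytes_per_param / bytes_per_s  # each weight read once
    return compute_s, memory_s


# Hypothetical numbers, for illustration only.
ACTIVE_PARAMS = 1e9
FLOPS_PER_S = 100e12   # 100 TFLOP/s
BYTES_PER_S = 1e12     # 1 TB/s

# 16-bit weights: 2 bytes per parameter.
fp16_compute, fp16_memory = per_token_times(ACTIVE_PARAMS, 2.0, FLOPS_PER_S, BYTES_PER_S)

# ~1.58-bit weights: 1.58 / 8 bytes per parameter (ignores dequantization cost).
low_compute, low_memory = per_token_times(ACTIVE_PARAMS, 1.58 / 8, FLOPS_PER_S, BYTES_PER_S)

print(f"fp16:     compute {fp16_compute:.2e} s, memory {fp16_memory:.2e} s")
print(f"1.58-bit: compute {low_compute:.2e} s, memory {low_memory:.2e} s")
```

Under these assumptions the memory term dominates in both cases, so shrinking the bytes read per parameter reduces per-token time far more than cutting the same proportion of FLOPs would.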