Hacker News
jasonjmcghee, 64 days ago, on: Lossless LLM compression for efficient GPU inferen...
I read it similarly: this is a specific attribute of bfloat16, so the quantized models folks tend to run on local hardware don't have the same inefficiency to exploit.
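To make the bfloat16 point concrete, here is a rough sketch, not anything taken from the linked work. It assumes the inefficiency in question is that the 8-bit exponent field carries far less than 8 bits of information for typical weights. The Gaussian weights, the 0.02 scale, and the helper names `bf16_exponent` and `shannon_entropy` are all made up for illustration, and the bfloat16 conversion is simplified to truncation rather than round-to-nearest.

```python
import math
import random
import struct
from collections import Counter


def bf16_exponent(x: float) -> int:
    """Return the 8-bit exponent field of x after truncating float32 to bfloat16."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    # bfloat16 is the top 16 bits of float32: 1 sign, 8 exponent, 7 mantissa bits.
    # Truncation is a simplification; real converters usually round to nearest.
    bits16 = bits32 >> 16
    return (bits16 >> 7) & 0xFF


def shannon_entropy(symbols) -> float:
    """Empirical Shannon entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Assumption: synthetic, roughly weight-like values (zero-mean Gaussian).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(100_000)]

exponent_entropy = shannon_entropy(bf16_exponent(w) for w in weights)
print(f"exponent entropy: {exponent_entropy:.2f} bits out of 8 stored")
```

On data like this only a handful of exponent values ever show up, which is the kind of slack an entropy coder can remove without losing anything. A format that already packs each weight into a few bits has no comparable fixed-width field to squeeze.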