Hacker News
Squeeze2664 | 6 months ago | on: Qwen3-235B-A22B-Thinking-2507
How do you determine the importance of a layer in this case?
smallerize | 6 months ago
https://unsloth.ai/blog/dynamic-v2
danielhanchen | 6 months ago
Yes, also https://unsloth.ai/blog/deepseekr1-dynamic , https://unsloth.ai/blog/dynamic-4bit , https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs
kkzz99 | 6 months ago
AFAIK they have a test bench that they run the model on, and they take the activation data from that.
danielhanchen | 6 months ago
Yes, we have around 1 to 3 million tokens of high-quality, self-verified data that we use to calibrate models!
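For readers wondering what "importance from calibration activations" could look like in code, here is a rough sketch. It is not Unsloth's actual method (the linked posts are the source for that). It is only one generic way to score layers, under these assumptions: each layer is a plain weight matrix, you have already captured the input activations each layer sees on the calibration data, and "importance" means how much the layer's output changes when that layer alone is quantized. All names here (`quantize`, `layer_sensitivity`, `rank_layers`) are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric round-to-nearest quantization of a 2D list of floats.

    Assumption: one scale per matrix. Real quantizers use per-block
    scales and smarter rounding.
    """
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in weights for w in row) or 1.0
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in weights]


def matvec(weights, x):
    """Multiply a 2D list (matrix) by a 1D list (vector)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def layer_sensitivity(weights, activations, bits):
    """Mean squared output error from quantizing this layer alone,
    measured on the calibration activations captured for this layer."""
    quantized = quantize(weights, bits)
    total, count = 0.0, 0
    for x in activations:
        full = matvec(weights, x)
        approx = matvec(quantized, x)
        total += sum((a - b) ** 2 for a, b in zip(full, approx))
        count += len(full)
    return total / count


def rank_layers(layers, activations, bits=4):
    """Return layer names ordered from most to least sensitive.

    layers:      dict of name -> weight matrix (hypothetical layout)
    activations: dict of name -> list of input vectors for that layer
    """
    scores = {
        name: layer_sensitivity(w, activations[name], bits)
        for name, w in layers.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

The idea would be to keep the top-ranked layers at higher precision and push the rest to fewer bits. How well that works depends heavily on the calibration data being representative, which is presumably why the quality of those 1 to 3 million tokens matters.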