barbegal, 9 months ago, on: Un Ministral, Des Ministraux
Does anyone know why Mistral uses a 17-bit (131k) vocabulary? I'm sure it encodes text more efficiently, but a token ID no longer fits into a 16-bit register, which must make it less efficient computationally?
cpldcpu, 9 months ago
The tokens are immediately transformed into embeddings (very large vectors), so the 17-bit values are never used for arithmetic. Each token ID only serves as an index into the embedding table.
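To make the lookup point concrete, here is a minimal sketch in plain Python. It is not Mistral's code: the toy embedding width, the fill values, and the helper names (`bits_needed`, `embed`) are all assumptions made up for illustration. It only shows that a 131,072-entry vocabulary needs 17 bits per ID, and that the ID is used once, as a row index, before all further work happens on float vectors.

```python
from array import array

VOCAB_SIZE = 131_072  # 2**17, the "131k" vocabulary from the question
EMBED_DIM = 4         # toy width (assumption); real models use thousands

def bits_needed(n_values: int) -> int:
    """Bits required to represent every ID in 0 .. n_values - 1."""
    return (n_values - 1).bit_length()

# Hypothetical embedding table stored as one flat float array,
# VOCAB_SIZE rows of EMBED_DIM values each. The fill pattern is arbitrary.
table = array("f", (float(i % 97) for i in range(VOCAB_SIZE * EMBED_DIM)))

def embed(token_ids):
    """Look up one embedding row per token ID. The ID is only an index."""
    rows = []
    for t in token_ids:
        if not 0 <= t < VOCAB_SIZE:
            raise IndexError(f"token id {t} out of range")
        start = t * EMBED_DIM
        rows.append(list(table[start:start + EMBED_DIM]))
    return rows

# IDs held in a signed integer type of at least 32 bits, wider than 16 bits.
token_ids = array("l", [0, 65_535, 65_536, VOCAB_SIZE - 1])
vectors = embed(token_ids)

print(bits_needed(VOCAB_SIZE))         # bits per token ID
print(len(vectors), len(vectors[0]))   # one float vector per token
```

In this sketch the per-token cost is dominated by the `EMBED_DIM` floats fetched per row, not by the width of the integer used to find them.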