Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

>There’s probably quite a lot of complexity on the backend in sourcing the right model for the right query.

This isn't how Sparse MoE models work. There isn't really any complexity like that. And different models will or can pick each token.

Sparse models aren't an ensemble of models.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: