
This is true for LLMs themselves. If a new LLM is really better than all the others, it can be used to help improve other LLMs.


Is it? Last I checked, when you trained an LLM on another's output, at best you got the same performance as the original, and more likely you significantly degraded usefulness. (I'm not talking about distillation, where that tradeoff is accepted in return for a smaller, more efficient model.)
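For context, distillation trains a small "student" model to match a larger "teacher" model's softened output distribution rather than hard labels, which is the deliberate trade-off mentioned above. A minimal sketch of the soft-target loss, in plain Python with illustrative names (not any particular library's API):

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution, softened by T."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft targets to the student's outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [3.0, 1.0, 0.2]
close_student = [2.9, 1.1, 0.3]  # nearly matches the teacher's ranking
far_student = [0.1, 0.2, 3.0]    # disagrees with the teacher

# A student that mimics the teacher's distribution incurs a much lower loss.
print(distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student))
```

The temperature softens both distributions so the student also learns the teacher's relative confidence across wrong answers, not just its top pick.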



