
I skimmed the paper but I couldn't figure out what they're doing to make concepts fundamentally different from tokens.

I would think that the purpose of concepts is to capture information at a higher density than tokens, so you can remember a longer conversation or better produce long-form output.

Given that, I would have expected the concept model to be evaluated during training on how few concepts it emits before it emits a stop.
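
To make that expectation concrete, here is a rough sketch of the kind of objective I have in mind. This is my own illustration, not anything taken from the paper: the function name, the `penalty_per_concept` weight, and the example numbers are all made up.

```python
def length_penalized_loss(
    reconstruction_loss: float,
    num_concepts: int,
    penalty_per_concept: float = 0.01,  # hypothetical weight, not from the paper
) -> float:
    """Hypothetical objective: task loss plus a fixed cost per concept emitted before stop."""
    if num_concepts < 0:
        raise ValueError("num_concepts must be non-negative")
    return reconstruction_loss + penalty_per_concept * num_concepts


# Two made-up encodings of the same passage with equal reconstruction quality:
# the one that reaches stop after fewer concepts gets the lower loss.
verbose = length_penalized_loss(0.5, num_concepts=40)
compact = length_penalized_loss(0.5, num_concepts=10)
assert compact < verbose
```

Under an objective like this, the model would only be rewarded for packing more information into each concept as long as reconstruction quality holds up, which is the density pressure I would have expected to see.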


