I believe you're misunderstanding what the OP means by "long-term" memory. From what I can tell, it's not actively modifying the weights of the underlying model; it just "remembers" things from many tokens back in its context. The point is that a very long context window lets it recall something it read ~200 pages earlier, not that it can carry memory from one session into another clean session.
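
To make the distinction concrete, here is a minimal sketch (plain NumPy; the names and sizes are illustrative assumptions, not the OP model's actual code) of context-window recall: the parameters never change, and "memory" is just attention over the cached key/value pairs of every past token.

    # Context-window "memory": the weights stay frozen; recall comes
    # from attending over a KV cache that covers the whole context.
    import numpy as np

    def attend(query, keys, values):
        # Single-head scaled dot-product attention over the full cache.
        scores = keys @ query / np.sqrt(query.size)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ values

    d = 8
    rng = np.random.default_rng(0)
    # KV cache accumulated over a very long context (~"200 pages" of tokens).
    keys = rng.standard_normal((100_000, d))
    values = rng.standard_normal((100_000, d))

    q = rng.standard_normal(d)
    out = attend(q, keys, values)  # token 0 is as reachable as token 99_999

Nothing here persists across sessions: wipe the cache and the "memory" is gone, which is exactly the distinction being drawn.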

This model has fast weights, which actually are modified during inference.
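
For anyone unfamiliar with the term: "fast weights" usually means a second set of parameters updated on the fly with a Hebbian-style rule while the slow weights stay frozen, in the spirit of Ba et al. (2016). A rough sketch under that assumption (the update rule and hyperparameters are mine, not necessarily this model's):

    # Fast weights: A is modified at inference time by the activity it
    # sees, while the slow weights W are fixed.
    import numpy as np

    def fast_weight_step(A, h, decay=0.95, lr=0.5):
        # Decay old traces, then reinforce with the outer product of the
        # current hidden state: recent activity is stored in A itself.
        return decay * A + lr * np.outer(h, h)

    d = 8
    rng = np.random.default_rng(0)
    W = rng.standard_normal((d, d)) * 0.1  # slow weights (frozen)
    A = np.zeros((d, d))                   # fast weights (updated online)

    h = np.tanh(rng.standard_normal(d))
    for _ in range(3):                     # a few "tokens" of inference
        h = np.tanh(W @ h + A @ h)         # fast weights shape the next state
        A = fast_weight_step(A, h)         # ...and are modified by it

So unlike a pure KV cache, the model's effective parameters really do change as it reads.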

Marketplace for fast weights inbound
