
In a transformer, the whole context is already attended to again at every token, so the model can fetch information whenever it becomes useful. I don't see what problem there is to solve here.


A transformer's context is limited, so the same kind of problem applies once the input exceeds that size.
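To make that concrete, here is a rough sketch of what a fixed-size window does to early information. It is only an illustration: `visible_context` and `max_context` are made-up names, and dropping the oldest tokens is just one simple truncation policy, not how any particular model handles overflow.

```python
from collections import deque


def visible_context(tokens, max_context):
    """Return the tokens still inside a fixed-size window (hypothetical helper).

    Assumes the simplest policy: once the window is full, the oldest
    token is dropped to make room for the newest one.
    """
    window = deque(maxlen=max_context)  # deque discards from the left when full
    for tok in tokens:
        window.append(tok)
    return list(window)


tokens = ["key=42", "a", "b", "c", "d"]

# Window larger than the input: the early fact is still reachable.
print("key=42" in visible_context(tokens, max_context=8))

# Window smaller than the input: the early fact has fallen out.
print("key=42" in visible_context(tokens, max_context=3))
```

While the input fits, everything stays reachable, which is the point made in the parent comment. Past the limit, whatever was dropped can no longer be fetched, no matter how useful it later turns out to be.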



