
GPT looks a lot like an IIR filter that transforms a sequence of vectors. Edit: IIR filters are linear functions of N past inputs and N past outputs - the latter gives them "memory" and non-trivial signal-processing abilities. GPT is mostly linear and uses 8192 past inputs and outputs. I'd be tempted to introduce a third sequence - an "internal buffer" of 8192 tokens - that GPT updates even with null inputs, a process that would correspond to "thinking".
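For concreteness, a minimal sketch of the difference equation an IIR filter computes (plain NumPy; the order and coefficients below are just illustrative, not anything GPT-specific):

    import numpy as np

    def iir_filter(x, b, a):
        # Direct-form I IIR filter:
        # a[0]*y[n] = sum_i b[i]*x[n-i] - sum_{j>=1} a[j]*y[n-j]
        # The feedback terms a[j]*y[n-j] are the "memory".
        y = np.zeros(len(x))
        for n in range(len(x)):
            acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
            acc -= sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
            y[n] = acc / a[0]
        return y

    # One-pole low-pass as a usage example: y[n] = 0.1*x[n] + 0.9*y[n-1]
    y = iir_filter(np.ones(50), b=[0.1], a=[1.0, -0.9])

The analogy: x is the input token sequence, the feedback through past y values plays the role of attending to previously generated outputs.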


