It's not a secret, in OpenAI api you have to keep sending previous question and responses on top of your new question, essentially you are asking new question but give it more context with the previous questions and answers
Thank you for clarifying this mystery. My mental model of how ChatGPT works is now clearer. I was somehow thinking that in chat mode it would either (a) need to “update state”, or (b) be fed increasingly longer history. I thought (b) would slow down later responses but I guess with their ginormous compute it’s not perceptible to users.
Someone made a version with a buffer which only has a fixed size context history so it stays in constant speed mode. It will only focus on the latest questions.