A normal transformer model doesn't have online learning [0] and only "acts" when prompted. So you have this vast trained model sitting in cold storage, and each conversation starts from the same "starting point" from its perspective until you decide to retrain it at a later date.
Also, for what it's worth, while I see a lot of discussion about the model architectures of language models in the context of "consciousness", I rarely see a discussion of the algorithms used during the inference step. Beam search, top-k sampling, nucleus sampling and so on are incredibly "dumb" algorithms compared to the complexity hidden in the rest of the model.
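To illustrate how simple these decoding steps are: here is a minimal sketch of top-k and nucleus (top-p) filtering over a toy vocabulary, in plain Python. The token names and logit values are made up for illustration; a real model would produce a logit per vocabulary entry at each step.

```python
import math

def softmax(logits):
    # Convert raw scores to probabilities (numerically stable form).
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

def top_k_filter(probs, k):
    # Keep only the k most likely tokens, then renormalize.
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def nucleus_filter(probs, p=0.9):
    # Keep the smallest set of tokens whose cumulative mass reaches p.
    kept, cum = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = pr
        cum += pr
        if cum >= p:
            break
    total = sum(kept.values())
    return {tok: pr / total for tok, pr in kept.items()}

# Toy logits for four hypothetical tokens:
probs = softmax({"the": 3.0, "a": 2.0, "cat": 1.0, "zzz": -5.0})
print(sorted(top_k_filter(probs, 2)))       # ['a', 'the']
print(sorted(nucleus_filter(probs, 0.8)))   # ['a', 'the']
```

That really is the whole trick: truncate the distribution, renormalize, sample. All the heavy lifting happens upstream in the forward pass that produced the logits.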
What if we loop it to itself? An infinite dialog with itself... An inner voice? And periodically train/fine-tune it on the results of this inner discussion, so that it 'saves' it to long-term memory?
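The loop described above could be sketched roughly like this. Note that `generate` and `fine_tune` here are toy stand-ins I made up for illustration, not a real model API; a real system would run transformer inference and an actual training pass in their place.

```python
def generate(model_state, prompt):
    # Toy stand-in: a real system would run transformer inference here.
    return f"thought about: {prompt[-40:]}"

def fine_tune(model_state, transcript):
    # Toy stand-in for saving the dialog to "long-term memory" via training.
    model_state["memory"].extend(transcript)
    return model_state

def inner_dialog(model_state, seed, steps=10, consolidate_every=5):
    # Feed the model's own output back as its next prompt, and
    # periodically consolidate the transcript into the model itself.
    prompt, transcript = seed, []
    for step in range(1, steps + 1):
        reply = generate(model_state, prompt)
        transcript.append(reply)
        prompt = reply                      # the model talks to itself
        if step % consolidate_every == 0:   # periodic fine-tune pass
            model_state = fine_tune(model_state, transcript)
            transcript = []
    return model_state

state = inner_dialog({"memory": []}, "why am I here?", steps=10)
print(len(state["memory"]))  # 10
```

The open question is whether periodic fine-tuning on self-generated text consolidates anything useful, or just amplifies the model's own quirks.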
The difference between online and offline is somewhat subjective. Fast-forward time enough and there would likely be no significant difference, unless the two models were directly competing with one another. It's also highly likely this gap will narrow in the near future; there are already notable efforts to enable online transformers.
I understand that to you it is, but to me it's not. The only question is whether the path leads to AGI; beyond that, the difference between offline and online is simply a matter of resources and time. Being conscious does not have a predefined timescale, and as noted before, online learning is already an active area of research with notable solutions being published.
https://www.qwak.com/post/online-vs-offline-machine-learning...