Trust me, I am one of the people absolutely baffled and pleasantly surprised by all the new things these LLMs can do. There is plenty of potential in this tech, and I am not so scared of it that I won't acknowledge that.

But what I am asking about is a core feature of these models. The transformer is what makes LLMs possible, but it is by design static after it has been trained. You can't simply bolt on another pipeline or wrap it in a program and fix this problem. From what I understand, the RNN is the self-improving type of AI, and it was supplanted by the transformer; one of the key differences between them is the ability to learn from its own output. The transformer model sacrificed that for the ability to process a large amount of data in parallel. And that is by design. You can't change it, as far as I know.

I actually don't know that much, which is why I am asking whether what I said was wrong. Also, maybe someone will make a new model merging the transformer and the RNN. But what I am saying is that right now, the type of LLM we have probably can't learn, no matter how sophisticated its training is or how many parameters it is built with.



Neither RNNs nor transformers change their parameters during a forward pass. In production, GPT just does a forward pass through the network and never calculates the parameter gradients; this would be true if GPT were built on RNNs as well, and without those gradients the model will not improve.
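
To make the distinction concrete, here's a minimal PyTorch sketch (toy model, nothing GPT-specific): inference is a forward pass with no gradients, so the weights can't move; training computes a loss, backpropagates, and takes an optimizer step.

    import torch
    import torch.nn as nn

    # Toy stand-in for a language model; the real thing is a transformer.
    model = nn.Linear(16, 16)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    x = torch.randn(4, 16)
    target = torch.randn(4, 16)

    # Inference (what a deployed model does): forward pass only, no gradients,
    # so the parameters cannot change no matter what the inputs are.
    with torch.no_grad():
        y = model(x)

    # Training (what happens before deployment): compute a loss, backpropagate
    # to get parameter gradients, then take an optimizer step that updates weights.
    loss = nn.functional.mse_loss(model(x), target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()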


That seems like an implementation detail that would be entirely solvable with some additional work, rather than a fundamental limitation, though.


Unfortunately, we have this thing called the curse of dimensionality, and specifically we do not understand how to prevent catastrophic forgetting. This is why incrementally learning NNs are still so rare in practice.
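
A toy sketch of what catastrophic forgetting looks like (my own example, two synthetic classification tasks with conflicting rules): fine-tune on task A, then naively fine-tune on task B, and accuracy on task A typically collapses toward chance.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_task(offset):
        # Synthetic binary classification; the sign of `offset` flips the rule.
        x = torch.randn(512, 8)
        y = ((x[:, 0] + offset * x[:, 1]) > 0).long()
        return x, y

    def train(model, x, y, steps=200):
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        for _ in range(steps):
            loss = nn.functional.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    def accuracy(model, x, y):
        with torch.no_grad():
            return (model(x).argmax(dim=1) == y).float().mean().item()

    model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
    xa, ya = make_task(offset=1.0)    # "task A"
    xb, yb = make_task(offset=-1.0)   # "task B", conflicting rule

    train(model, xa, ya)
    print("task A after training on A:", accuracy(model, xa, ya))  # high

    train(model, xb, yb)              # naive incremental update, no replay
    print("task A after training on B:", accuracy(model, xa, ya))  # drops sharply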


You could periodically retrain the model on the original dataset plus the conversations from that period. Sort of a commit of short-term memory into long-term memory.
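
A self-contained sketch of that idea (not how any production system works, just the replay-mixing part, with made-up placeholder data): blend a sample of the original dataset with the new conversation logs before the periodic retrain, so old knowledge gets refreshed while the new "short-term memory" is committed.

    import random

    def build_retraining_mix(original_dataset, new_conversations, replay_fraction=0.5):
        """Return a shuffled training set that is part replay of the original data,
        part fresh conversation logs."""
        k = int(len(new_conversations) * replay_fraction / (1.0 - replay_fraction))
        replay = random.sample(original_dataset, min(k, len(original_dataset)))
        mixed = replay + list(new_conversations)
        random.shuffle(mixed)
        return mixed

    # Example: a 50/50 mix of replayed pretraining examples and this period's chats.
    mix = build_retraining_mix(original_dataset=[f"doc{i}" for i in range(1000)],
                               new_conversations=[f"chat{i}" for i in range(100)],
                               replay_fraction=0.5)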


I think you could say this is effectively what is happening within the narrow confines of any single conversation.

Although it throws away that conversation once it's done.
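
Schematically, that "learning" lives entirely in the growing message history that gets re-sent on every turn; the weights never change, and the history is discarded when the chat ends. (`call_model` below is a hypothetical stand-in for an inference call, not a real client library.)

    def call_model(messages):
        return "…"  # placeholder reply from a frozen model

    conversation = []                      # the model's only "memory"
    for user_turn in ["My name is Ada.", "What is my name?"]:
        conversation.append({"role": "user", "content": user_turn})
        reply = call_model(conversation)   # the full history goes in every time
        conversation.append({"role": "assistant", "content": reply})

    del conversation                       # end of chat: the "memory" is gone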


Ha. So, it needs to 'sleep', kind of...



