Question answering and learning are just one corner of LLM usage, but they come with a built-in learning signal for the AI. Say a user asks about Pythagoras, the LLM provides an explanation, and the user doesn't get it. The LLM tries again.
Repeat this loop a million times with diverse students and you get a distribution over which kinds of explanations work. The model gets better at explaining through its own experience.
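To make the loop concrete, here is a minimal sketch of what collecting that signal might look like. This is a hypothetical illustration, not any lab's actual pipeline: `ask_llm` and `user_understood` are stand-ins for the model call and the user's "got it" signal.

```python
import random

def ask_llm(question: str, failed_attempts: list[str]) -> str:
    """Hypothetical model call: produce an explanation,
    conditioned on earlier attempts that didn't land."""
    return f"explanation #{len(failed_attempts) + 1} of {question!r}"

def user_understood(explanation: str) -> bool:
    """Stand-in for the real signal: the user saying they get it.
    Note this measures satisfaction, not correctness."""
    return random.random() < 0.4

def tutoring_episode(question: str, max_tries: int = 5):
    """One interaction: retry until the user is satisfied.
    Returns (explanation, satisfied) pairs as a learning signal."""
    attempts = []
    for _ in range(max_tries):
        explanation = ask_llm(question, [a for a, _ in attempts])
        ok = user_understood(explanation)
        attempts.append((explanation, ok))
        if ok:
            break
    return attempts

# Many such episodes yield a distribution over which explanations
# "work"; the satisfying ones become training data.
dataset = []
for _ in range(1000):  # scaled down from a million
    for expl, ok in tutoring_episode("Pythagoras"):
        dataset.append({"text": expl, "reward": 1.0 if ok else 0.0})
```

Note that the reward in this sketch is user satisfaction, which is exactly the weakness the reply below points at.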
Sounds like you'd end up with pop science. The loop stops when the explanation feels satisfying, not when it's correct. Vibe science isn't grounded in reality.