
My understanding is that the base model is pretty good at knowing whether it knows something or not. It's the human feedback training that causes it to lose that signal.


Do you have any references? I know of the emergent deception problem, which seems to arise from feedback training.

https://bounded-regret.ghost.io/emergent-deception-optimizat...


Base GPT-4 was highly calibrated; read OpenAI's technical report.

Also, this paper on GPT-4's performance on medical challenge problems confirmed high calibration in the medical domain: https://arxiv.org/abs/2303.13375


Thanks, but I didn't find any details comparing performance before and after reinforcement training. I'm looking to understand more about the assertion that hallucinations are introduced by the reinforcement training.


https://arxiv.org/abs/2303.08774 The technical report has before-and-after comparisons. The post-RLHF model is a bit worse on some tests, and they pretty explicitly discuss calibration (how well the model's confidence on a problem predicts its accuracy on that problem). A quick sketch of how that's typically measured is below.

Hallucinations are a different matter.
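For anyone unfamiliar with what "calibrated" means concretely: it's usually measured by binning answers by stated confidence and comparing each bin's average confidence to its empirical accuracy (expected calibration error). A minimal sketch, not taken from the report; the function name and toy numbers are just for illustration:

  import numpy as np

  def expected_calibration_error(confidences, correct, n_bins=10):
      """Bin predictions by confidence and compare average confidence
      to empirical accuracy within each bin (standard ECE)."""
      confidences = np.asarray(confidences, dtype=float)
      correct = np.asarray(correct, dtype=float)
      edges = np.linspace(0.0, 1.0, n_bins + 1)
      ece = 0.0
      for lo, hi in zip(edges[:-1], edges[1:]):
          mask = (confidences > lo) & (confidences <= hi)
          if mask.any():
              acc = correct[mask].mean()        # how often the model was right in this bin
              conf = confidences[mask].mean()   # how confident it claimed to be
              ece += mask.mean() * abs(acc - conf)
      return ece

  # Toy example: a well-calibrated model's 70%-confidence answers
  # should be correct about 70% of the time.
  print(expected_calibration_error([0.9, 0.8, 0.7, 0.6], [1, 1, 1, 0]))

A lower ECE means stated confidence tracks actual accuracy; the report's point is that this tracking degrades after the feedback training stage.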



