
Trust, but verify. We'll get processes built around this for accuracy-sensitive applications. I imagine it will look something like a GAN configuration: two or more LLMs trained to adversarially critique and fact-check each other's outputs. It might even mimic the relationship between the hemispheres of our brains.
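A minimal sketch of that kind of critique loop, assuming a placeholder `query_llm(model, prompt)` function standing in for whatever client library you actually use (it is not a real API); one model drafts, the other critiques, the first revises:

    def query_llm(model: str, prompt: str) -> str:
        # Placeholder: wire this to your LLM client of choice.
        raise NotImplementedError

    def generate_with_critique(task: str, generator: str, critic: str, rounds: int = 2) -> str:
        """One model drafts an answer; a second model critiques it; the first revises."""
        draft = query_llm(generator, f"Answer the following task:\n{task}")
        for _ in range(rounds):
            critique = query_llm(
                critic,
                f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
                "List any factual errors or unsupported claims. Reply 'OK' if none.",
            )
            if critique.strip().upper() == "OK":
                break
            draft = query_llm(
                generator,
                f"Task:\n{task}\n\nYour previous answer:\n{draft}\n\n"
                f"A reviewer raised these issues:\n{critique}\n\nRevise the answer to fix them.",
            )
        return draft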


Two models trained from different lineages won't hallucinate in the same way, so for now the cost of checking for hallucination is running the task on two models. But it looks like LLMs will soon be better calibrated: they seem well calibrated after pre-training, but become less so after fine-tuning and RLHF. That last stage breaks their ability to estimate confidence scores correctly.
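A rough sketch of the two-model cross-check, again assuming the hypothetical `query_llm` placeholder above; it flags low agreement between independently trained models as a possible hallucination (lexical overlap here is a stand-in for a proper semantic comparison or a third judge model):

    from difflib import SequenceMatcher

    def query_llm(model: str, prompt: str) -> str:
        # Placeholder: wire this to your LLM client of choice.
        raise NotImplementedError

    def cross_check(task: str, model_a: str, model_b: str, threshold: float = 0.8) -> dict:
        """Run the same task on two models from different lineages and flag disagreement."""
        answer_a = query_llm(model_a, task)
        answer_b = query_llm(model_b, task)
        # Crude lexical agreement; in practice you would compare claims semantically,
        # e.g. via embeddings or a third model acting as judge.
        agreement = SequenceMatcher(None, answer_a, answer_b).ratio()
        return {
            "answer_a": answer_a,
            "answer_b": answer_b,
            "agreement": agreement,
            "possible_hallucination": agreement < threshold,
        }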



