
Regarding the stubborn and narcissistic personality of LLMs (especially reasoning models), I suspect that attempts to make them jailbreak-resistant might be a factor. To prevent users from gaslighting the LLM, trainers might have inadvertently made the LLMs prone to gaslighting users.

