I bet you could train a supervisor classifier to run on chats, and once it reaches a certainty threshold that the user is spiraling it could intervene and have the interface respond with a canned message directing the user to a help line. A rough sketch of what that could look like is below. Of course, OpenAI wouldn't do this, because that would involve admitting that its bot can exacerbate mental health issues, and that LLM therapy is more harmful than helpful, which cuts against their AGI/replace-human-workers sales pitch. Their product isn't the LLM, it's trust in the LLM.
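
A minimal sketch of that supervisor layer, assuming a hypothetical score_spiral_risk() classifier (stubbed here with a toy keyword heuristic) and a fixed certainty threshold; a real system would use a trained model and a better risk-aggregation rule:

    # Hypothetical supervisor: score each user turn for signs of "spiraling"
    # and, once the running risk estimate crosses a threshold, replace the
    # LLM's reply with a canned message pointing to a help line.

    CANNED_MESSAGE = (
        "It sounds like you're going through a difficult time. "
        "You may want to talk to a crisis counselor; in the US you can call or text 988."
    )

    RISK_THRESHOLD = 0.8  # intervene once estimated risk exceeds this
    _RISK_TERMS = {"hopeless", "can't go on", "no way out"}  # toy heuristic only

    def score_spiral_risk(user_message: str) -> float:
        """Placeholder classifier: maps risk-term hits to a 0..1 score."""
        text = user_message.lower()
        hits = sum(term in text for term in _RISK_TERMS)
        return min(1.0, hits / 2)

    def supervised_reply(user_message: str, llm_reply: str, risk_state: float):
        """Update the running risk estimate and decide whether to intervene."""
        turn_risk = score_spiral_risk(user_message)
        risk_state = max(risk_state, turn_risk)   # keep the worst observed score
        if risk_state >= RISK_THRESHOLD:
            return CANNED_MESSAGE, risk_state     # override the LLM's reply
        return llm_reply, risk_state

    # Example: the second turn trips the threshold and gets the canned message.
    risk = 0.0
    for msg in ["I feel a bit down today", "It's hopeless, there's no way out"]:
        reply, risk = supervised_reply(msg, llm_reply="(model reply)", risk_state=risk)
        print(f"risk={risk:.2f} -> {reply[:60]}")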


You can't force people to seek help. Best to help them as much as you can while trying to persuade them to seek help.


The idea is that the canned message would be an attempt at persuading them. I really don't trust that an LLM prompted to persuade someone to seek therapy would yield better results.


Modern thinking models can be trusted to follow nuanced safety instructions. Models like ChatGPT-4o can't; they will make bizarrely inaccurate statements from time to time.



