Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It won't. Humans are vulnerable to the same "prompt injection" attacks. And it's not something you can "just" solve - you'd be addressing a misuse of a core feature by patching out the feature itself.


By that time we could have 10 other LLMs supervising the one you're worried about ...


panopticum!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: