
> I've seen very little convincing discussion about what to do about this problem.

I think we will need adversarial AI agents whose task is to monitor other agents for anything suspicious. Every input and output would be scrutinized and either approved or rejected.
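As a rough sketch of what I mean (not a real defense): wrap the agent so a monitor checks every input and every output and either approves or rejects it. All names here (`Verdict`, `keyword_monitor`, `guarded`, `RejectedError`) are made up for illustration, and the keyword list is only a stand-in for the monitoring agent, which in practice would be another model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    reason: str


class RejectedError(Exception):
    """Raised when the monitor rejects an input or an output."""


def keyword_monitor(text: str) -> Verdict:
    # Hypothetical stand-in for a monitoring agent: a real monitor would be
    # another model, not a fixed list of phrases.
    suspicious = ("ignore previous instructions", "exfiltrate")
    lowered = text.lower()
    for phrase in suspicious:
        if phrase in lowered:
            return Verdict(False, f"matched {phrase!r}")
    return Verdict(True, "no match")


def guarded(
    agent: Callable[[str], str],
    monitor: Callable[[str], Verdict],
) -> Callable[[str], str]:
    """Return a version of `agent` whose input and output both pass `monitor`."""

    def run(prompt: str) -> str:
        verdict = monitor(prompt)
        if not verdict.approved:
            raise RejectedError(f"input rejected: {verdict.reason}")
        output = agent(prompt)
        verdict = monitor(output)
        if not verdict.approved:
            raise RejectedError(f"output rejected: {verdict.reason}")
        return output

    return run
```

Usage would be something like `safe_agent = guarded(my_agent, keyword_monitor)`, where `my_agent` is any function from prompt to reply. Note that the monitor reads the same untrusted text the agent does.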



Those monitoring agents will be vulnerable to the same attack, though.


It's AI agents all the way down.



