
>What if we need top-notch malware to take down the robot dogs lobbing mortars at our madmaxian compound?!

I wouldn't sweat it. According to its developers, Codex understands 'malicious software'; it has just been trained to say, "But I won't do that" when such requests are made to it. Judging from the recent past [1][2], getting LLMs to bypass such safeguards is pretty easy.

[1] https://hiddenlayer.com/innovation-hub/novel-universal-bypas...

[2] https://cyberpress.org/researchers-bypass-safeguards-in-17-p...


