
I don't think this is necessarily true. If an AI is designed to optimize for X and self-destruction happens to be the most effective route to X, why wouldn't it take that route?

Practical example: you have a fully AI-driven short-range missile. You give it the goal of "destroy this facility" and provide only extremely limited capabilities: 105% of the fuel calculated as required for the trajectory, +/- 3 degrees of self-steering, no external networking. You've basically boxed it into the local maximum of "optimizing for this output requires blowing myself up" -- and there is no realistic outcome where the SRM can intentionally prevent itself from blowing up.
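As a rough sketch of what that kind of boxing looks like in practice (all names and numbers here are hypothetical, just restating the scenario above), the point is that the limits live outside the policy entirely -- whatever the AI "wants" gets clamped before it ever reaches an actuator:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class GuidanceEnvelope:
        """Hard limits enforced outside the AI policy (hypothetical values from the example)."""
        max_steer_deg: float = 3.0   # +/- 3 degrees of self-steering
        fuel_budget: float = 1.05    # 105% of the fuel needed for the nominal trajectory

    def clamp_command(requested_steer_deg: float, requested_burn: float,
                      fuel_remaining: float, env: GuidanceEnvelope) -> tuple[float, float]:
        """Clamp whatever the policy requests into the envelope before it hits the actuators."""
        steer = max(-env.max_steer_deg, min(env.max_steer_deg, requested_steer_deg))
        burn = min(requested_burn, fuel_remaining)  # can't burn fuel it doesn't have
        return steer, burn

    # The policy can request anything; the wrapper decides what actually happens.
    steer, burn = clamp_command(requested_steer_deg=45.0, requested_burn=10.0,
                                fuel_remaining=0.4, env=GuidanceEnvelope())
    print(steer, burn)  # -> 3.0 0.4

The design choice is that the envelope is not part of the objective the AI optimizes; it's a hard wrapper, so there's no gradient or reward signal it could follow to escape it.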

It's a bit of a "beat the genie" problem: you have complete control over the initial parameters and rules of operation, but you have to act under the assumption that the other party will act in bad faith. I foresee a future where "adversarial AI analytics" becomes an extremely active and profitable field.


