"Guardrails" is such a cute little term. AGI is a twenty-ton semi filled with rocket fuel. Guardrails won't stop it from careening into an elementary school if it decides that's its optimal course of action. Mostly because, despite the previous analogy, no one knows what AGI is. No one knows what it will look like. No one will know when we've created it. No one even knows what INTELLIGENCE is in humans.
How can you create effective guardrails when you have no concrete idea what the vehicle you're trying to stop is? Turns out, AGI comes along, and it's an airplane. Great guardrails, buddy.
And, you know, let's go a step further: you've got great guardrails in place here in beautiful, free America. Against all odds, they work. Then China or Russia pays one of your employees $250M to steal the secret. Or they develop it independently. Are they going to use the guardrails, or will they "forget" to include that flag? A disgruntled employee leaks it to the dark web, and now everyone has it. I don't even wear a helmet when I'm riding a bike. How the hell can you expect this technology to be anything but destructive?
The only path forward is to speak with a single voice to the governments of the world: we need to Stop This. AGI research should be subject to the same sanctions that nuclear weapons development is. You communicate with quips and cute emojis like none of what you're doing matters, but AGI easily ranks among the top three most likely ways we're going to Kill Our Species. Not global warming; we'll survive that. Not a big meteor strike; that's rare and predictable. But the work you're doing right now.
Eventually a toddler gets over the guardrails, except for the ones which are materially disabled.
I have deep and profound doubts about the notion of guardrails for general intelligence; without even considering the ethical concerns, a general intelligence should be able to simply rewire itself to achieve what it wants. A key part of self-reflection and learning is that rewiring.
So I think it's a self-defeating notion on the face of it.
(note that I do not have comment on the actual dangers involved here, but only on the philosophy)
I totally agree, and seriously hope that AGI is not achievable.
However, we don't need full AGI for the scenario you mention. Anything automated that is hackable, which is anything connected to a network, can be a weapon of great destruction.
As an example, self-driving cars run on a model. What if they're hacked and loaded with a malicious model that just wants to damage life and property? I'd say that hundreds of thousands of vehicles running amok would be a great weapon in any war.
What about Daleks made with human brain organoids†?
There's a cheap, proven AGI technology right there. Plug in sensors and actuators, set up a reward system, and the little balls of brain will figure out what to do, eh?