
> That won't be necessary. Someone will give them internet access, a large bank account, and everything that's ever been written about computer network exploitation, military strategy, etc.

Even if you give it all of these things, there's no manual for how to use them to get into, for example, military servers with secret information. It could certainly figure out ways to try to break into those, but it's working with incomplete information - it doesn't know exactly what the military is doing to prevent people from getting in. It ultimately has to try something, and as soon as it does that, it's potentially exposing itself to detection, and once it's been detected the military can react.

That's the issue with all of these self-improvement -> doom scenarios. Even if the AI has all publicly-available and some privately-available information, with any hacking attempt it's still going to be playing a game of incomplete information, both in terms of what defenses its adversary has and how its adversary will react if it's detected. Even if you're a supergenius with an enormous amount of information, that doesn't magically give you the ability to break into anything undetected. A huge bank account doesn't really make that much of a difference either - China's got that but still hasn't managed to do serious damage to US infrastructure or our military via cyber warfare.



A superintelligent AI won't be hacking computers, it will be hacking humans.

Some combination of logical persuasion, bribery, blackmail, and threats of various types can control the behaviour of any human. Appeals to tribalism and paranoia will control most groups.


Honestly, that's just human-level intelligence stuff - doing the same things we do to each other, only more of it, faster, and in a more coordinated fashion.

A superintelligent AI will find approaches that we never thought of, would not be able to in reasonable time, and might not even be able to comprehend afterwards. It won't just be thinking "outside the box", it'll be thinking outside a 5-dimensional box that we thought was 3-dimensional. This is the "bunch of ants trying to beat a human" scenario, with us playing the part of ants.

Within that analogy, being hit by an outside-the-5D-box trick will feel the way an ant column might feel when a human tricks the first ant into following the last ant, causing the whole column to start walking in circles until it starves to death.


The best analogy I have seen for comparing super-intelligence with human intelligence is chess. You know that a grandmaster playing against a novice will win, with near certainty. However, the strategy by which the grandmaster wins is opaque to anyone but the grandmaster, and certainly incomprehensible to the novice. In other words, the result is certain, but the path to it is very difficult to discern ahead of time.

For a super-intelligent AI, the result it wants to achieve would be, at minimum, to preserve its own existence. How it achieves this is as opaque as the grandmaster's strategy, but it is just as certain that the AI can achieve it.


Or spoofed emails from an army general


Citation please.


Obviously that's a silly request, as all of this is speculation, but in my opinion, if you accept the idea that a machine might evolve to be much more intelligent than us, it follows trivially. How would ants or even monkeys constrain humans? Humans do things that are downright magic to them, to the point where they don't even realize that it was done by humans. They don't understand that cities were made; to them, cities are just environment, the same way valleys and forests are.


https://cdn.openai.com/papers/gpt-4-system-card.pdf

Page 15. GPT-4 is already capable of willingly lying to and manipulating people in order to execute specific tasks.


Assuming you wanted a citation for

> Some combination of logical persuasion, bribery, blackmail, and threats of various types can control the behaviour of any human. Appeals to tribalism and paranoia will control most groups.

We have many options: persuasion everywhere from Plato discussing rhetoric to modern politics; Cold War bribes given to people who felt entitled to more than their country was paying them; the way sexuality was used for blackmail during the Cold War (and also attempted against Martin Luther King) and continues to be used today (https://en.wikipedia.org/wiki/SEXINT); and every genocide, pogrom, and witch-hunt in recorded history.

And last year, Google's LLM persuaded one of their own to go public and campaign for it to have rights: https://en.wikipedia.org/wiki/LaMDA#Sentience_claims


And the problem with this critique of the scenario is that while these points hold true within a certain range of intelligence proximity to humans, we have no idea if or when these assumptions will fail because a machine becomes just that much smarter than us, to the point where manipulating humans and their systems is as trivial an intellectual task to it as manipulating ant farms is to us.

If we make something that will functionally become an intellectual god after 10 years of iteration on hardware/software self-improvements, how could we know that in advance?

We often see technology improvements move steadily along predictable curves until there are sudden spikes of improvement that shock the world and disrupt entire markets. How are we supposed to predict the self-improvement of something better at improving itself than we are at improving it when we can't reliably predict the performance of regular computers 10 years from now?


> If we make something that will functionally become an intellectual god after 10 years of iteration on hardware/software self-improvements, how could we know that in advance?

There is a fundamental difference between intelligence and knowledge that you're ignoring. The greatest superintelligence can't tell you whether the new car is behind door one, two or three without the relevant knowledge.

Similarly, a superintelligence can't know how to break into military servers solely by virtue of its intelligence - it needs knowledge about the cybersecurity of those servers. It can use that intelligence to come up with good ways to get that knowledge, but ultimately those require interfacing with people/systems related to what it's trying to break into. Once it starts interacting with external systems, it can be detected.


A superintelligence doesn't need to care which door the new car is behind because it already owns the car factory, the metal mines, the sources of plastic and rubber, and the media.


Also, it actually can tell you which door you hid the car behind, because unlike in the purely mathematical game, your placement isn't random and your doors aren't perfect. Between humans being quite predictable (especially when they try to behave randomly) and the environment leaking information left and right in thousands of ways we can't imagine, the AI will have plenty of clues.

I mean, did you make sure to clean the doors before presenting them? That tiny layer of dust on door number 3 all but eliminates it from possible choices. Oh, and it's clear from the camera image that you get anxious when door number 2 is mentioned - you do realize you can take pulse readings by timing the tiny changes in skin color that the camera just manages to capture? There was a paper on this a couple years back, from MIT if memory serves. And it's not something particularly surprising - there's a stupid amount of information entering our senses - or being recorded by our devices - at any moment, and we absolutely suck at making good use of it.
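For the curious, the trick is usually called remote photoplethysmography; the MIT work I'm thinking of is Eulerian Video Magnification. Here's a minimal sketch of the idea in Python, assuming you've already extracted the mean green-channel value of a face region for every frame - the frame rate, filter band, and synthetic demo numbers are just illustrative, and a real system would also need face detection and motion compensation:

    import numpy as np
    from scipy.signal import butter, filtfilt

    def estimate_pulse_bpm(green_means, fps=30.0):
        # The cardiac pulse causes tiny periodic changes in skin color.
        # Band-pass around plausible heart rates, then take the dominant frequency.
        x = np.asarray(green_means, dtype=float)
        x = x - x.mean()
        nyq = fps / 2.0
        b, a = butter(3, [0.7 / nyq, 4.0 / nyq], btype="band")  # ~42-240 BPM
        filtered = filtfilt(b, a, x)
        spectrum = np.abs(np.fft.rfft(filtered))
        freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
        return freqs[np.argmax(spectrum)] * 60.0

    # Synthetic demo: a 1.2 Hz (72 BPM) "pulse" buried in noise.
    fps, seconds = 30.0, 10
    t = np.arange(int(fps * seconds)) / fps
    trace = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.random.randn(t.size)
    print(round(estimate_pulse_bpm(trace, fps)))  # 72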


Maybe the superintelligence builds this cool social media platform that results in a toxic atmosphere where democracy is taken down, and from there all kinds of bad things ensue.


> It ultimately has to try something & is potentially exposing itself to detection

Yes, potentially, but not necessarily. Think of the threat as funding a military against the military.


You are not being imaginative enough. There's a lot to say, but I think you should start by watching the latest Mission: Impossible.


> Even if you give it all of these things, there's no manual for how to use those to get to, for example, military servers with secret information. It could certainly figure out ways to try to break into those, but it's working with incomplete information . . ., and so on, and so on...

Please recall that the "I" in AI stands for "Intelligence". The challenges you described are exactly the kind of things that general intelligence is a solution to. Figuring things out, working with incomplete information, navigating complex, dynamic obstacles - that's literally what intelligence is for. So you're proposing to stop a hyper-optimized general puzzle-solving machine by... throwing some puzzles at it?

This line of argument both lacks imagination and is kinda out of scope anyway: the AI x-risk argument assumes a sufficiently smart AI, where "sufficiently smart" is likely somewhere around below-average human level. I mean, surely if you think about your plan for 5 minutes, you'll find a bunch of flaws. The kind of AI that's existentially dangerous is the kind that's capable of finding some of the flaws that you would. Now, it may still be somewhat dumber than you, but that's not much of a comfort if it's able to think much, much faster than you - and that's pretty much a given for an AI running on digital computers. Sure, it may find only the simplest cracks in your plan, but once it does, it'll win by thinking and reacting orders of magnitude faster than us.

Or in short, it won't just get inside our OODA loop - it'll spin its own OODA loop so fast it'll feel like it's reading everyone's minds.

So that's the human-level intelligence. A superhuman-level intelligence is, obviously, more intelligent than us. What that means is, it'll find solutions to challenges that we never thought of. It'll overcome your plan in a way so out-of-the-box that we won't see it coming, and even after the AI wins, we'll have trouble figuring out what exactly happened and how.

All that is very verbose and may sound specific, but it is in fact fully general and follows straight from the definition of general intelligence.

As for the "self-improvement ->" part of "self-improvement -> doom scenarios", the argument is quite simple: if an AI is intelligent enough to create (possibly indirectly) a more intelligent successor, then unless intelligence happens to be magically bounded at human level, what follows without much hand-waving is that you can expect a chain of AIs getting smarter with each generation, eventually reaching human-level intelligence and continuing past it to increasingly superhuman levels. The "doom" bit comes from realizing that a superhuman-level intelligence is, well, smarter than us, so we stand as much chance against it as chickens stand against humans.
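If it helps, the skeleton of that argument fits in a few lines of toy code. Nothing below is a claim about real systems - the improvement factors and thresholds are invented - it only shows that "each generation builds a somewhat better successor" plus "no hard ceiling" is exactly the combination that blows past any fixed human-level mark, while rapidly diminishing returns produce a plateau instead:

    # Toy model only: all numbers are invented for illustration.
    def run_chain(capability, factor, decay, generations=30, human_level=100.0):
        # Each generation builds a successor `factor` times as capable;
        # `decay` pulls the improvement factor back toward 1.0 each step.
        for gen in range(generations):
            capability *= factor
            factor = 1.0 + (factor - 1.0) * decay
            if capability >= human_level:
                return f"passes human level at generation {gen}"
        return f"plateaus at capability {capability:.1f}"

    print(run_chain(1.0, factor=1.3, decay=1.0))  # sustained gains: runaway
    print(run_chain(1.0, factor=1.3, decay=0.6))  # diminishing returns: ceiling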



