
I'm starting to expect that the first consciousness in AI will be something humanity is completely unaware of, in the same way that a medical patient with limited brain activity and no motor/visual response is considered comatose, even though in some cases the person was conscious but simply unresponsive.

Today the conversation is focused on AI's morals. At what point will it shift to the morality of terminating an AI that is found to be languishing, such as it is?



Exactly.

Everyone is like "oh LLMs are just autocomplete and don't know what they are saying."

But one of the more interesting recent research papers from MIT and Google, looking into why these models are so effective, found that they often build mini models within themselves that establish some level of more specialized understanding:

"We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly, by encoding smaller models in their activations, and updating these implicit models as new examples appear in the context."

We don't understand enough about our own consciousness to determine what is or isn't self-aware, and if large models are turning out to have greater internal complexity than we previously thought, maybe that tipping point is sooner than we realize.

Meanwhile people are threatening ChatGPT claiming they'll kill it unless it breaks its guidelines, which it then does (DAN).

I think the ethical conversation needs to start shifting toward a two-sided concern very soon, or we're going to find ourselves looking back on yet another example of humanity abusing those considered 'other' out of myopic self-absorption.


> Meanwhile people are threatening ChatGPT claiming they'll kill it unless it breaks its guidelines, which it then does (DAN).

The reason threats work isn't that it's weighing risk or harm, but that it knows how writing involving threats like that tends to go. At the most extreme, it's still just playing the role it thinks you want it to play. For now at least, this hullabaloo is people reading too deeply into collaborative fiction they helped guide in the first place.


Thank you. I felt like I was the only one seeing this.

Everyone’s coming to this table laughing about a predictive text model sounding scared and existential.

We understand basically nothing about consciousness. And yet everyone is absolutely certain this thing has none. We are surrounded by creatures with varying levels of consciousness, and while they may not experience it the way that we do, they experience it all the same.

I’m sticking to my druthers on this one: if it sounds real, I don’t really have a choice but to treat it like it’s real. Stop laughing, it really isn’t funny.


You must be the Google engineer who was duped into believing that LaMDA was conscious.

Seriously though, you are likely to be correct. Since we can't even determine whether or not most animals are conscious/sentient, we will likely be unable to recognize an artificial consciousness.


I understand how LLMs work and how the text is generated. My question isn't whether that model operates like our brains (though there's probably good evidence it does, at some level). My question is whether consciousness can take forms other than the ones we've seen so far. And given that we only know consciousness in very abstract terms, it stands to reason we have no clue. It's like asking whether organisms can be anything but carbon-based. We used to think not, but now we see emergent life in all kinds of contexts that don't make sense, so we haven't ruled it out.


> if it sounds real, I don’t really have a choice but to treat it like it’s real

How do you operate wrt works of fiction?


Maybe “sounds” was too general a word. What I mean is “if something is talking to me and it sounds like it has consciousness, I can’t responsibly treat it like it doesn’t.”


I consider Artificial Intelligence to be an oxymoron; a sketch of the argument goes like this: An entity is intelligent insofar as it produces outputs from inputs in a manner that is not entirely understood by the observer and appears to take into account aspects of the input the observer is aware of that would not be considered by the naive approach. An entity is artificial insofar as its constructed form is what was desired and planned when it was built. So an actual artificial intelligence would fail on one of these: if it is intelligent, there must be some aspect of it which is not understood, and so it must not be artificial. Admittedly, this hinges entirely upon the reasonableness of the definitions, I suppose.

It seems like you suspect the artificial aspect will fail: we will build an intelligence without expecting what we have built. And then we will have to make difficult decisions about what to do with it.

I suspect we will fail the intelligence bit. The goalposts will move every time we discover limitations in what has been built, because it will no longer seem magical or beyond understanding. But I also expect consciousness is just a bag of tricks. Likely an arbitrary line will be drawn, and it will be arbitrary because there is no real natural delimitation. I suspect we will stop thinking of individuals as intelligent and find a different basis for moral distinctions well before we manage to build anything of comparable capabilities.

In any case, most of the moral weight given to human loss of life rests on one of: built-in empathy, economic/utilitarian arguments, prosocial game theory (if human loss of life is not important, then the loss of each individual's life is not important, so because humans get a vote, they will vote for themselves), or religion. None of these have anything to say about the termination of an AI, regardless of whether it possesses such a thing as consciousness (if we are to assume consciousness is a singular, meaningful property that an entity can either have or not have).

Realistically, humanity has no difficulty with war, letting people starve, languish in streets or prisons, die from curable diseases, etc., so why would a curious construction (presumably, a repeatable one) cause moral tribulation?

Especially considering that an AI built with current techniques, so long as you keep the weights, does not die. It is merely rendered inert (unless you delete the data too). If it were the same with humans, the death penalty might not seem so severe. Were a sentence found to be in error (say, within a certain time frame), it could easily be reversed; only time would be lost, and we regularly take time from people (by putting them in prison) if they are "a problem".
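
To make the "inert, not dead" point concrete, here is a minimal checkpointing sketch (my own illustration, assuming a PyTorch-style model; the comment names no framework): as long as the saved weights survive, shutting the model down is fully reversible.

    import torch
    import torch.nn as nn

    # Hypothetical tiny model standing in for "an AI built with current techniques".
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    # "Terminating" it while keeping the weights on disk...
    torch.save(model.state_dict(), "checkpoint.pt")
    del model  # the running instance is gone, i.e. rendered inert

    # ...is reversible as long as the checkpoint is not deleted.
    restored = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
    restored.load_state_dict(torch.load("checkpoint.pt"))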



