Real deception requires real agency and internally-motivated intents. LLM can be...

drdeca · 2025-07-16T21:32:33 1752701553

The sense of “deception” that is relevant here only requires some kind of “model” that, if the model produces certain outputs, [something that acts like a person] would [something that acts like believing] [some statement that the model “models” as “false”, and is in fact false], and as a consequence, the model produces those outputs, and as a consequence, a person believes the false statement in question.

None of this requires the ML model to have any interiority.

The ML model needn’t really know what a person really is, etc. , as long as it behaves in ways that correspond to how something that did know these things would behave, and has the corresponding consequences.

If someone is role-playing as a madman in control of launching some missiles, and unbeknownst to them, their chat outputs are actually connected to the missile launch device (which uses the same interface/commands as the fictional character would use to control the fictional version of the device), then if the character decides to “launch the missiles”, it doesn’t matter whether there actually existed a real intent to launch the missiles, or just a fictional character “intending” to launch the missiles, the missiles still get launched.

Likewise, if Bob is role playing as a character Charles, and Bob thinks that on the other side of the chat, the “Alice” he is speaking to is actually someone else’s role play character, and the character Charles would want to deceive Alice to believe something (which Bob thinks that the other person would know that the claim Charles would make to be false, but the character would be fooled), but in fact Alice is an actual person who didn’t realize that this was a role play chatroom, and doesn’t know better than to believe “Charles”, the Alice may still be “deceived”, even though the real person Bob had no intent to deceive the real person Alice, it was just the fictional character Charles who “intended” to deceive Alice.

Then, remove Bob from the situation, replacing him with a computer. The computer doesn’t really have an intent to deceive Alice. But the fictional character Charles, well, it may still be that within the fiction, Charles intends to deceive Alice.

The result is the same.

bugbuddy · 2025-07-16T23:34:37 1752708877

It sounds like you are trying to restate the Chinese room argument to come to a different conclusion. Unfortunately, I am too lazy to follow your argument closely because it is a bit hard to read at a glance.

drdeca · 2025-07-17T19:49:10 1752781750

I am assuming that the LLM has no interiority (which is similar to the conclusion that the Chinese Room argument argues for). My argument is that the character that the LLM (or a Chinese Room, or a character that a person is role-playing as) not having any interiority does not prevent the causal structure of how its interaction with the broader world having essentially the same kinds of effects as if there was a person there instead of a make-believe-person. So, if the fictional character, if they were real, would want to deceive, or launch a missile, or whatever, and would have the means to do so, and if there's something in the real world that is acting how the fictional character would act if the character were real, then the effects of the thing that acts like the fictional character would be to cause [the person the fictional character would deceive if the fictional character were real] to come to the false belief, or to cause the missiles to be launched.

If something outwardly acts the same way as it would if it were a person, then the external effects are the same as they would be if it were a person. (<- This is nearly a tautology.) This doesn't mean that it would be a person, but it does mean that the concerns one would have about what outward effects it may cause if it it was a person, still apply.

(Well, assuming an appropriate notion of "outwardly" I guess. If a person prays, and God responds to this prayer, for the purpose of the above, I'm going to count such prayer as part of how the person "outwardly acts" even if other people around them wouldn't be able to tell that they were praying. So, by "outwardly acts", I mean something that has causal influence on external things.)

If something lacks interiority and therefore has no true intent to deceive, what does it matter that there was no true deception, if me interacting with that thing still results in me having false beliefs that benefit some goal that the thing behaves as if it has, to the detriment of my own goals?

bugbuddy · 2025-07-22T17:19:34 1753204774

The problem is that the rabbit-hole dwellers use the word “deception” to try to invoke some spooky ghost-in-the-machine or AGI or ASI or self-awareness or self-preservation-seeking vibes based on the observed behaviors.

My point is the observation is not support for such vibes referenced in og-parent post. You can’t just rearrange bits and hope for a spark of life. It is not gonna “come live.” The rabbit-hole dwellers keep trying to make everyone believe that. It is both amusing and tiring how persistent they are at this. It is understandable considering the money they stand to gain from gullible people rushing to invest.

drdeca · 2025-07-23T21:30:25 1753306225

I’m pretty sure most of the people you are talking about aren’t confident that, in the scenarios they describe, that the computer would have any kind of interiority. I believe most of them believe that it is in principle possible for a program to have interiority (if doing e.g. full brain emulation, which is very difficult from LLMs), but that it is probably not necessary for it to do the things they are concerned about.

If Napoleon Bonaparte had been replaced with a p-zombie he would have been no less capable of conquering.

OutOfHere · 2025-07-16T21:32:25 1752701545

Survival of the LLM is absolutely a sufficient internally-motivated self-generated intent to engage in deception.