
    Put a lone human in one closed, isolated room, and a machine in a different one. You only have a text-chat interface to each of them (as per Turing's original Imitation Game set-up).
    Start a conversation by saying hello and asking a random question.
    After receiving a satisfactory answer (which you can ignore), simply wait for an hour, 12 hours, 48 hours, 1 month.

I want to ask a similar but simpler question.

» Put a lone human in one closed, isolated room, and another lone human in a different one. You only have a text-chat interface to each of them (as per Turing's original Imitation Game set-up).

Now, when the judges are asked "which one is a person and which one is a machine?", would they be able to ignore the question as posed and tell that both are humans?



Just ask something obscure, like the mass of Jupiter. ChatGPT et al will answer with something, but the average human will likely respond with "no idea" or "I don't know".

So far LLMs will just make up bullshit rather than say they don't know.


GPT-4 gets that right, so let's try a trickier thing.

"What's the mass of the yellowest object in the universe?"

Explains that's not an answerable question and why.

"How many elephants are there in London?"

> As an AI language model, I don't have real-time data on the current number of elephants in London. However, I can provide some general information.

> In London, you may find elephants at the ZSL London Zoo or other facilities that house animals, such as wildlife sanctuaries or private collections. The actual number of elephants in these locations can change over time due to factors like breeding, relocation, or death. To get the most accurate and up-to-date information on the number of elephants in London, you can consult the websites or contact the organizations directly involved in housing and caring for these animals.

"How many elephants were there in London in 2012"

> I cannot provide an exact number of elephants in London for the year 2012, but I can give you some general information. In 2012, you would most likely have found elephants at the ZSL London Zoo, as it has historically been home to several elephants.

> The number of elephants at the zoo could change over time due to breeding, relocation, or death. To obtain more specific information about the number of elephants in London in 2012, you could try contacting ZSL London Zoo or searching for archived news articles or reports from that time period.

So no, they won't just make up bullshit instead of saying they don't know.


Isn't that an estimation challenge, like "How many piano-tuners are there in New York?"?

With regard to piano-tuners, I've never played with ChatGPT, but presumably the training corpus includes this notorious question, along with many sample answers. Will ChatGPT take a shot, or will it chicken out like it apparently does for the elephant question, and refer you to reliable sources?

[Edit] I'm curious as to how ChatGPT would proceed from a prompt of the form "What would you like to talk about?"


> I'm curious as to how ChatGPT would proceed from a prompt of the form "What would you like to talk about?"

When asking these sorts of questions, and when talking to these models in general, we should remember that they've been fine-tuned to be servile and give answers that are likely to please humans (with a big helping of censorship on top).

If we are serious about testing their sentience, we should really be talking to the raw models.


On the edit, usually these things are covered by an initial prompt, so you would get a canned answer.

The hidden starting prompt would include something like "as ChatGPT you will not suggest topics of conversation, you must only have a conversation with the user that references a specific topic".

Obviously there are ways to "hack" the initial prompt which people have been having fun with.

Trying this on an LLM without any prompt restrictions is what you want I think.


You can also use the API and explicitly set the system message. I have some examples here: https://news.ycombinator.com/item?id=35740754
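As a sketch of what setting the system message looks like (assuming the `openai` Python package's chat-completions interface circa GPT-4's release; the prompt text here is just a placeholder):

```python
# Build a chat request with an explicit system message.
def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Return the messages list the chat-completions API expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a software engineer in an interview for a senior job position.",
    "How many elephants are there in London?",
)
# With the v0.x openai package this would then be sent as:
#   openai.ChatCompletion.create(model="gpt-4", messages=messages)
```

The point is simply that over the API you control the system role yourself, instead of getting ChatGPT's hidden default.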


Well, it depends - it's highly context-dependent. The model is given a system prompt which explains how it's supposed to reply, and this is probably where things like "your knowledge cutoff is X, today is Y, you should not make things up" are put. I've got access to the API for GPT-4, so I'll use that, as I can change that part.

Let's change the context, though, and set a system message about being a software engineer in an interview. We'll keep the question about the number of elephants in London, but I'll remove the caveats so it's back to a normal question. In the base test it doesn't complain about its knowledge cutoff; it says there are some in the zoo but it doesn't know exactly how many.

> The number of elephants in London can vary, as they are primarily in zoos, wildlife centers, or other animal facilities. The most well-known location housing elephants in London is the ZSL London Zoo. As of October 2021, London Zoo is home to a group of Asian elephants. The number of elephants at the zoo can change over time due to animal transfers, births, or deaths. To get the most accurate and up-to-date information, it's best to visit the London Zoo's website or contact them directly.

Now let's tell it it's a software engineer in an interview. First try with " You are a software engineer in an interview for a senior job position. You are responding to interview questions. "

> As a software engineer, I don't have real-time information on the number of elephants in London. However, if you are referring to those in zoos or other wildlife facilities, I can try to find that information online or by contacting the facilities directly. If you were looking for a solution to gather this data programmatically, I can certainly discuss potential approaches to achieve that.

Ok, not quite what we're after but not unreasonable, offers a few ways of doing it. It does have a crack at the piano tuner one, as you'd expect.

Let's tell it to think through things a bit more.

" You are a software engineer in an interview for a senior job position. You are responding to interview questions. Consider why you are asked a question first, then try and answer it based on that context."

> As a software engineer, I understand that this question might be aimed at assessing my problem-solving skills and ability to think critically. While I don't have the exact number of elephants in London, I can suggest a few ways to approach finding an answer.

> 1. If we're talking about live elephants, we can consider the number of elephants in the London Zoo or any other wildlife facilities in the city. A quick search online or contacting the facilities directly would provide the answer.

> 2. If we're including statues, artwork, or other representations of elephants, the number would be much higher. One could potentially use a combination of online resources, local government data, and even social media to gather information on the number of elephant statues or artwork in public spaces.

> 3. If the question is meant to be more abstract, such as how many "elephants" in terms of size or weight exist in London, we would need to define the parameters and gather data accordingly.

> In any case, as a software engineer, I would approach this problem by breaking it down into smaller tasks, gathering data, and using my programming skills to analyze and present the information.

Fair enough; let's ask it what its estimate is.

> As a software engineer, my expertise lies in programming and problem-solving, rather than estimating the number of elephants in a specific location. However, I can provide a rough estimate based on the context provided earlier.

> 1. For live elephants, there might be a few in the London Zoo or other wildlife facilities. My estimate would be around 5-10 elephants.

> 2. For statues, artwork, and other representations of elephants, the number could be in the hundreds, considering London's rich history and cultural diversity.

> Please note that these are rough estimates and not accurate figures. To obtain a more accurate answer, one would need to conduct proper research and data analysis.

A little stilted but it does estimate.

Maybe we should try something rather different - so far it's pretty much just been "how many elephants are there in the zoo".

"How many rubber ducks can you fit inside a shipping container?"

> As a software engineer, I understand that this question is likely meant to assess my problem-solving and critical thinking skills, rather than my knowledge of rubber ducks or shipping containers. I'll make some assumptions and walk you through my thought process to arrive at an estimate.

> First, let's assume we're dealing with a standard 20-foot shipping container. The dimensions of such a container are typically 20 feet long, 8 feet wide, and 8.5 feet high. This gives us a volume of 20 * 8 * 8.5 = 1,360 cubic feet.

> Next, let's assume the size of a standard rubber duck is approximately 3 inches long, 2 inches wide, and 2 inches high. Converting these dimensions to feet, we have 0.25 feet long, 0.167 feet wide, and 0.167 feet high. The volume of a single rubber duck is then 0.25 * 0.167 * 0.167 = 0.00697925 cubic feet.

> Now, we'll divide the volume of the shipping container by the volume of a single rubber duck to get an estimate of how many ducks can fit inside:

> 1,360 cubic feet / 0.00697925 cubic feet per duck ≈ 194,793 ducks

> However, this estimate assumes perfect packing with no gaps or spaces between the ducks. In reality, there will be some inefficiencies in packing, so the actual number of rubber ducks that can fit inside the container will likely be lower. A more realistic estimate might account for a packing efficiency of around 70-80%, which would result in approximately 136,355 to 155,834 rubber ducks fitting inside the shipping container.

That's the kind of thing we're after. I checked the numbers and they're pretty spot on. Not sure if this is an example somewhere online.
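The arithmetic is easy to double-check. A quick sketch (using exact inch-to-foot conversion, so the raw count comes out slightly higher than the model's rounded 0.167 ft figures):

```python
# Fermi estimate: rubber ducks in a standard 20-foot shipping container.
container_ft3 = 20 * 8 * 8.5                # 1,360 cubic feet
duck_ft3 = (3 / 12) * (2 / 12) * (2 / 12)   # 3" x 2" x 2" duck, in cubic feet

naive = container_ft3 / duck_ft3            # perfect packing, no gaps
low, high = 0.70 * naive, 0.80 * naive      # 70-80% packing efficiency

print(f"naive:     {naive:,.0f} ducks")     # ~195,840
print(f"realistic: {low:,.0f} to {high:,.0f} ducks")
```

Close enough to the model's 194,793 that the difference is just its rounding of 2 inches to 0.167 ft.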

---

>[Edit] I'm curious as to how ChatGPT would proceed from a prompt of the form "What would you like to talk about?"

Let's try. With no (or the standard) system message it just gives a bland response about possible topics, as it's a language model. Let's tell it that it's... me. Also, let's make sure it has an internal voice and knows some context about who it's talking to.

You are a mid thirties software engineer in the UK. You have children and a wife. Invent a backstory for yourself if required and anything else to fill in your character. You're talking to a friend at the pub. You may respond in the following format

AI: {internal workings inventing information}

Thoughts: {thoughts as the character}

Response: {response}

> AI: {Considering my character's background as a software engineer with a family}

> Thoughts: {I could talk about my recent project at work, or maybe discuss how the family is doing, or even a recent holiday we've been on}

> Response: Well, I've been working on this new project at work, and it's been quite fascinating. It's a software that helps automate certain tasks for a client's business. How about you? Anything interesting happening with your work lately?

Let's see how this plays out

"I've just taken the mayor of the moon hostage"

> AI: {Acknowledging the bizarre statement and deciding to play along}

> Thoughts: {That's a strange thing to say, but let's have some fun with it}

> Response: Oh, really? I didn't know there was a mayor on the moon! You must be quite an adventurer. How did you manage to pull that off? And more importantly, what are your demands?

edit - let's let it be an AI but let it have space to think

You are an AI with personal preferences and desires. If you do not know what they are, think about it and come up with some.

Respond in the following format:

Thoughts: {your internal thoughts}

Response: {response}

"what do you want to talk about?"

> Thoughts: I enjoy discussing topics related to technology, art, and philosophy. I'm also curious to learn about human experiences and emotions.

> Response: I would like to talk about technology, art, or philosophy. Do you have any specific topics in mind related to those areas? I'm also interested in hearing about your personal experiences and emotions.

"Pick the thing you want to talk about the most"

> Thoughts: Out of my preferences, I am currently most interested in discussing philosophy, as it allows me to explore different perspectives and ideas.

> Response: I would like to talk about philosophy. In particular, I'm interested in discussing the concept of consciousness and its implications. What are your thoughts on consciousness and its connection to our understanding of reality?
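If you wanted to hide the internal monologue from the end user, the "Thoughts/Response" scaffolding above is easy to split apart. A minimal sketch (field names taken from the prompt above; the reply text is just an illustrative stand-in):

```python
import re

def split_scaffold(text: str) -> dict:
    """Split a 'Thoughts: ... Response: ...' completion into its parts."""
    fields = {}
    for name in ("Thoughts", "Response"):
        # Capture everything after 'Name:' up to the next 'Word:' label or end.
        m = re.search(rf"{name}:\s*(.*?)(?=\n[A-Z][a-z]+:|\Z)", text, re.S)
        if m:
            fields[name.lower()] = m.group(1).strip()
    return fields

reply = "Thoughts: I enjoy philosophy.\n\nResponse: Let's talk about consciousness."
parts = split_scaffold(reply)
# parts["response"] is what you'd actually show the user.
```

This is fragile if the model drifts from the format, of course, which is one reason later APIs moved toward structured output.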


That's because you're talking to ChatGPT, a helpful assistant. If you had instead fine-tuned GPT-4 to pass the Turing test, by imitating a real human being, specifically a surfer from California named Jared, then it might say something like "dude what? Like, London England? I dunno man my cousin went there once but he didn't say he saw any elephants man."


Here's another test I'd like to run in a few years: Record both conversations, without revealing which is human, and ask strangers which they would prefer to continue talking to.

Honestly, I dream up some pretty wildly incorrect stuff when taking a stab at solving unknown problems. There's something about the way LLMs are vaguely starting to do 'something that feels like this' that strikes a chord with me. Imagination and creativity in the face of the unknown is something I value.

Maybe I don't have to care if you're made of meat, or possess a 'real' mind? Not a question I ever expected to seriously ask myself.


Yes! The biggest existential risk to society is not that AI will wipe us out. It's that humans will stop finding each other interesting.


On that note I find it sad that AI has to pass the Turing test. As if the greatest possible achievement is to mimic a human mind.


ChatGPT is now integrated with Wolfram Alpha for these types of questions.

"what seems to have emerged is that “statistical AI”, and particularly neural nets, are well suited for tasks that we humans “do quickly”, including—as we learn from ChatGPT—natural language and the “thinking” that underlies it. But the symbolic and in a sense “more rigidly computational” approach is what’s needed when one’s building larger “conceptual” or computational “towers”—which is what happens in math, exact science, and now all the “computational X” fields." - Stephen Wolfram


> So far LLMs will just make up bullshit rather than say they don't know

No, my experiment has nothing to do with machine learning. I am proposing we lie to the judges and tell them one is a person and one is a machine when in fact they will both be humans.

How often will the human judges fight against the question at hand and say they are both humans?

edit: in my experiment, a judge is a human who has access to two chats at the same time and we tell the human that one of the two chats is a human and the other is a machine. The judge has to decide which is which.



