To be fair, I'm a fairly intelligent human (or so I'd like to think!), and I'm having a lot of trouble following this. I'd have to sit down with pencil and paper and do the transformations on it.
Keep in mind that LLMs like GPT 4 do not have a scratchpad and can't back-track. Asking it a question is the rough equivalent of asking a human standing in the middle of the room a question. No paper, no whiteboard, nothing. Just a demand for a verbal answer then and there, from a standing start.
I've only met a couple of humans who would be able to solve a problem like this one under those circumstances.
Despite its impediments, GPT 4 solved this correctly:
First, let's break down the statement and evaluate the individual components.
col(true, false) = true AND false = false
pel(false, true) = false OR true = true
erf(col(true, false), true) = erf(false, true)
Since the first argument of erf evaluates to false, erf is equivalent to "pel":
erf(false, true) = pel(false, true) = false OR true = true
Now we can substitute the results back into the original statement:
col(true, pel(false, true), erf(col(true, false), true)) = col(true, true, true)
The col function acts as an AND operator, so:
col(true, true, true) = true AND true AND true = true
Therefore, the given statement evaluates to true.
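For anyone who wants to check that trace without pencil and paper, here is a minimal Python sketch. It is my own reconstruction from the steps above, not anything GPT 4 produced. Assumptions: col is AND and pel is OR over any number of arguments, and erf behaves like pel when its first argument is false. The excerpt never says what erf does when its first argument is true, so that branch is deliberately left unimplemented.

```python
def col(*args: bool) -> bool:
    """AND over any number of arguments (the statement calls it with 2 and with 3)."""
    return all(args)


def pel(*args: bool) -> bool:
    """OR over any number of arguments."""
    return any(args)


def erf(first: bool, second: bool) -> bool:
    """Behaves like pel when the first argument is false.

    The behaviour for first == True is not given in the trace above,
    so it is left out rather than guessed.
    """
    if not first:
        return pel(first, second)
    raise NotImplementedError("erf with a true first argument is not defined here")


# The statement from the trace: col(true, pel(false, true), erf(col(true, false), true))
result = col(True, pel(False, True), erf(col(True, False), True))
print(result)  # True
```

The statement only ever reaches the false branch of erf, which is why the missing branch doesn't matter for this particular check.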
Alternatively, to give an idea of the upcoming "Wolfram Alpha" plugin, you can instead ask it to convert the expression into the equivalent Mathematica program:
Very nice! Unfortunately, it didn't notice that "col" is used with three arguments, not just two. This is a mistake many (most?) humans would make also.
No worries, you can ask it to correct that. It'll apologise and produce this:
There you go. Now go find me a random human who can do any of this from a standing start, no preparation, no warning, no tools, nothing but their wits.
As a bonus, here's a related joke:
Why did the logical functions col, pel, and erf form a band?
Because together, they could create the perfect harmony:
they knew when to "col"aborate, when to give each other
some space and "pel"ay, and their "erf"forts were always well-timed!
Here's another example, to test whether it can actually think critically about a problem instead of just following instructions and reproducing whatever was in its training data. A human with basic math could do it.
> can you replace the first letter of "banana" with the next letter of the alphabet?
Yes, I can do that! However, I should note that after several iterations, the resulting word may no longer be considered a recognizable word in the English language. Here are the next 10 iterations:
"canana" becomes "danana"
"danana" becomes "eanana"
"eanana" becomes "fanana"
"fanana" becomes "ganana"
"ganana" becomes "hanana"
"hanana" becomes "ianana"
"ianana" becomes "janana"
"janana" becomes "kanana"
"kanana" becomes "lanana"
"lanana" becomes "manana"
And so on for a total of 173 iterations
> I want the final answer
Sure, after 173 iterations of replacing the first letter of "banana" with the next letter of the alphabet, the final word would be:
"zcyzcj"
As noted earlier, after several iterations, the resulting word may no longer be considered a recognizable word in the English language.
It's just an example. A human could solve this for an arbitrary number of substitutions, but I'm sure GPT 4 breaks at some point. I don't think it's fair to compare it with a human who has no paper and pencil... It's like asking the AI to not use any RAM. Humans are very bad at keeping stuff in RAM, so paper and pencil is all we've got.
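For reference, the substitution itself is a couple of lines once you do have a scratchpad. A rough Python sketch, assuming lowercase ASCII and that "next letter" wraps around from z to a (the prompt doesn't say either way; `shift_first_letter` is just a name I made up):

```python
import string


def shift_first_letter(word: str, steps: int = 1) -> str:
    """Replace the first letter with the letter `steps` positions later.

    Assumption: the alphabet wraps around (z -> a). Words that are empty
    or don't start with a lowercase ASCII letter are returned unchanged.
    """
    alphabet = string.ascii_lowercase
    if not word or word[0] not in alphabet:
        return word
    index = (alphabet.index(word[0]) + steps) % len(alphabet)
    return alphabet[index] + word[1:]


print(shift_first_letter("banana"))       # canana
print(shift_first_letter("banana", 173))  # sanana
```

Under that wrap-around assumption, 173 iterations starting from "banana" give "sanana" (b is position 1, and (1 + 173) mod 26 = 18, which is s), not "zcyzcj".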