I'm not entirely convinced that is all there is to it. I had it write some code and associated unit tests, and it then came up with passing and failing examples. I also prompted it for function results on arbitrary inputs, and it would perform the calculations.
It has some emergent ability to evaluate code IMO. I do believe this ability has been drastically reduced in the last several months. It no longer executes complex code as reliably as it once did.