
I don't think this paper dismisses the importance of correct yes/no tests, or of reaching an accuracy threshold that makes a model generally useful to humans. It argues that you should use more than correct yes/no tests before declaring some behavior emergent.


The paper is shifting the goalposts by measuring something else. At the level of yes/no, the behavior is emergent.


Is the binary distinction helpful? It doesn't appear to be a more helpful way of evaluating how capable a model is.


I think it is more fundamental than that. Emergence always disappears when we slice phenomena thinly. A person walking through a doorway is a very continuous phenomenon. We can see their relation to the doorway at each point. There is no abrupt change. But when we stand back and apply the criterion: "is the person through the door (y/n?)" we end up with an answer. When it is yes, we can say that the passage is an emergent effect of the motion. At one moment it wasn't there, and at another it was.
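As a rough illustrative sketch (not from the paper or any commenter here, just a toy simulation with made-up numbers), this is what that thresholding effect looks like numerically: a hypothetical capability that improves smoothly with scale appears to "emerge" abruptly once you score it with an all-or-nothing yes/no metric.

    import numpy as np

    # Illustrative simulation only (hypothetical numbers, not from the paper):
    # a model whose per-token accuracy improves smoothly with scale, scored
    # two ways on 20-token answers.
    scales = np.logspace(0, 3, 10)             # arbitrary "model scale" axis
    per_token_acc = scales / (scales + 30.0)   # smooth, continuous improvement

    tokens_per_answer = 20
    # All-or-nothing metric: the answer only counts if every token is right.
    exact_match = per_token_acc ** tokens_per_answer

    for s, smooth, binary in zip(scales, per_token_acc, exact_match):
        print(f"scale={s:8.1f}  per-token acc={smooth:.3f}  exact match={binary:.4f}")

The per-token column climbs gradually, while the exact-match column sits near zero and then shoots up late in the scale range, which is the kind of discontinuity that gets labeled emergent.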


If emergence disappears when you slice it thinly enough, then the phenomenon was not emergent. There are emergent phenomena in mathematics - for example, infinite sets have many emergent properties that arbitrarily large finite sets don't share. As far as we know, consciousness seems to be an emergent phenomenon once you increase brain size far enough in some way. Turing completeness is usually emergent as well - remove any single element from a Turing complete system and it typically becomes unable to compute the vast majority of functions.


Is there an accepted definition of consciousness? I thought the definition itself was still under debate. If so, calling an undefined, nebulous thing an emergent behavior is just silly.


Is there a definition of what "emergent" means?


How can it not be? Think about what you're saying here.

Would you rather be able to evaluate a model on its demonstrated capabilities (multistep reasoning, question and answer, instruction following, theory of mind, etc.) or on some nebulous metric along an axis that may as well not correspond to practice?

We only care about how good AI is at things that matter to us as humans. Why not test for these directly?

If some perfect metric is discovered that shows the phenomenon of emergence is actually continuous, then that would be helpful.




