> We only need a single counterexample to show that Othello-GPT does not have a systemic understanding of the rules, only a statistical inference of them.
I don't agree with this binary interpretation. A counterexample only indicates that whatever systemic understanding the model has internally built is incomplete, and we are not trying to assess completeness of that understanding. For example, if you had a traditional rules engine for Othello and removed one rule so that it occasionally permits illegal moves, does that make the rules engine a statistical model all of a sudden?
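To make the analogy concrete, here is a minimal sketch of a deterministic legality checker with one rule deliberately left out. The board representation, function names, and move encoding are hypothetical illustrations, not taken from the Othello-GPT paper or its code:

```python
# Minimal sketch of the "rules engine with one rule removed" analogy.
# Everything here (names, encoding) is illustrative, not from the paper.

EMPTY, BLACK, WHITE = ".", "B", "W"

def new_board():
    """8x8 Othello board with the standard four starting discs."""
    board = [[EMPTY] * 8 for _ in range(8)]
    board[3][3], board[4][4] = WHITE, WHITE
    board[3][4], board[4][3] = BLACK, BLACK
    return board

def is_legal_incomplete(board, row, col, player):
    """Deterministic legality check that omits one rule.

    It verifies the square is on the board and empty, but deliberately
    skips the rule that a move must flank and flip at least one opponent
    disc. The engine is still a fixed system of rules -- nothing about
    it is statistical -- yet it will approve some illegal moves.
    """
    if not (0 <= row < 8 and 0 <= col < 8):
        return False
    return board[row][col] == EMPTY
    # Missing rule: scan each direction for a run of opponent discs
    # terminated by one of the player's own discs.

board = new_board()
print(is_legal_incomplete(board, 2, 3, BLACK))  # True: a genuinely legal opening move
print(is_legal_incomplete(board, 0, 0, BLACK))  # True: illegal in real Othello, but accepted
```

The point is that a system can be entirely rule-based and still produce illegal moves if its rules are incomplete; producing an illegal move doesn't by itself demonstrate that the system is merely statistical.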
> An inaccurate engine and a statistical one are indistinguishable.
How so? The paper is trying to assess whether the underlying model builds some semantic understanding, not whether it builds a completely accurate one. If such a world model maps to an Othello-prime (gleaning concepts like tiles, colors, etc.), that is still a very interesting result.