
Also (for those like me who didn't know the rules) generating legal Othello moves requires understanding board geometry; there is no hack to avoid an internal geometric representation:

> https://en.m.wikipedia.org/wiki/Reversi

> Dark must place a piece (dark-side-up) on the board and so that there exists at least one straight (horizontal, vertical, or diagonal) occupied line between the new piece and another dark piece, with one or more contiguous light pieces between them



I don't see that this follows. It doesn't seem materially different than knowing that U always follows Q, and that J is always followed by a vowel in "legal" English language words.

https://content.wolfram.com/uploads/sites/43/2023/02/sw02142... from https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

I imagine it's technically possible to do this in a piecewise manner that doesn't "understand" the larger board. This could theoretically be done with number lines rather than a geometry (i.e. the 8x8 grid and current state of each square mentioned in the comment you replied to). It could also be done piecewise with ternary values (e.g. 1, 0, -1) for each square, taken in small sets of squares.
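As a rough sketch of what that piecewise view could look like (Python; the helper names and the 1/-1/0 encoding are mine, not anything from the thread): each ray out of the candidate square is just a 1-dimensional ternary sequence, and legality along it is a local pattern check.

    # Sketch: legality along one ray, treated as a 1-D ternary sequence.
    # 1 = mover's piece, -1 = opponent's piece, 0 = empty.
    def ray_flips(ray):
        """True if this ray (squares outward from the candidate move, nearest
        first) makes the move legal: one or more contiguous opponent pieces
        terminated by one of the mover's own pieces."""
        i = 0
        while i < len(ray) and ray[i] == -1:
            i += 1
        return 0 < i < len(ray) and ray[i] == 1

    # The whole check is "some ray flips" -- no 2-D board object in sight,
    # though the rays themselves still have to come from somewhere.
    def is_legal(rays):
        return any(ray_flips(r) for r in rays)

    print(is_legal([[-1, -1, 1, 0], [0, 1]]))  # True: the first ray flips two pieces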

I guess this is a kind of geometric representation on the order of Shannon's Theseus.


> It doesn't seem materially different than knowing that U always follows Q, and that J is always followed by a vowel in "legal" English language words.

The material difference is one of scale, not complexity.

Your rules have lookback = 1, while the Othello rules have lookback <= 63. If you are trying to play, say, A1, you need to determine the current color of every square on A1-A8, A1-H1, and A1-H8 (which is lookback <= 62) and then determine whether one of 21 specific patterns exists.

Both can technically be modeled with a lookup table, but for Othello that table would have size 3^63.
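To put that scale difference in rough numbers (my own back-of-the-envelope arithmetic, following the parent's way of counting):

    # "U always follows Q": one lookup per letter of a 26-letter alphabet.
    letter_table_rows = 26

    # Naive Othello legality lookup, keyed on the color (empty/dark/light)
    # of each of the other 63 squares, as counted above:
    othello_table_rows = 3 ** 63
    print(f"{othello_table_rows:.2e}")  # ~1.14e+30 entries, per candidate move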


> Both can technically be modeled with a lookup table, but for Othello that table would have size 3^63.

Could you just generate the subset you need de novo each time? Or the far smaller number of 1-dimensional lines?


Then there is a "material" difference between Othello and those LL(1) grammars, which the grandparent comment suggested there wasn't.

I would argue that the optimal compression of such a table is a representation of the geometric algorithm for determining move validity that all humans use intuitively, and I'd speculate that any other compression below, say, 1 MB could necessarily be reduced to the geometric one.

In other words, Othello is a stateful, complex game, so if GPT is doing validation efficiently, it has necessarily encoded something that can unequivocally be described as the "geometric structure".


And that is exactly how this works.

There is no way to represent the state of the game without some kind of board model.

So any coherent representation of a sequence of valid game states can be used to infer the game board structure.

GPT is not constructing the board representation: it is looking at an example game and telling us what pattern it sees. GPT cannot fail to model the game board, because that is all it has to look at in the first place.


> There is no way to represent the state of the game without some kind of board model.

I agree with the conclusion but not the premise.

The question under debate is not just about a stateful ternary board X but about a board endowed with a metric, (X, d), which is what enables geometry.

There are alternative ways to represent the state without the geometry: for example, an ordered list of strings S = ["A1", "B2", ...] and a function Is-Valid(S) that returns whether S is in the language of valid games.
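Spelled out as code (a sketch; the names and the stubbed-out body are mine), that representation carries no coordinates or distances at all:

    from typing import List

    # State as nothing but the move history: no coordinates, no metric d.
    GameState = List[str]                # e.g. ["D3", "C5", ...]
    ALL_SQUARES = [f"{c}{r}" for c in "ABCDEFGH" for r in range(1, 9)]

    def is_valid(s: GameState) -> bool:
        """Membership test for the language of valid Othello games.
        Deliberately left abstract: whatever it does internally, the
        representation handed to it contains no geometry."""
        raise NotImplementedError

    # With such an oracle, "the current position" is only ever queried, e.g.
    #   legal_next = [m for m in ALL_SQUARES if is_valid(state + [m])]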

Related advice: don't get a math degree unless you enjoyed the above pedantry.


An ordered list of strings is the training corpus. That's the data being modeled.

But that data is more specific than the set of all possible ordered lists of strings: it's a specific representation of an example game written as a chronology of piece positions.

GPT models every pattern it can find in the ordered list of tokens. GPT's model doesn't only infer the original data structure (the list of tokens). That structure isn't the only pattern present in the original data. There are also repeated tokens, and their relative positions in the list: GPT models them all.

When the story was written in the first place, the game rules were followed. In doing so, the authors of the story laid out an implicit boundary. That boundary is what GPT models, and it is implicitly a close match for the game rules.

When we look objectively at what GPT modeled, we can see that part of that model is the same shape and structure as an Othello game board. We call it a valid instance of an Othello game board. We. Not GPT. We. People who know the symbolic meaning of "Othello game board" make that assertion. GPT does not do that. As far as GPT is concerned, it's only a model.

And that model can be found in any valid example of an Othello game played. Even if it is implicit, it is there.


> We call it a valid instance of an Othello game board. We. Not GPT. We. People who know the symbolic meaning of "Othello game board"...

The board structure can be defined precisely using predicate logic as (X, d), i.e., it is strictly below natural language and does not require human interpretation.

And by "reduction" I meant the word in the technical sense: there exists subset of ChatGPT that encodes the information (X, d). This also does not require a human.


The context of reading is human interpretation. The inverse function (writing) is human expression. These are the functions GPT pretends to implement.

When we write, we don't just spit out a random stream of characters: we choose groups of characters (subjects) that have symbolic meaning. We choose order and punctuation (grammar) that model the logical relationships between those symbols. The act of writing is constructive: even though - in the most literal sense - text is only a 1-dimensional list of characters, the text humans write can encode many arbitrary and complex data structures. It is the act of writing that defines those structures, not the string of characters itself. The entropy of the writer's decisions is the data that gets encoded.

When we read, we recognize the same grammar and subjects (the symbolic definitions) that we use to write. Using this shared knowledge, a person can reconstruct the same abstract model that was intentionally and explicitly written. Because we have explicitly implemented the act of writing, we can do the inverse, too.

There's a problem, though: natural language is ambiguous: what is explicitly written could be read with different symbolic definitions. We disambiguate using context: the surrounding narrative determines what symbolic definitions apply.

The surrounding narrative is not always explicitly written: this is where we use inference. We construct our own context to finish the act of reading. This is much more similar to what GPT does.

GPT does not define any symbols. GPT never makes an explicit construction. It never determines which patterns in its model are important, and which ones aren't.

Instead, GPT makes implicit constructions. It doesn't have any predefined patterns to match with, so it just looks at all the patterns equally.

Why does this work? Because text doesn't contain many unintentional patterns. Any pattern that GPT finds implicitly is likely to exist at some step in the writing process.

Remember that the data encoded in writing is the action of writing itself: this is more powerful than it seems. We use writing to explicitly encode the data we have in mind, but those aren't the only patterns that end up in the text. There are implicit patterns that "tag along" with the writing process. Most of them have some importance.

The reason we are writing some specific thing is itself an implicit pattern. We don't write nonsensical bullshit unless we intend to.

When a person wrote the example Othello game, they explicitly encoded the piece positions and the order of game states. But why those positions in that order? Because that's what happened in the game. That "why" was implicitly encoded into the text.

GPT modeled all of the patterns. It modeled the explicit chronology of piece positions, and the implicit game board topology. The explicit positions of pieces progressed as a direct result of that game board topology.

The game board and the rules were just as significant to the act of writing as the chronology of piece positions. Every aspect of the game is a determiner for what characters the person chooses to write: every determiner gets encoded as a pattern in the text.

Every pattern that GPT models requires a human. GPT doesn't write: it only models a prompt and "shows its work". Without the act of humans writing, there would be no pattern to model.



