Not 100% so for chain-of-thought models: they should recognize that they need to spell the word out letter by letter in some separated form and then count the tokens in that form. The Qwen distill seems to do exactly this really well:
> Step-by-step explanation:
> 1. Break down each word: "not", "really", "a", "tokenizer", "issue".
> 2. Count 'e's in each word:
> - "not": 0
> - "really": 1
> - "a": 0
> - "tokenizer": 2
> - "issue": 1
> 3. Sum the counts: 0 + 1 + 0 + 2 + 1 = 4.
>
> Answer: There are 4 E's in the phrase.
In the thought portion it broke the words up every which way you could think to check, then validated the total by listing the letters in a numbered list by index, counting that, and comparing it against the sum from the per-word pass.
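That check maps onto something very simple once the letters are separated out. A minimal sketch in Python (the phrase and letter are from the example above; the function name and structure are mine):

```python
def count_letter(phrase: str, letter: str) -> int:
    """Count per word, then sum, mirroring the model's step-by-step answer."""
    words = phrase.lower().split()
    per_word = {w: w.count(letter) for w in words}
    total = sum(per_word.values())

    # Cross-check: enumerate every letter individually and tally again,
    # the same validation the thought portion performed.
    letters = [c for c in phrase.lower() if c.isalpha()]
    recount = sum(1 for c in letters if c == letter)
    assert recount == total, (per_word, recount)
    return total

print(count_letter("not really a tokenizer issue", "e"))  # 4
```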
"Be trained how to map" implies someone is feeding in a list of every token and what the letters for that token are as training data and then training that. More realistically, this just happens automatically during training as the model figures out what splits work with which tokens because that answer was right when it came across a spelling example or question. The "reasoning" portion comes into play by its ability to judge whether what it's doing is working rather than go with the first guess. E.g. feeding "zygomaticomaxillary" and asking for the count of 'a's gives a CoT
> <comes to an initial guess>
> Wait, is that correct? Let me double-check because sometimes I might miscount or miss letters.
> Maybe I should just go through each letter one by one. Let's write the word out in order:
> <writes one letter per line with the conclusion for each>
> *Answer:* There are 3 "a"s in "zygomaticomaxillary."
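Spelled out, that second pass is just an indexed enumeration, and the token view shows why the count isn't directly readable from the model's input. A sketch only: tiktoken and the cl100k_base encoding are my choices for illustration, not anything the model above actually uses.

```python
import tiktoken  # assumption: any BPE tokenizer would illustrate the same point

word = "zygomaticomaxillary"

# Token view: the word arrives as a few multi-letter chunks,
# so per-letter counts aren't directly visible in the input.
enc = tiktoken.get_encoding("cl100k_base")
print([enc.decode([t]) for t in enc.encode(word)])

# CoT view: one letter per line with a running index, then tally.
hits = [(i, c) for i, c in enumerate(word, 1) if c == "a"]
for i, c in hits:
    print(f"{i:2d}: {c}")
print(len(hits), "a's")  # 3
```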
This kind of check isn't the only way to judge a model, but there are more ways to accurately answer this problem than "hardcode the tokenizer data in the training," and heavily trained CoT models should be expected to hit on at least several of those other ways; otherwise it's suspect that they'd miss similar types of things elsewhere.