I hear this a lot but there are vast sums of money thrown at where a model fails...

antonvs · on July 24, 2024

> Think about math and logic. If a single symbol is off, it’s no good.

In that case the tokenization is done at the appropriate level.

This is a complete non-issue for the use cases these models are designed for.

meroes · on July 24, 2024

But we don’t restrict it to math or logical syntax. Any prompt across essentially all domains. The same model is expected to handle any kind of logical reasoning that can be brought into text. We don’t mark it incorrect if it spells an unimportant word wrong, however keep in mind the spelling of a word can be important for many questions, for example—off the top of my head: please concatenate “d”, “e”, “a”, “r” into a common English word without rearranging the order. The types of examples are endless. And any type of example it gets wrong, we want to correct it. I’m not saying most models will fail this specific example, but it’s to show the breadth of expectations.