Somewhere in the video he says that LLMs have expert (only slightly fuzzy) knowledge about a lot of topics, but fail at simple math questions. Many non-technical people anthropomorphize LLMs and don't realize that they can't think, or calculate the way a real calculator does. LLMs compute one token at a time, and you can improve their performance by not packing too much computation into a single output token.
I think it's an excellent example to show the capabilities and limits of LLMs. For softer topics, you can argue a lot more about what counts as right or wrong. With math, there is a single correct answer that can be checked. People also assume that computers are good at computer things, such as calculating numbers, even though LLMs actually aren't good at this.
The takeaway is: prompting and "computational complexity per token" matter, and if you understand how that works for math, you probably understand how it works for softer things like answers about law or whatever.
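To make the "computation per token" idea a bit more concrete, here is a rough Python sketch. It is only an analogy of my own, not anything from the video and not how an LLM works internally: the hypothetical `stepwise_multiply` helper breaks one large multiplication into small partial products, the way a step-by-step prompt spreads the work over many intermediate tokens instead of demanding the final number in one go. It assumes `b` is a non-negative integer.

```python
def stepwise_multiply(a: int, b: int) -> tuple[list[str], int]:
    """Multiply a by b via partial products, one small step per line.

    Illustrative helper (hypothetical name); assumes b >= 0.
    """
    steps = []
    total = 0
    # Walk the digits of b from least to most significant.
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        # Each step is a small computation: one digit times a, shifted.
        partial = a * digit * 10 ** place
        total += partial
        steps.append(
            f"{a} * {digit} * 10^{place} = {partial}; running total = {total}"
        )
    return steps, total


if __name__ == "__main__":
    steps, result = stepwise_multiply(4382, 917)
    for line in steps:
        print(line)
    print("final answer:", result)
```

Each printed line is easy to check on its own, which is the point: the same applies when you ask a model to show intermediate steps, and you can always verify the final number with a real calculator.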
I've definitely done things like ask "give me a timestamp" and then taken too long to realize the time it gave made no sense. You get used to it working well when it does, and then it doesn't, and it's hard to switch the skepticism back on in response.