I am getting two contradictory but plausible-seeming replies when I ask whether a certain set stays the same after adding 1 to every value in it, depending on how I phrase the question.
What led you to believe that mathematics is a good tool for evaluating an LLM? It is something they currently don't do well, since it is wildly outside the domain of their training corpus - down to the very way we structure information for an LLM to ingest. If we started evaluating humans the same way, most humans would be in deep trouble.
Well, I am studying mathematics, and I use the LLM to help me learn.
They aren't terrible, and they have all of arXiv to train on. Terence Tao is doing some cool stuff with it - the idea is to have an LLM generate Lean proofs.
And I can assure you that when I start to talk about these topics with the average person who doesn't know the material, they just laugh at me. Even my wife, who has a PhD in physics.
Here's some cool math I learned from a regular book, not an LLM:
Correct answer: https://chatgpt.com/share/67a9500b-2360-8007-b70e-0bc2b84bc1...
Incorrect answer (I think): https://chatgpt.com/share/67a950df-d4e0-8007-8105-95a9e5be19...