Simplifying to that point describes a Markov chain more than an LLM. LLMs generalize far beyond that, enough to "understand text" at a decent level. Even a relatively small model can take, e.g., this poorly prompted request:
"The user has requested 'remind me to pay my bills 8 PM tomorrow'. The current date is 2025-02-24. Your available commands are 'set_reminder' (time, description), 'set_alarm' (time), 'send_email' (to, subject, content). Respond with the command and its inputs."
And the most likely response will be what the user wanted.
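As a rough sketch of what consuming that response could look like (the `command(arg="value")` reply format, the parser, and the stubbed reply are assumptions for illustration, not any real API):

```python
import re

# The three commands offered in the prompt above.
VALID_COMMANDS = {"set_reminder", "set_alarm", "send_email"}

def parse_command(reply: str) -> tuple[str, dict[str, str]]:
    """Parse a reply like: set_reminder(time="...", description="...")."""
    match = re.fullmatch(r'\s*(\w+)\((.*)\)\s*', reply, re.DOTALL)
    if match is None or match.group(1) not in VALID_COMMANDS:
        raise ValueError(f"unparseable or unknown command: {reply!r}")
    # Pull out key="value" pairs from the argument list.
    args = dict(re.findall(r'(\w+)="([^"]*)"', match.group(2)))
    return match.group(1), args

# Stubbed model output for the request above (tomorrow = 2025-02-25).
reply = 'set_reminder(time="2025-02-25 20:00", description="pay my bills")'
print(parse_command(reply))
# ('set_reminder', {'time': '2025-02-25 20:00', 'description': 'pay my bills'})
```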
A Markov chain (using only the probabilities of word sequences from sentences in its training set) could never output a command that wasn't stitched together from existing ones: it would always output a valid command name, but if no one had requested a reminder for a date in 2026 before it was trained, it would never output that year. No amount of documents saying "2026 is the year after 2025" would make a Markov chain grasp that fact, but LLMs are able to "understand" it.
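To make the contrast concrete, here is a toy bigram Markov chain (a minimal sketch; the tiny corpus is made up for illustration). Every word it emits must have followed the same predecessor somewhere in its training text, so a year like 2026 can never appear unless that exact continuation was already seen:

```python
import random
from collections import defaultdict

def train_bigram_chain(corpus: list[str]) -> dict[str, list[str]]:
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            chain[prev].append(nxt)
    return chain

def generate(chain: dict[str, list[str]], start: str, length: int = 8) -> str:
    words = [start]
    for _ in range(length):
        followers = chain.get(words[-1])
        if not followers:
            break  # dead end: this word never had a successor in training
        words.append(random.choice(followers))  # sample by observed frequency
    return " ".join(words)

corpus = [
    "remind me to pay my bills at 8 PM on 2025-02-25",
    "set an alarm for 7 AM on 2025-03-01",
]
chain = train_bigram_chain(corpus)
print(generate(chain, "remind"))
# Every emitted word was seen following its predecessor in the corpus;
# "2026" can never be produced, no matter what other documents assert.
```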