Not strictly true. The LLM provider should be running constrained token selection based on the JSON schema of the tool call. That alone makes a massive difference, since invalid tokens are already being discarded at a low level during the completion. Now, if they had a BNF grammar for each CLI tool and enforced token selection based on that, you'd be much better off than with unconstrained token selection.
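To make "discarding non-valid tokens" concrete, here is a toy sketch of constrained greedy decoding. Everything in it is an assumption made for illustration: the vocabulary, the scores in `toy_scores`, and the "grammar" (just a finite set of valid outputs standing in for a JSON schema or BNF grammar). It is not how any particular provider implements it.

```python
import math

# Hypothetical "grammar": the only strings we accept as complete outputs.
VALID_OUTPUTS = ['{"cmd":"ls"}', '{"cmd":"pwd"}']

def allowed(prefix, token):
    """True if appending `token` keeps the output on a path to a valid string."""
    if token == "<eos>":
        return prefix in VALID_OUTPUTS  # may only stop on a complete output
    candidate = prefix + token
    return any(v.startswith(candidate) for v in VALID_OUTPUTS)

def toy_scores(prefix):
    """Stand-in for model logits (made up); it prefers chatty or invalid tokens."""
    return {
        "Sure!": 3.0, '"rm"': 2.0, '{"cmd":': 1.0,
        '"ls"': 0.5, '"pwd"': 0.2, "}": 0.1, "<eos>": 0.0,
    }

def constrained_greedy(score_fn, max_steps=10):
    out = ""
    for _ in range(max_steps):
        scores = score_fn(out)
        # Mask every token the grammar rejects, then take the best survivor.
        masked = {t: (s if allowed(out, t) else -math.inf) for t, s in scores.items()}
        best = max(masked, key=masked.get)
        if masked[best] == -math.inf:
            raise ValueError("no valid continuation")
        if best == "<eos>":
            break
        out += best
    return out

print(constrained_greedy(toy_scores))
```

Without the mask, greedy selection over these made-up scores would pick `Sure!` at every step. With the mask, the highest-scoring token that is still valid wins each time, so the output can only ever be one of the accepted strings.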
Yeah, that's why I said "not much" difference. I don't think it's much, because LLMs do quite well generating JSON without turning on constrained output mode, and I can't remember them ever messing up a bash command line unless the quoting got weird.