I understand the rendering is an out-of-band addition. But the question is: if this is a statistical language model (i.e. a Monte Carlo word generator), how is it able to go from the description to the SVG successfully without understanding things more logically?
My best guess is that there's an actual example in the training corpus that it's pulling from. Or that this is being gamed in some way.
LLMs are very good at parsing and generating highly structured data, as long as it has a rigid, logical structure and doesn't involve any actual mathematical calculation. Simple calculations it will often get right because the answer is stored as knowledge, similar to how a human remembers a multiplication table. But once the numbers get large enough that a human would fail without working through an algorithm with pen and paper (or a calculator), that's when LLMs fail as well.
Luckily this problem can be worked around: you can prompt the model not to solve the math itself, but instead to transform the expressions into a structure that can be fed into an external API (like Wolfram Alpha).
With a multistep process it would be easy to have ChatGPT generate accurate answers to advanced math problems (a sketch of the pipeline follows the list):
1. Instruct ChatGPT not to solve any mathematical expressions it sees in the prompt, but instead to only extract the expression(s) into a list.
2. Programmatically scan the output and make API calls to get the answers.
3. Submit the answers to ChatGPT and tell it to answer the original question with the newly supplied information.
4. Voila! You can solve almost any mathematical question using natural language.
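
To make that concrete, here's a minimal Python sketch of the three steps. `ask_llm` is a hypothetical stand-in for whatever chat-completion API you're calling (it isn't specified here), and I'm assuming Wolfram|Alpha's Short Answers endpoint as the calculator, with `WOLFRAM_APPID` as a placeholder credential:

```python
import requests

WOLFRAM_APPID = "YOUR_APP_ID"  # placeholder; get one from the Wolfram developer portal


def ask_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever chat-model API you use."""
    ...


def extract_expressions(question: str) -> list[str]:
    # Step 1: tell the model to extract expressions, not solve them.
    instruction = (
        "Do not solve any mathematical expressions in the following text. "
        "Instead, output each expression on its own line, and nothing else.\n\n"
    )
    reply = ask_llm(instruction + question)
    return [line.strip() for line in reply.splitlines() if line.strip()]


def solve_with_wolfram(expr: str) -> str:
    # Step 2: Wolfram|Alpha's Short Answers API returns a plain-text result.
    resp = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": WOLFRAM_APPID, "i": expr},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.text


def answer(question: str) -> str:
    # Step 3: feed the computed answers back and ask for the final response.
    facts = "\n".join(
        f"{e} = {solve_with_wolfram(e)}" for e in extract_expressions(question)
    )
    return ask_llm(
        f"Using these verified results:\n{facts}\n\n"
        f"Answer the original question:\n{question}"
    )
```

The point of the design is that the LLM only does the two things it's good at (recognizing expressions, composing prose) while the actual arithmetic is delegated to a real solver.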
So that's, I guess, exactly my point. The text-to-accurate-SVG task is much closer to the "solve a math problem for real" end of the spectrum than other prompts, and it does it correctly. In a sibling comment it turns out it doesn't work for other company logos. So my guess is either that the corpus happened to contain this data already, that they purposefully fed it additional Microsoft-related material so it has a better basis for generation, that they regenerated it multiple times until it got it right, or that there's an entire previously undisclosed layer of processing added to this version vs. other ChatGPT.