I imagine if this thing sticks around, they'll eventually be able to automate simple things like the pizza ordering and whatnot. I imagine it could call back to a human in case of issues, such as:
Requester: "I want two pepperoni pizzas, one with extra cheese."
AI Response: "I can have 3 pizzas, two pepperoni, and one cheese (with extra cheese), delivered from Dominos for $24.15"
Requester: "No, I wanted two pepperoni pizzas, with extra cheese on one of them"
Human steps in after the "no" is detected: "Sorry, I misunderstood your request. I can get two pepperoni pizzas with extra cheese on one of them for $16.50 from Dominos."
Requester: "Yes, thanks."
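A rough sketch of that "no" detection, assuming the handoff is triggered by nothing fancier than a pattern match on the requester's reply. The phrase list and the `route_reply` name are made up for illustration; a real service would presumably use something more robust.

```python
import re

# Hypothetical rejection phrases; matched only at the start of the reply.
NEGATIVE = re.compile(r"^\s*(no|nope|wrong|that's not)\b", re.IGNORECASE)

def route_reply(reply: str) -> str:
    """Return 'human' if the requester rejected the bot's proposal, else 'bot'."""
    return "human" if NEGATIVE.search(reply) else "bot"
```

With the exchange above, the "No, I wanted..." reply would be routed to a human and "Yes, thanks." would stay with the bot.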
======
It'll just be a matter of time as the automation spreads to less and less mundane topics.
The first step is just to have backend software make a first pass at the texts and make suggestions to the rep. As it gets better, it increases the throughput for the human rep. Like most automation, it becomes about making a human more productive, and that's a lot easier to accomplish.
There seems to be a pattern lately of services that are initially human based but then use that as validation for a human-supported AI model further down the road. https://x.ai/ is one of the recent ones I remember seeing.
Humans are still the most flexible source of labor you can use for a task, so it's possible to learn exactly what the requirements are using a human and then try to automate various repetitive parts of the task until the human hardly does anything but verify that the robots are working properly. Note that this does not mean that humans are the least expensive form of labor, although they are currently the only form of labor that can be created by two completely unskilled humans in about 9 months.
This could totally be automated: text recognition handles the easy cases where ordering can be done via the internet, and the rest is handled by humans.
The user confirmation makes sure the automation didn't screw something up, and if the user doesn't respond positively often enough to meet a certain threshold, humans take over. The threshold can be dynamic, depending on the accuracy of the text recognition.
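One way the dynamic threshold could look, as a sketch only: assume the text recognition reports a confidence score between 0 and 1, and let that score decide how many non-positive replies the bot tolerates before handing off. The function names, the `max_allowed` default, and the linear scaling are all assumptions, not anything a real service is known to use.

```python
def allowed_rejections(confidence: float, max_allowed: int = 2) -> int:
    """Non-positive replies tolerated before handoff, scaled by parser confidence."""
    confidence = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return int(confidence * max_allowed + 0.5)   # round half up

def needs_human(rejections: int, confidence: float) -> bool:
    """True once the user has rejected more often than the confidence allows."""
    return rejections > allowed_rejections(confidence)
```

So a low-confidence parse hands off on the first rejection, while a high-confidence one gets a couple of retries before a human takes over.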