I'm probably overlooking something, but what makes the problem of being able to get from item in basket to item is shipping different from choosing which item(s) to put in the basket?
In other words, if Agents are able to navigate marketplaces, shouldn't that imply they can also navigate a subset of the marketplace, the payment section? Especially given that that section is "easier: theres no need for qualitative (or quantitative) judgement like there is for the shopping portion.
It's not actually doing browser actions like Playwright or other browser automation tools, rather than direct API and MCP calls/actions. This is a whole new subset of API and connections that are all contained within the Agent context, no browser mocking. That's why they are creating these new protocols, so the full governance can work within the context of the Agent and its available tools.
As I said, it doesn't have to make sense, but this is being pushed on us anyway...
It seems like this workflow suffers the same problem as Alexa and Amazon dash buttons: consumers don't typically want the computer to just go buy things for them with no oversight. At least I don't.
Adding a checkout step would make this more plausible to me. "Agent, go find the most efficient dishwasher under $600" where it adds its recommendation to a cart, or even "Find me the best dishwashers under $600" where it creates a catalog page with its recommendations and an easy checkout process with whatever store is actually providing.
It would/will be extremely irresponsible to put non-deterministic and fallible models in charge of weapons. We are not close to having solved the problem of ensuring AI pursues good outcomes
I agree completely. Anybody who uses the models extensively know it can do something amazing for a prompt and something awful for another. But I also know that wars are unfortunately real and there are real enemies between countries and they don't want a limited model.
Probably drones targeting and automatically killing Russian people by a thinking model guessing if its Russian on Ukrainian person is a red line.
Elon Musk already denied Starlink for being used for remote killing, but at some point all these technologies will be nationalized, as they are too important not to be.
In other words, if Agents are able to navigate marketplaces, shouldn't that imply they can also navigate a subset of the marketplace, the payment section? Especially given that that section is "easier: theres no need for qualitative (or quantitative) judgement like there is for the shopping portion.
Perhaps its a matter of proper safeguards?