The standard for "agents" is that tools run in sequence, so no need to worry about concurrency. Several models support parallel tool calls now where the model can say "Run these three tools" and your harness can chose to run them in parallel or sequentially before passing the results back to the model as the next step in the conversation.
It's still _very_ early in figuring out good patterns for LLM tool-use - the models only got really great at using tools in about the past 6 months, so there's plenty to be discovered about how best to orchestrate them.
What do you recommend for this? I've actually had good luck having them create XML, even though you're "supposed" to use the native tool calling in a JSON schema. There seems to be far fewer issues with getting JSON syntax correct.
Anthropic are leaning more into multi-agent setups where the parent agent might delegate to one or more sub-agents which might run in parallel. They use that trick for Claude Code - I have some notes on reverse-engineering that here https://simonwillison.net/2025/Jun/2/claude-trace/ - and expand on that in their write-up of how Claude Research works: https://simonwillison.net/2025/Jun/14/multi-agent-research-s...
It's still _very_ early in figuring out good patterns for LLM tool-use - the models only got really great at using tools in about the past 6 months, so there's plenty to be discovered about how best to orchestrate them.