A sub-agent is just another LLM loop that you import and provide as a tool to your orchestrator LLM. For example, in Claude Code the sub-agent is a tool called "Task(<description>)" made available to the main LLM (the one you chat with) alongside other tools like patch_file and web_search.
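To make the point concrete, here's a minimal sketch. All names (`call_llm`, `run_agent_loop`, `task_tool`) are hypothetical stand-ins, not Claude Code's actual implementation; the LLM call is stubbed out:

```python
# Hypothetical sketch: a sub-agent is just another LLM loop exposed as a tool.

def call_llm(messages, tools=None):
    # Stub: a real implementation would call the provider's API here.
    # For illustration, the "model" just echoes a canned final answer.
    return {"role": "assistant", "content": f"done: {messages[-1]['content']}"}

def run_agent_loop(task_description, tools):
    """A minimal agent loop: send the task to the LLM and return its reply."""
    messages = [{"role": "user", "content": task_description}]
    reply = call_llm(messages, tools)
    return reply["content"]

# The sub-agent "Task" tool is nothing more than the same loop, wrapped as a
# callable the orchestrator can invoke just like patch_file or web_search.
def task_tool(description):
    return run_agent_loop(description, tools=["patch_file", "web_search"])

orchestrator_tools = {"Task": task_tool}
```

The orchestrator sees `Task` as one tool among many; the nesting is invisible to it.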
A concurrent tool call is when the LLM emits multiple tool calls in one turn instead of one, and your app can execute them either sequentially or concurrently. It's a trivial concept.
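A sketch of the concurrent option, using plain asyncio (the tool names and `run_tool` body are made up for illustration):

```python
import asyncio

# Hypothetical sketch: the LLM returned several tool calls in one turn;
# fan them out concurrently instead of running them one by one.

async def run_tool(name, arg):
    await asyncio.sleep(0.01)  # stand-in for real I/O (HTTP call, file op, ...)
    return f"{name}({arg}) -> ok"

async def execute_concurrently(tool_calls):
    # One coroutine per tool call; gather runs them concurrently and
    # returns results in the original order.
    return await asyncio.gather(*(run_tool(n, a) for n, a in tool_calls))

calls = [("web_search", "rust async"), ("patch_file", "main.rs")]
results = asyncio.run(execute_concurrently(calls))
```

Swapping `asyncio.gather` for a plain `for` loop gives you the sequential version.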
The "agent framework" layer here is so thin it might as well don't exist, and you can use Anthropic/OAI's sdk directly. I don't see a need for fancy graphs with circles here.
> The "agent framework" layer here is so thin it might as well don't exist
There are plenty of things you need to make an AI agent that I wouldn't want to re-implement or copy-paste each time. The most annoying is automatic conversation history summarization (e.g. I accidentally wasted $60 with the latest OpenAI realtime model, because costs go up very quickly as the conversation history grows). And I'm sure we'll discover more things like that in the future.
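For the summarization part, the core idea is small but fiddly enough that nobody wants to rewrite it per project. A rough sketch, where `summarize` stands in for an extra LLM call and the token estimate is a crude heuristic (both are assumptions, not any framework's real API):

```python
# Hypothetical sketch: compact the conversation history once it grows past a
# token budget, so cost stops scaling with the full transcript length.

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token. A real implementation
    # would use the provider's tokenizer.
    return sum(len(m["content"]) for m in messages) // 4

def summarize(messages):
    # Stand-in: a real version would ask the model to condense these turns.
    return {"role": "system", "content": f"[summary of {len(messages)} earlier messages]"}

def compact_history(messages, budget=1000, keep_recent=4):
    # Under budget: send the history as-is.
    if estimate_tokens(messages) <= budget:
        return messages
    # Over budget: replace everything but the last few turns with a summary.
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent
```

The annoying parts a framework earns its keep on are the edges: picking the budget per model, not splitting a tool call from its result, and re-summarizing the summary as it grows.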
I would highly recommend Gemini 2.5 Pro too for its speech quality. It's priced lower and the quality on their API is top-notch. I made an implementation here in case you're interested: https://www.github.com/akdeb/ElatoAI, but it's on hardware, so maybe not totally relevant.
I'm using LiveKit, and I have indeed tested Gemini, but it appears to be broken, or at least incompatible with OpenAI. Not sure if this is a LiveKit issue or a Gemini issue. Anyway, I decided to go back to using the LLM, STT, and TTS as separate nodes. I've also been looking into the Deepgram Voice Agent API, but LiveKit doesn't support it (yet?).