A sub-agent is just another LLM loop that you import and provide as a tool to your orchestrator LLM. For example, in Claude Code the sub-agent is a tool called "Task(<description>)" made available to the main LLM (the one you chat with) alongside other tools like patch_file and web_search.
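To make the point concrete, here's a minimal sketch. All names (`call_llm`, `run_agent_loop`, `task_tool`) are hypothetical stand-ins, not Claude Code's actual implementation; the LLM call is stubbed out:

```python
# Hypothetical sketch: a sub-agent is just another LLM loop exposed as a tool.

def call_llm(messages, tools=None):
    # Stub: a real implementation would call the provider's API here.
    # For illustration, the "model" just echoes a canned final answer.
    return {"role": "assistant", "content": f"done: {messages[-1]['content']}"}

def run_agent_loop(task_description, tools):
    """A minimal agent loop: send the task to the LLM and return its reply."""
    messages = [{"role": "user", "content": task_description}]
    reply = call_llm(messages, tools)
    return reply["content"]

# The sub-agent "Task" tool is nothing more than the same loop, wrapped as a
# callable the orchestrator can invoke just like patch_file or web_search.
def task_tool(description):
    return run_agent_loop(description, tools=["patch_file", "web_search"])

orchestrator_tools = {"Task": task_tool}
```

The orchestrator sees `Task` as one tool among many; the nesting is invisible to it.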
A concurrent tool call is when the LLM emits multiple tool calls in one turn instead of one, and your app can execute them either sequentially or concurrently. It's a trivial concept.
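A sketch of the concurrent option, using plain asyncio (the tool names and `run_tool` body are made up for illustration):

```python
import asyncio

# Hypothetical sketch: the LLM returned several tool calls in one turn;
# fan them out concurrently instead of running them one by one.

async def run_tool(name, arg):
    await asyncio.sleep(0.01)  # stand-in for real I/O (HTTP call, file op, ...)
    return f"{name}({arg}) -> ok"

async def execute_concurrently(tool_calls):
    # One coroutine per tool call; gather runs them concurrently and
    # returns results in the original order.
    return await asyncio.gather(*(run_tool(n, a) for n, a in tool_calls))

calls = [("web_search", "rust async"), ("patch_file", "main.rs")]
results = asyncio.run(execute_concurrently(calls))
```

Swapping `asyncio.gather` for a plain `for` loop gives you the sequential version.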
The "agent framework" layer here is so thin it might as well don't exist, and you can use Anthropic/OAI's sdk directly. I don't see a need for fancy graphs with circles here.
> The "agent framework" layer here is so thin it might as well don't exist
There are plenty of things you need to make an AI agent that I wouldn't want to re-implement or copy-paste each time. The most annoying is automatic conversation history summarization (e.g. I accidentally wasted $60 with the latest OpenAI realtime model, because costs go up very quickly as the conversation history grows). And I'm sure we'll discover more things like that in the future.
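For the summarization part, the core idea is small but fiddly enough that nobody wants to rewrite it per project. A rough sketch, where `summarize` stands in for an extra LLM call and the token estimate is a crude heuristic (both are assumptions, not any framework's real API):

```python
# Hypothetical sketch: compact the conversation history once it grows past a
# token budget, so cost stops scaling with the full transcript length.

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token. A real implementation
    # would use the provider's tokenizer.
    return sum(len(m["content"]) for m in messages) // 4

def summarize(messages):
    # Stand-in: a real version would ask the model to condense these turns.
    return {"role": "system", "content": f"[summary of {len(messages)} earlier messages]"}

def compact_history(messages, budget=1000, keep_recent=4):
    # Under budget: send the history as-is.
    if estimate_tokens(messages) <= budget:
        return messages
    # Over budget: replace everything but the last few turns with a summary.
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent
```

The annoying parts a framework earns its keep on are the edges: picking the budget per model, not splitting a tool call from its result, and re-summarizing the summary as it grows.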
I would highly recommend Gemini 2.5 Pro too for its speech quality. It's priced lower and the quality on their API is top-notch. I made an implementation here in case you're interested: https://www.github.com/akdeb/ElatoAI, but it's on hardware, so maybe not totally relevant.
I'm using LiveKit, and I have indeed tested Gemini, but it appears to be broken, or at least incompatible with OpenAI. Not sure if this is a LiveKit issue or a Gemini issue. Anyway, I decided to go back to using the LLM, STT, and TTS as separate nodes. I've also been looking into the Deepgram Voice Agent API, but LiveKit doesn't support it (yet?).