pancsta's comments | Hacker News

I can hardly agree with anything in this article.

> most agent systems break down from too much complexity, not too little

...only when the platform wasn't built to handle complexity. The real problem is that the "frameworks" are not good enough for agentic workloads, which naturally scale into complex, stateful chaos. That requires a different approach, but all that happens is delegating it to LLMs. The author's "coordinator agent that managed task delegation" is exactly that wrong turn - an easy exit, like "maybe it will vibe-state itself?".

Agentic systems existed before LLMs (check ABM), and nowadays most people confuse what LLMs give us (all-knowing subconscious DBs) with agency, which is the purpose of completing a process. E.g. a bus driver is an agent, but you don't ask a bus driver to play the piano. An agent has predefined behavior within a certain process.

Another common mistake is treating a prompt (with or without history) as an agent. It's just a DB model that you query. A deep research agent has 3 prompts: check if an answer is possible, scrape, and answer. These are NOT 3 agents - these are DB queries. Delegating logical decisions to LLMs without verification is like having a drunk bus driver. A new layer is needed, which is what all the Python frameworks bolt on top of their prompts. That's a mistake, because it splits the control flow, and managing complex state with FSMs or imperative code soon hits a wall.
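To make that concrete, here is a minimal sketch of such a "deep research" flow as three plain queries behind one function - the Ask helper and the prompt wording are hypothetical, not taken from any framework:

    // Hypothetical sketch: three prompts are three DB-style queries, not three agents.
    package research

    import (
        "context"
        "errors"
    )

    // Ask is a stand-in for any LLM client call.
    func Ask(ctx context.Context, prompt, input string) (string, error) {
        // ...call the LLM of your choice here...
        return "", nil
    }

    func DeepResearch(ctx context.Context, question string) (string, error) {
        // Query 1: check if an answer is possible.
        ok, err := Ask(ctx, "Can this be answered from the web? yes/no", question)
        if err != nil {
            return "", err
        }
        if ok != "yes" {
            return "", errors.New("unanswerable")
        }
        // Query 2: scrape (collect sources).
        sources, err := Ask(ctx, "List URLs worth scraping for this question", question)
        if err != nil {
            return "", err
        }
        // Query 3: answer from the gathered material.
        return Ask(ctx, "Answer using these sources:\n"+sources, question)
    }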

Declarative programming to the rescue - it's the only (and also the natural) way of handling live, complex systems. It has to be done from the bottom up, and it changes the paradigm of the whole agent. I've worked on this exact approach for a while now, and besides handling complexity, the 2nd challenge is navigating through it easily, to find answers to your questions (what, exactly, went wrong, and when). I let LLMs "build" the dynamic parts of the agent (like planning) while keeping them under IoC - only the agent layer makes decisions. Another important thing - small prompts with a single task; 100 focused prompts are better than 1 pasta-prompt. Again, without proper control flow, synchronizing 100 co-dependent prompts gets tricky (when approached imperatively, e.g. with a simple loop).
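For flavor, a hypothetical declarative schema for such an agent - states and their relations are declared instead of being scripted in a loop (the field names are made up for illustration; this is not the secai API):

    // Hypothetical: declare states and relations; a resolver decides what runs next.
    package sketch

    type State struct {
        Require []string // states that must already be active
        Remove  []string // states this one deactivates
        Add     []string // states it activates alongside itself
    }

    var researchAgent = map[string]State{
        "CheckingAnswerable": {Add: []string{"Scraping"}},
        "Scraping":           {Require: []string{"CheckingAnswerable"}},
        "Answering":          {Require: []string{"Scraping"}, Remove: []string{"Scraping"}},
        "Interrupted":        {Remove: []string{"Scraping", "Answering"}},
    }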

There's more to it, and I recommend checking out my agents (research and cook), either as a download, source code, or a video walk-through [0].

PS. Embrace chaos, and the chaos will embrace you.

TL;DR: toy frameworks in Python, people avoiding coding, drunk LLMs

[0] https://github.com/pancsta/secai


By splitting prompts into smaller chunks you effectively get "bias-free" opinions, especially when cross-checked. You can then turn them into local reasoning, which is different from "sending an email to the LLM", which seems to be the case here. Remember: the LLM is Rain Man.
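A toy illustration of the cross-checking part - a majority vote over several narrow phrasings of the same question (ask is a stand-in for any LLM call):

    // Hypothetical: cross-check one narrow question across phrasings.
    package sketch

    import "context"

    func ask(ctx context.Context, prompt string) (string, error) {
        return "", nil // stand-in for a real LLM client
    }

    func CrossCheck(ctx context.Context, phrasings []string) (string, error) {
        votes := map[string]int{}
        for _, p := range phrasings {
            ans, err := ask(ctx, p)
            if err != nil {
                return "", err
            }
            votes[ans]++
        }
        best, n := "", 0
        for ans, c := range votes {
            if c > n {
                best, n = ans, c
            }
        }
        return best, nil // the answer most phrasings agree on
    }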

Your question is really "how to test opaque, nondeterministic databases". I test my agents deterministically, because I know how to IoC - check out this code [0] and follow the usage. In the remaining cases, you assert with embeddings (see the sketch below the link). Good luck.

[0] https://github.com/pancsta/secai/blob/74d79ad449c0f60a57b600...
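A minimal sketch of the embedding assertion idea - compare the agent's answer to a reference answer by cosine similarity (embed is a stand-in for any embeddings API, and the threshold is arbitrary):

    // Hypothetical: assert an agent's answer via embedding similarity.
    package sketch

    import (
        "math"
        "testing"
    )

    func embed(s string) []float64 {
        return nil // stand-in for a real embeddings call
    }

    func assertSimilar(t *testing.T, got, want string, minCos float64) {
        a, b := embed(got), embed(want)
        var dot, na, nb float64
        for i := range a {
            dot += a[i] * b[i]
            na += a[i] * a[i]
            nb += b[i] * b[i]
        }
        if cos := dot / (math.Sqrt(na) * math.Sqrt(nb)); cos < minCos {
            t.Fatalf("answers diverge: cos=%.2f < %.2f", cos, minCos)
        }
    }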


For anyone interested, I'm on YC cofounder matching.

When designing interactions with my agents, I've added 3 things to the regular chat window: a static list of stories (for mental navigation), a dynamic list of buttons / progress bars (for structured input), and a narrator (kind of a sys-prompt for the user). There's also an interrupt button as an extra.
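Roughly, the shape of those additions (a hypothetical struct, just to name the pieces):

    // Hypothetical shape of the chat-window additions described above.
    package sketch

    type AgentUI struct {
        Stories   []string      // static list, for mental navigation
        Controls  []Control     // dynamic buttons / progress bars, for structured input
        Narrator  func() string // kind of a sys-prompt, but for the user
        Interrupt chan struct{} // the extra interrupt button
    }

    type Control struct {
        Label    string
        Progress float64 // 0..1 for progress bars; unused for plain buttons
    }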

I just launched it on HN, take a look:

https://news.ycombinator.com/item?id=44386314


I agree with the debugging part - we need a new approach to handle these abstractions; regular code-stepping or reading logs isn't enough in the case of agents. It's all about managing state, and most people here say to avoid it. What I say is: if you go all-in and unify the architecture from the bottom layer, it's actually easier in the long run, but you'll need dedicated devtools for it.

This is precisely why I've created AI-gent Workflows (launched on HN today [0]), which comes with a purpose-built state machine and devtools. Unlike LangGraph, it starts at the lowest layer, and everything is state-based. You can time-travel and even modify the states of a live agent, as in the sketch below the link.

[0] https://news.ycombinator.com/item?id=44386314
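A minimal sketch of what time travel over agent state can look like - snapshots of the active states per machine tick (types are illustrative, not the actual devtools API):

    // Hypothetical: keep a history of state snapshots to enable time travel.
    package sketch

    type Snapshot struct {
        Tick   uint64
        Active []string // names of the currently active states
    }

    type History struct {
        buf []Snapshot
    }

    func (h *History) Record(s Snapshot) {
        h.buf = append(h.buf, s)
    }

    // Rewind returns the agent's states as they were n ticks ago.
    func (h *History) Rewind(n int) Snapshot {
        i := len(h.buf) - 1 - n
        if i < 0 {
            i = 0
        }
        return h.buf[i]
    }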


> automating businesses while documenting the journey comic-style

Where is that documentation exactly?

> specialized AI agents, each with defined roles

Agency is a defined role, by definition.


TL;DR: it's a key-value DB with TTL. No thinking. Clear mislabeling by the author.
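For scale, the whole mechanism fits in a few lines of Go (a toy sketch, ignoring eviction and persistence):

    // Toy key-value store with TTL, illustrating the claim above.
    package sketch

    import (
        "sync"
        "time"
    )

    type entry struct {
        val     string
        expires time.Time
    }

    type TTLStore struct {
        mu sync.Mutex
        m  map[string]entry
    }

    func (s *TTLStore) Set(k, v string, ttl time.Duration) {
        s.mu.Lock()
        defer s.mu.Unlock()
        if s.m == nil {
            s.m = map[string]entry{}
        }
        s.m[k] = entry{v, time.Now().Add(ttl)}
    }

    func (s *TTLStore) Get(k string) (string, bool) {
        s.mu.Lock()
        defer s.mu.Unlock()
        e, ok := s.m[k]
        if !ok || time.Now().After(e.expires) {
            delete(s.m, k) // lazily expire
            return "", false
        }
        return e.val, true
    }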


Sway is way more stable, just less eye candy (which is great). Hyprland is practically a fork (of wlroots) by a single dev.


I would love to see startup times measured as well.

