
>I expect things to get better, this will not always be the state of things, but for now “vibe coding” (specifically not reviewing/writing code yourself) is not sustainable.

It will not.

And I say this as someone who's been building internal LLM tools since 2021.

The issue is the context window. If you increase the context window so the model can see more code, costs skyrocket as n^2 with the size of the codebase. If you don't, you get all the issues people describe in this thread.
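Rough back-of-the-envelope for that scaling (flat per-token work and pure quadratic attention are simplifying assumptions here, not any provider's real numbers):

    # Illustrative only: how self-attention work grows with context size.
    def attention_work(tokens: int) -> int:
        # every token attends to every other token, so work ~ tokens^2
        return tokens * tokens

    base = attention_work(8_000)
    for ctx in (8_000, 32_000, 128_000):
        print(f"{ctx:>7} tokens -> {attention_work(ctx) / base:.0f}x the work")

Going from an 8k to a 128k window is a 16x increase in tokens but roughly a 256x increase in attention work.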

The reason I have a job right now is that you can get around this by building tooling for intelligent search that limits the overfill of each context window. This is neither easy, fast, nor cheap when done at scale. Worse, the problems you hit when doing this are at best weakly related to the problems the major AI labs are focusing on right now - I've interviewed at two of the top five AI labs and none of the people I talked to cared about or really understood what a _real_ agentic system that solves coding should look like.



I can't help but wonder whether the solution here is something like building a multi-resolution understanding of the codebase: from an architectural perspective including business context, down to code structure & layout, all the way down to what's happening in specific files and functions.

As a human, I don't need to remember the content of every file I work on to be effective, but I do need to understand how to navigate my way around, and enough of how the codebase hangs together to make good decisions about where new code belongs, when and how to refactor, etc. I'm pretty sure I don't have the memory or reading comprehension to match a computer, but I do have the ability to form context maps at different scales and switch 'resolution' depending on what I'm hoping to achieve.
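As a sketch of what such a resolution switch might look like (the structure and names here are hypothetical, not any real tool):

    # Hypothetical sketch: a codebase index kept at several resolutions,
    # so a query only pulls in as much detail as the task needs.
    from dataclasses import dataclass, field

    @dataclass
    class CodebaseMap:
        architecture: str                     # one-paragraph system overview
        modules: dict[str, str] = field(default_factory=dict)  # module -> summary
        symbols: dict[str, str] = field(default_factory=dict)  # file:function -> source

        def context_for(self, task: str, resolution: str) -> str:
            if resolution == "architecture":
                return self.architecture
            if resolution == "module":
                return "\n".join(f"{m}: {s}" for m, s in self.modules.items())
            # "symbol": only the files/functions whose key mentions the task
            return "\n".join(src for name, src in self.symbols.items() if task in name)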


> building tooling for intelligent search that limits the overfill of each context window

I'm interested to know what you mean by this. In our system we've been trying to compress the context, but this is the first I've seen about filtering it down.


For general text you run some type of vector search against the full-text corpus to see what relevant hits there are and where. Then you feed the first round of results into a ranking/filtering system which does pairwise comparison between the chunks that scored well in the vector search. Expand/contract until you've reached the limit of the context window for your model, then run against the original query.
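A minimal sketch of that retrieve-then-rerank loop (embed() and pairwise_score() are placeholders for whatever embedding model and cross-encoder you actually use; the token budget is made up):

    # Sketch: coarse vector search -> pairwise rerank -> pack to a budget.
    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Placeholder: hash-seeded pseudo-embedding so the sketch runs.
        rng = np.random.default_rng(abs(hash(text)) % 2**32)
        v = rng.standard_normal(64)
        return v / np.linalg.norm(v)

    def pairwise_score(query: str, chunk: str) -> float:
        # Placeholder for a cross-encoder comparing query and chunk.
        return float(embed(query) @ embed(chunk))

    def build_context(query: str, chunks: list[str], budget_tokens: int = 8000) -> str:
        # 1. coarse vector search over the corpus
        q = embed(query)
        hits = sorted(chunks, key=lambda c: -(q @ embed(c)))[:50]
        # 2. pairwise rerank of the survivors against the query
        hits.sort(key=lambda c: -pairwise_score(query, c))
        # 3. expand/contract until the context budget is full
        picked, used = [], 0
        for c in hits:
            tokens = len(c.split())          # crude token estimate
            if used + tokens > budget_tokens:
                break
            picked.append(c)
            used += tokens
        return "\n\n".join(picked)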

For source code, you are even luckier since there are a lot of deterministic tools which provide solid grounding, e.g., etags, and the languages themselves enforce a hierarchical tree-like structure on the source code, viz. block statements. The above means that ranking and chunking strategies are solved already - which is a huge pain for general text.
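For instance, Python's own ast module already hands you function and class boundaries, so chunking can follow the language's structure instead of arbitrary windows (a sketch; etags or tree-sitter play the same role for other languages):

    # Sketch: chunk a Python file along its syntactic structure,
    # using the standard-library ast module.
    import ast

    def structural_chunks(source: str) -> list[tuple[str, str]]:
        tree = ast.parse(source)
        lines = source.splitlines()
        chunks = []
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                body = "\n".join(lines[node.lineno - 1:node.end_lineno])
                chunks.append((node.name, body))
        return chunks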

The vector search is then just an enrichment layer on top which brings in documentation and other soft grounding text that keeps the LLM from going berserk.

Of course, none of the commercial offerings come even close to letting you do this well. Even the dumb version of search needs to be a self-recursive agent which comes with a good set of vector embeddings and the ability to decide if it's searched enough before it starts answering your questions.
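In sketch form, that "decide if it's searched enough" behaviour is just a bounded loop around the retriever (search(), is_sufficient() and refine_query() stand in for the vector search and LLM calls in a real system; the round limit is an assumption):

    # Sketch of a self-recursive search agent: keep retrieving until the
    # model judges the gathered context sufficient, or a round limit hits.
    def agentic_search(query, search, is_sufficient, refine_query, max_rounds=4):
        gathered = []
        q = query
        for _ in range(max_rounds):
            gathered.extend(search(q))
            if is_sufficient(query, gathered):
                break
            q = refine_query(query, gathered)  # let the model narrow the next search
        return gathered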

If you're interested drop a line on my profile email.



