No. Few college or professional coaches weren't themselves college or professional players. Think of all those assistant coaches, QB coaches, DB coaches, etc.--all former players. Mike Leach comes to mind as a rare counterexample.
This sounds a lot like another Hacker News post from the last few days. It's the same problem image generators have with a prompt like "produce the numbers 1-50 in a spiral pattern": they can't count properly. But if you break it into a raster/vector split, where you have it first produce the visual content and then an SVG overlay, it's completely capable.
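To be concrete, the overlay half is the part code can do deterministically. A minimal sketch in pure Python (positions and sizing are arbitrary): compute the spiral placement yourself and emit SVG text elements to layer over whatever raster the image model produced.

    import math

    # Deterministic half of the two-step: code does the counting,
    # the image model only has to paint the raster underneath.
    def spiral_overlay(n=50, cx=256, cy=256, step=4.5):
        parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="512" height="512">']
        for i in range(1, n + 1):
            theta = 0.5 * i                  # angle grows with each label
            r = step * theta                 # Archimedean spiral radius
            x = cx + r * math.cos(theta)
            y = cy + r * math.sin(theta)
            parts.append(f'<text x="{x:.1f}" y="{y:.1f}" font-size="12">{i}</text>')
        parts.append('</svg>')
        return '\n'.join(parts)

    print(spiral_overlay())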
Have you tried doing a two-step: review the image, then render a vector?
Maybe there is a smart trick to get them to do the right thing, but the things I tried did not work.
At one point I had some smaller model draw bounding boxes around everything that looked interactable, with labels like "e3" ... then asked the model to tell me "click on e3". It did not work; in my tests it was pretty much as bad as raw x,y coordinates.
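For reference, the labeling step itself was just this kind of thing (sketch with PIL; the boxes list stands in for whatever the smaller model detected):

    from PIL import Image, ImageDraw

    # Draw each detected box with a short id like "e0", "e1", ...
    # so the big model can answer "click on e3" instead of raw x,y.
    # `boxes` is placeholder output from the smaller detection model.
    boxes = [(40, 40, 120, 70), (40, 90, 200, 120)]   # (x0, y0, x1, y1)

    img = Image.open("screenshot.png").convert("RGB")
    draw = ImageDraw.Draw(img)
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        draw.rectangle((x0, y0, x1, y1), outline="red", width=2)
        draw.text((x0 + 2, y0 + 2), f"e{i}", fill="red")
    img.save("screenshot_labeled.png")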
Yeah, I've held off on doing any kind of RAG until there are models that properly handle layout detection and partitioning, because it's so easy to generate shitty data if you're not properly attending to visual cues before you slice up a document.
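Roughly the ordering I mean, as a sketch; detect_layout() here is hypothetical, a stand-in for whatever layout model eventually does this well, not a real API:

    # detect_layout() is hypothetical -- a stand-in for a layout model
    # that returns classified regions in reading order.
    def detect_layout(page):
        # -> [("heading", str), ("body", str), ("table", str), ...]
        raise NotImplementedError

    def chunk_page(page):
        chunks, heading = [], ""
        for kind, text in detect_layout(page):
            if kind == "heading":
                heading = text                       # carry into next chunks
            elif kind == "table":
                chunks.append(f"{heading}\n{text}")  # never split tables
            else:
                chunks.extend(f"{heading}\n{p}" for p in text.split("\n\n"))
        return chunks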
One thing AI can power nicely is the anti-SaaS movement. Being able to just boot a cheap PC and test out any of the open source packages is infinitely easier than piling into all the random credential bazaars.
But that won't fix the LLM's habit of confusing what's in dev, what's in production, what's on localhost and what's remote. I've been working on a tool/skill for opencode that works with chrome/devtools via a linuxserver.io image. I can herd it to the right _arbitrary_ ports, but every compaction event steers it back to wanting to use the standard 9222 port and all that. I'm tempted to just revert it, but there's a security and now a security-through-LLM-obscurity value in not using defaults. Defaults are where the LLM ends up being weak. It will always want to use the defaults. It'll always forget it's supposed to be working on a remote system.
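For what it's worth, the DevTools HTTP endpoint itself doesn't care which port you pick; the drift back to 9222 is purely the model's. A sketch of pinning the tool to the arbitrary port (the host and port here are made up):

    import requests

    # Chrome started with --remote-debugging-port=9345 serves the same
    # DevTools HTTP endpoints as it would on the default 9222. Hardcoding
    # the non-default address in the tool (rather than in the prompt)
    # keeps compaction from steering the model back to localhost:9222.
    CDP = "http://devtools-box.internal:9345"   # hypothetical remote host/port

    for target in requests.get(f"{CDP}/json/list", timeout=5).json():
        print(target["type"], target.get("url", ""))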
Using opencode, there's no way to force the LLM into a protocol that limits its damage to a remote system or a narrow scope of tools. Yes, you can change permissions on various tools, but that's not the weakness exposed by these types of events. The weakness is that the LLM is an averaged "problem solver", so it will always tend towards a use case that's not novel, and will tend to do whatever it saw on Stack Overflow, even if what you wanted isn't the Stack Overflow answer.
>But that won't fix the LLM's habit of confusing what's in dev, what's in production, what's on localhost and what's remote
In my experience, Claude Code with Opus 4.7 tends to assume things are production unless explicitly told otherwise.
>there's no way to force the LLM into a protocol that limits its damage to a remote system or a narrow scope of tools
Might not be able to force it, but prompting and context help. An AGENTS.md that explicitly calls out what is and isn't production helps (at least with Claude Code).
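The kind of stanza I mean (contents made up; adapt to your setup):

    ## Environments
    - db-prod.internal is PRODUCTION: read-only, never run migrations here
    - localhost:5432 is a disposable dev database: safe to reset
    - staging.example.com is remote but NOT production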
Not sure about opencode, but in Claude Code, memories also help (more injected context).
I find it fascinating. All these attempts are goldmining LLMs with a harness, and it's clear they're generating all the docs for AI to read and use; even the docs say "we made an MCP for this!", as if somehow within 2 years people no longer make choices and it's just AIs roaming the internet trying on harnesses. Certainly that'd be a fascinating reality, but the verbosity really is an eye-glazing experience. Who do they expect to read all of that ad copy? It's not me.
Serious question: I've already got an opencode harness running on a local model. It's easily installable via the insecure bash command. It's already tailored with a couple of plugins, and with a proper TODO.md and planning I can get it to loop fine, with proper attention to its pratfalls on vague or indeterminate language. It's all running Qwen3-Coder-Next with ~256k context on an AMD 395+. opencode has a webui I can put behind a password-protected endpoint and keep busy from anywhere I need to via a simple nginx proxy.
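The nginx side is nothing exotic: basic auth plus a websocket-aware proxy. A sketch (the upstream port and file paths are assumptions; check what your opencode webui actually listens on, and add your own ssl_certificate lines):

    server {
        listen 443 ssl;               # ssl_certificate directives omitted
        server_name code.example.com;

        auth_basic           "opencode";
        auth_basic_user_file /etc/nginx/htpasswd;

        location / {
            proxy_pass http://127.0.0.1:4096;   # hypothetical webui port
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }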
How does this go above and beyond that straightforward open source, open weights, and relatively cheap setup? Do you just get more tokens from SOTA models? Can anyone rationally say the products of that token production are quality and secure?
It can't, actually; I had to create a systemd service that watched the config path and sent a signal to reload the files. It roughly works, but it doesn't actually do the loop correctly.
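Concretely, the workaround was a path unit plus a oneshot service, something like this (the config path and the signal are assumptions from my setup, not anything opencode documents):

    # opencode-reload.path
    [Path]
    PathChanged=%h/.config/opencode/opencode.json

    [Install]
    WantedBy=default.target

    # opencode-reload.service
    [Service]
    Type=oneshot
    ExecStart=/usr/bin/pkill -HUP -f opencode   # assumed reload signal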
However, the problem with self-modification is the tendency towards inoperable states. Does it automatically revert when a detrimental state is reached? How does it determine that a modification worked?
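The naive guardrail is snapshot-verify-revert, but that just moves the question into what the health check can actually observe. A sketch (all names illustrative):

    import shutil

    # Snapshot, modify, verify, roll back. The open question is whether
    # health_check() can detect a "detrimental state" at all, since a bad
    # self-modification may fail subtly rather than crash outright.
    def apply_with_rollback(config_path, modify, health_check):
        shutil.copy(config_path, config_path + ".bak")
        modify(config_path)
        if not health_check():
            shutil.copy(config_path + ".bak", config_path)
            return False
        return True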
The paper shows that it can. Take note: this seems to be someone's experiment. If it's not working for you, that's probably because it's not a polished product.
>If you work on a terrible enterprise codebase... it's very possible that software quality/quantity isn't actually that important to your organization.
It's possible capitalism will drive all enterprises to terrible codebases.
I think if these companies had first adopted local models with lower token output, and the learners got to watch the tokens get made, there'd be a lot more understanding.