More

cyanydeez · 2026-05-05T23:50:14 1778025014

is the corollary The Emperors New Clothes?

cyanydeez · 2026-05-05T23:47:32 1778024852

some might think tamificarion of everything was actually prep for gambling on everything.

cyanydeez · 2026-05-05T21:42:59 1778017379

No, no, we cannot question capitalism, verboten! Nein!

cyanydeez · 2026-05-05T21:16:37 1778015797

In sports like Football where CTE is king, there's just not gonna be enough qualified personnel to coach.

jasonfarnon · 2026-05-05T23:41:07 1778024467

No. Few college or professional coaches weren't themselves college or professional players. Think of all those assistant coaches, QB coaches, DB coaches etc.--all players. Mike Leach comes to mind as a rare counterexample.

cyanydeez · 2026-05-05T19:40:54 1778010054

This sounds a lot like another hacker news posted in the last few days. The same problem image generators have with a prompt like, produce numbers 1-50 in a spiral pattern and it can't count properly. But if you break it into a raster/vector where you have it first produce the visual content and then a SVG overlay, it's completely capable.

Have you tried doing a two step: review the image, then render a vector?

julius · 2026-05-05T19:50:58 1778010658

Maybe there is a smart trick to get them to do the right thing, but the things I tried did not work.

At one point I had some smaller model draw bounding boxes around everything that looked interactable and labels like "e3" ... then asked the model to tell me "click on e3". Did not work in my tests was pretty much as bad as x,y.

cyanydeez · 2026-05-05T20:48:23 1778014103

Yeah, I've held off on doing any kind of rag till there's models that properly handle layout detection and partitioning because it's so easy to generate shitty data if you're not properly attending to visual cues first before you slice up a document.

cyanydeez · 2026-05-05T14:53:31 1777992811

One thing AI can power nicely is the anti-SaaS movement. Being able to just boot a cheap PC and test out any of the open source packages is so infinitely easier than piling into all the random credential Bazaars.

But that won't take away the inability of the LLM from confusing whats in dev, whats in production, whats in localhost and whats remote; I've been working on getting a tools/skill for opencode that works with chrome/devtools via a linuxserver.io image. I can herd it to the right _arbitrary_ ports, but every compaction event steers it back to wanting to use the standard 9222 port and all that. I'm tempted to just revert it but there's a security and now, security-through-LLM-obscurity value in not using defaults. Defaults are where the LLM ends up being weak. It will always want to use the defaults. It'll always forget it's suppose to be working on a remote system.

Using opencode, there's no way to force the LLM into a protocols that limits their damage to a remote system or a narrow scope of tools. Yes, you can change permissions on various tools, but that's not the weakness that's exposed by these types of events. The weakness is the LLM is a averaged 'problem solver' so will always tend towards a use case that's not novel, and will tend to do whatever it saw on stackoverflow, even if what you wanted isn't the stackoverflow answer.

nijave · 2026-05-05T21:37:32 1778017052

>But that won't take away the inability of the LLM from confusing whats in dev, whats in production, whats in localhost and whats remote

In my experience, Claude Code with Opus 4.7 tends to assume things are production unless explicitly told otherwise.

>there's no way to force the LLM into a protocols that limits their damage to a remote system or a narrow scope of tools

Might not be able to force it but prompting and context help. An AGENTS.md that explicitly calls out what is and isn't production helps (at least with Claude Code)

Not sure about OpenCode but in Claude Code, memories also help (more injected context)

traderj0e · 2026-05-05T18:50:26 1778007026

AI can also make using AWS directly easier

cyanydeez · 2026-05-05T20:48:58 1778014138

Is there some harness that navigates AWS or are you referring to some complex clie. I've only seen their S3 clie.

traderj0e · 2026-05-05T21:01:28 1778014888

Just having it use the AWS CLI if you're careful about what creds you give it

cyanydeez · 2026-05-05T14:46:43 1777992403

I find it fascinating, all these attempts are goldmining LLMs with a harness and it's clear they're generating all the docs for AI to read and use, even the docs say "we made a MCP for this!" like some how within 2 years people no longer make choices and it's just like AIs roaming the internet trying on harnesses, etc; certainly that'd be a fascinating reality but the verbosity really is a eye-glazing experience. Who do they expect to read all of that ad copy? It's not me.

cyanydeez · 2026-05-05T14:44:12 1777992252

serious question: I've already got a opencode harness running on a local model. It's easily installable via the insecure bash command. It's already tailored with a couple of plugins and with a proper TODO.md and planning, I can get it to loop fine with proper attention to its pratfalls on vague/non-determinant language. It's all running on a AMD 395+ Qwen3-Coder-Next model with ~256k context. opencode has a webui I can put behind a password protected endpoint and keep it busy from anywhere I need to via a simple nginx proxy.

How does this go above and beyond this straightforward opensource, open weights and relatively cheap setup? Do you just get more tokens from SOTA models? Can anyone rationally say the products of token production are quality and secure?

pohl · 2026-05-05T15:03:26 1777993406

You know how OpenCode can be prompted to modify itself when you want to improve it in some way? This just automates that kind of thing.

cyanydeez · 2026-05-05T15:20:40 1777994440

It can't actually; I had to create a systemd service that watched the config path and send a signal to reload the files. It roughly works, but it doesn't actually do the loop correctly.

However, the problem with self-modification is the tendency towards inoperable states. Does it automatically revert when a detrimental state is reached? How does it determine that a modification worked?

pohl · 2026-05-05T15:26:40 1777994800

The paper shows that it can. Take note this seems to be someone’s experiment. If it’s not working for you that’s probably because it’s not a polished product.

cyanydeez · 2026-05-05T14:33:19 1777991599

>If you work on a terrible enterprise codebase... it's very possible that software quality/quantity isn't actually that important to your organization.

It's possible capitalism will drive all enterprise to terrible codebases.

cyanydeez · 2026-05-05T11:10:37 1777979437

I think if these companies first adopted local models with fewer token outs and the learners got to watch the tokens get made, there'd be a lot more understanding.