I've actually written my own homebrew framework like this, which is a.) cli-coder agnostic and b.) leans heavily on git worktrees [0].
The secret weapon of this approach is asking for 2-4 solutions to your prompt, run in parallel. This helps avoid the most time-consuming aspect of ai-coding: reviewing a large commit, only to find that the approach the AI took is hopeless or requires major revision.
By generating multiple solutions, you avoid investing fully in the first one; instead you can use clever ways to select from the 2-4 candidate solutions and usually just apply a small tweak at the end (see the sketch below). Anyone else doing something like this?
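Roughly, the fan-out loop looks like this (a hand-rolled sketch, not agro's actual interface; `your-cli-coder` is a stand-in for aider / claude-code / gemini-cli):

```python
import subprocess

# Sketch: fan one prompt out to N throwaway worktrees so each
# candidate solution lands on its own branch.
BRANCHES = ["sol-1", "sol-2", "sol-3"]   # 2-4 candidates

for name in BRANCHES:
    # isolated branch + working directory per candidate
    subprocess.run(["git", "worktree", "add", "-b", name, f"../wt-{name}"],
                   check=True)
    # launch the agent non-interactively inside that worktree
    # (command and flag are illustrative placeholders)
    subprocess.Popen(["your-cli-coder", "--prompt-file", "task.md"],
                     cwd=f"../wt-{name}")
```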
There is a related idea called "alloying," where the 2-4 candidate solutions are pursued in parallel with different models, yielding better results than any single model. Very interesting ideas.
I've been doing something similar: aider+gpt-5, claude-code+sonnet, gemini-cli+2.5-pro. I want to try coder-cli next.
A main problem with this approach is summarizing the different candidates before drilling down into reviewing the best one.
Looking at a `git diff --stat` across all the model outputs can give you a good measure of whether there was an existing common pattern for your requested implementation. If only one of the models adds code to a module that the others do not, it's usually a good jumping-off point for exploring the differing assumptions each of the agents built towards.
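Something like this is enough to scan the candidates (branch names are placeholders matching the sketch above):

```python
import subprocess

# Compare `git diff --stat` summaries across candidate branches to
# spot outliers before reading any diff in full.
for branch in ["sol-1", "sol-2", "sol-3"]:
    stat = subprocess.run(
        ["git", "diff", "--stat", f"main...{branch}"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(f"=== {branch} ===\n{stat}")
    # a module touched by only one candidate usually marks a
    # differing assumption worth investigating first
```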
This reminds me of an approach in MCMC where you run multiple chains at different temperatures and then share the results between them (replica exchange MCMC sampling), the goal being not to get stuck in one “solution”.
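For anyone unfamiliar, a toy sketch of replica exchange (the bimodal target and temperature ladder are invented for illustration):

```python
import math
import random

# Toy replica exchange (parallel tempering) on a bimodal 1-D target:
# hot chains roam widely, and occasional state swaps let the cold
# chain escape local modes, loosely analogous to cross-pollinating
# several candidate solutions instead of committing to the first one.

def log_target(x):
    # two well-separated Gaussian bumps
    return math.log(math.exp(-(x - 3) ** 2) + math.exp(-(x + 3) ** 2))

temps = [1.0, 2.0, 4.0, 8.0]                     # one chain per temperature
states = [random.uniform(-5, 5) for _ in temps]

for _ in range(10_000):
    # Metropolis update within each chain (tempered acceptance)
    for i, t in enumerate(temps):
        prop = states[i] + random.gauss(0, 1)
        log_a = (log_target(prop) - log_target(states[i])) / t
        if random.random() < math.exp(min(0.0, log_a)):
            states[i] = prop
    # propose swapping states between a random adjacent pair
    i = random.randrange(len(temps) - 1)
    log_a = (1 / temps[i] - 1 / temps[i + 1]) * (
        log_target(states[i + 1]) - log_target(states[i]))
    if random.random() < math.exp(min(0.0, log_a)):
        states[i], states[i + 1] = states[i + 1], states[i]

print("cold chain ended near:", round(states[0], 2))
```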
My favorite pattern I've found is to write encode implementations manually; the AI is then pretty easily able to follow that logic and translate it into a decode function.
Watch how the "Cumulative encoding" row grows each iteration (that's where the BTC address will be encoded) and then look at the other rows for how the algorithm arrives at that.
In particular, under v0.1.0 see the `decode-branch.md` prompt and its associated generated diff, which implements memoization for backtracking while performing decoding.
It's a tight PR that fits the existing codebase and works well; you just need a motivating example you can reproduce, which helps you quickly determine whether the proposed solution is working. I usually generate 2-3 solutions initially and then filter them quickly with a test case. And as you can see from the prompt, it's far from well formatted or comprehensive, just a "slap dash" listing of potentially relevant information, similar to what would be discussed at an informal whiteboard session.
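The memoization-for-backtracking pattern itself is generic. A self-contained toy (the names and the digit encoding here are mine, not from that diff):

```python
from functools import lru_cache

# Memoized backtracking decode: split an ambiguous digit stream into
# valid "ranks" 1..26. lru_cache solves each position once, so
# dead-end branches aren't re-explored on backtracking.

def decode(stream: str) -> list[int] | None:
    @lru_cache(maxsize=None)
    def go(pos: int) -> tuple[int, ...] | None:
        if pos == len(stream):
            return ()                      # consumed everything: success
        for width in (1, 2):               # backtrack over chunk sizes
            chunk = stream[pos:pos + width]
            if len(chunk) == width and chunk[0] != "0" and int(chunk) <= 26:
                rest = go(pos + width)     # memoized subproblem
                if rest is not None:
                    return (int(chunk),) + rest
        return None                        # dead end (cached)

    out = go(0)
    return list(out) if out is not None else None

print(decode("1226"))  # [1, 2, 2, 6]  (first valid parse found)
```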
I'm actually working on something similar to this, where you can encode information into the outputs of LLMs via steganography: https://github.com/sutt/innocuous
Since I'm really only looking to sample the top ~10 tokens, and I mostly test on CPU-based inference of 8B models, there probably isn't much worry about getting a different ordering of the top tokens across hardware implementations. But I'm still going to take a look at it eventually, and build in guard conditions against any choice that would be changed by an epsilon of precision loss.
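One hedged sketch of what such a guard could look like (the tolerance and function names are invented, not from innocuous):

```python
# Hypothetical guard: only treat a top-k token choice as stable if its
# logit is separated from its neighbors by more than an epsilon, so a
# tiny hardware-dependent precision difference can't reorder the list.
EPS = 1e-3  # invented tolerance

def stable_topk(logits: list[float], k: int = 10) -> list[int] | None:
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    top = ranked[: k + 1]
    # any adjacent pair closer than EPS could swap order across backends
    for a, b in zip(top, top[1:]):
        if logits[a] - logits[b] < EPS:
            return None   # ambiguous ordering: skip this step / fall back
    return ranked[:k]

print(stable_topk([3.2, 1.0, 3.1995, 0.5], k=2))  # None: top-2 gap too small
```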
I think the "magic" is that we've found a common toolset of methods - embeddings and layers of neural networks - that reveals useful patterns and relationships across a vast array of corpora, both unstructured analog-sensor data (pictures, video, point clouds) and symbolic data (text, music), and that we can combine these across modalities, as CLIP does.
It turns out we didn't need a specialist technique for each domain: there was a reliable way to architect a model that could learn on its own, and we could already use the datasets we had; they didn't need to be generated through surveys or experiments. This would have seemed like magic to an AI researcher working in the 1990s.
"Unstructured data learners and generators" is probably the most salient distinction for how current system compare to previous "AI systems" examples (NLP, if-statements) that OP mentioned.
At a meta-level, I wonder if there's an under-discussed advantage in poaching ambitious talent out of an established incumbent to work on a new product line in a new organization, in this case Apple Silicon disrupting Intel/AMD. We've also seen SpaceX do this to NASA/Boeing, and OpenAI do it to Google's ML departments.
It seems like large, unchallenged organizations like Intel (or NASA, or Google) collect all the top talent out of school. But changing budgets, shifting business objectives, and frozen product strategies make it difficult for emerging talent to really work on next-generation technology (those projects have already been assigned to mid-career people who "paid their dues").
Then someone like Apple with its M-series chips, or SpaceX with the Falcon 9, comes along and poaches the people most likely to work "hardcore" (not optimizing for work/life balance), while also giving the new product a high degree of risk tolerance and autonomy. Within a few years, the smaller upstart organization has opened up an un-closeable performance gap with the behemoth incumbent.
Has anyone written about this pattern (beyond Innovator's Dilemma)? Does anyone have other good examples of this?
I'm not sure it really takes that kind of breakthrough approach. Apple chips are more energy efficient, but x86 can be much faster on CPU or GPU tasks, and it's much more versatile. A main "bug and feature" here is that the PC industry relies on common-denominator standards and components, whereas Apple has gone vertical with very limited expansion options. This matters particularly for memory speed, where the standards are developed and factories upgraded over years at huge cost.
I gather it's very difficult and expensive to make a board that supports more channels of RAM, so that seems worth targeting at the platform level. Eight-channel RAM using commodity DIMMs would transform PCs for many tasks; for now, though, gamers are the main market force, and they don't really care about memory speed.
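Some napkin math shows why channel count is the lever here (assuming DDR5-5600 and Apple's published peak figures; real sustained bandwidth is lower):

```python
# Rough theoretical peak bandwidth: channels * transfer rate (MT/s) * 8 bytes.
def peak_gbps(channels: int, mts: int = 5600) -> float:
    return channels * mts * 8 / 1000  # GB/s

print(peak_gbps(2))   # ~89.6 GB/s  : typical dual-channel desktop
print(peak_gbps(8))   # ~358.4 GB/s : the eight-channel board proposed above
# For comparison, Apple quotes 400 GB/s (M2 Max) and 800 GB/s (M2 Ultra).
```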
[0]: https://github.com/sutt/agro