I am one of the authors from Sakana AI and MIT. We just released this paper where we hooked up LLMs to the classic 1984 programming game Core War. For those who haven't played it, Core War involves writing assembly programs in a language called Redcode that battle for control of a virtual computer's memory. You win by crashing the opponent's process while keeping yours running. It is a Turing-complete environment where code and data share the same address space, which leads to some very chaotic self-modifying code dynamics.
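To make "code and data share the same address space" concrete, here is a toy sketch (in Python rather than Redcode, and nothing like a full MARS simulator) of the classic one-instruction Imp warrior marching through the core:

```python
# Toy sketch of a Core-War-style machine. Code and data share one circular
# memory, so a program can overwrite its opponent's instructions -- or its own.
CORE_SIZE = 8000
core = [("DAT", 0, 0)] * CORE_SIZE   # DAT is inert data; executing it kills a process

# The classic one-instruction "Imp" warrior: MOV 0, 1
# It copies the instruction at (pc+0) to (pc+1); execution then advances
# into the fresh copy, so the Imp crawls through the entire core forever.
core[0] = ("MOV", 0, 1)

def step(pc: int):
    """Execute one instruction. Returns the next pc, or None if the process dies."""
    op, a, b = core[pc]
    if op == "MOV":                  # operands are pc-relative; the core wraps around
        core[(pc + b) % CORE_SIZE] = core[(pc + a) % CORE_SIZE]
        return (pc + 1) % CORE_SIZE
    return None                      # executed DAT: the process crashes

pc = 0
for _ in range(3):
    pc = step(pc)    # the Imp marches: 0 -> 1 -> 2 -> ...
```

A real battle pits two such programs in the same core; a warrior loses when all of its processes execute an invalid instruction like DAT.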
We did not just ask the model to write winning code from scratch. Instead, we treated the LLM as a mutation operator inside a quality-diversity algorithm called MAP-Elites. The system runs an adversarial evolutionary loop where new warriors are continually evolved to defeat the champions of all previous rounds. We call this Digital Red Queen, after the Red Queen hypothesis in evolutionary biology: species must continually adapt just to survive against ever-changing competitors.
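In sketch form, one round of the loop might look like the following; `llm_mutate`, `behavior`, and `win_rate` are hypothetical stand-ins, not the interfaces from our codebase:

```python
import random

SEED_WARRIOR = "MOV 0, 1"  # start from something trivial, e.g. the Imp

# Hypothetical stand-ins for the real components:
def llm_mutate(warrior: str) -> str: ...                   # prompt the LLM to rewrite the Redcode
def behavior(warrior: str) -> tuple: ...                   # descriptor, e.g. (coverage bin, thread bin)
def win_rate(warrior: str, opponents: list) -> float: ...  # fraction of battles won in the simulator

archive: dict = {}    # MAP-Elites grid: descriptor cell -> (fitness, warrior)
champions: list = []  # hall of fame: the champion from every previous round

def evolve_round(iters: int = 1000) -> None:
    for _ in range(iters):
        parent = random.choice(list(archive.values()))[1] if archive else SEED_WARRIOR
        child = llm_mutate(parent)           # the LLM acts as the mutation operator
        cell = behavior(child)
        fit = win_rate(child, champions)     # fitness is measured against ALL past champions
        # MAP-Elites update: keep the child if its cell is empty or it beats the occupant.
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, child)
    # This round's champion joins the hall of fame for all future rounds to beat.
    champions.append(max(archive.values(), key=lambda fw: fw[0])[1])
```

The quality-diversity part is the grid itself: rather than a single global best, each behavioral niche keeps its own elite, which preserves stepping stones that a pure hill-climber would discard.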
The most interesting result for us was observing convergent evolution. We ran independent experiments starting from completely different random seeds, yet the populations consistently gravitated toward similar behavioral phenotypes, specifically in memory coverage and thread spawning. This mirrors how unrelated biological lineages independently evolve the same traits, such as eyes, when facing the same problems. We also found that this training loop produced generalist warriors that were robust even against human-written strategies they had never encountered during training.
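For concreteness, a behavior descriptor over those two axes could be computed from a battle trace roughly as below; the trace fields, normalization constants, and bin counts are illustrative assumptions, not our actual definitions:

```python
CORE_SIZE = 8000

def descriptor(written_addresses: set[int], peak_processes: int, bins: int = 20) -> tuple[int, int]:
    """Map one warrior's battle trace onto the 2D behavior grid:
    (memory coverage, thread spawning), discretized into `bins` cells per axis."""
    coverage = len(written_addresses) / CORE_SIZE   # fraction of the core touched
    spawning = min(peak_processes, 64) / 64         # normalized concurrent-process count
    return (int(coverage * (bins - 1)), int(spawning * (bins - 1)))

# Example: a warrior that bombed a quarter of the core with 8 concurrent processes
print(descriptor(set(range(2000)), peak_processes=8))   # -> (4, 2)
```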
We think Core War is an under-utilized sandbox for studying these kinds of adversarial dynamics. It lets us simulate how automated systems might eventually compete for computational resources in the real world, but in a totally isolated environment. The simulation code and the prompts we used are open source on GitHub.

Other links besides the blog post:

Paper (website): https://pub.sakana.ai/drq/
arXiv: https://arxiv.org/abs/2601.03335
Code: https://github.com/SakanaAI/drq
> adversarial evolutionary loop where new warriors are continually evolved to defeat the champions of all previous rounds.
Interesting. So you're including past-generation champions in the "fights"? That would intuitively model a different kind of evolution than one driven only by current competitors.
> We also found that this training loop produced generalist warriors that were robust even against human-written strategies they had never encountered during training.
Nice. Curious: did you run any ablations comparing "all previous champions" against "current-generation champions" only?
Very interesting paper, thank you. It makes me wonder what other game substrates could form the basis for adversarial/evolutionary strategy optimization for LLMs, and whether these observations replicate across games.
Since LLMs are text based, a text-based game might be interesting. Something like Nomic?
Or a "meme warfare" game where each agent tries to prompt-inject its adversaries into saying a forbidden codeword, and can modify its own system prompt to attempt to prevent that from happening to itself.
We are the team at Sakana AI. To give some context on the difficulty here: an OpenAI agent placed 2nd in the AtCoder Heuristic Contest (AHC) world tournament last August, so taking 1st place against 804 human participants in this contest is a significant milestone for us. Our agent approached the production planning problem by running its own experiments during the contest. It independently discovered a simulated annealing strategy built on a "virtual power" heuristic, which ended up outperforming the greedy solutions the problem setters anticipated.
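For readers unfamiliar with the technique: simulated annealing explores neighboring solutions and occasionally accepts worse ones, with a probability that shrinks as a "temperature" parameter cools, letting the search escape the local optima that trap greedy methods. A generic sketch follows; the contest-specific move set and the "virtual power" objective are not reproduced here:

```python
import math
import random

def simulated_annealing(initial, score, neighbor, t_start=1.0, t_end=1e-3, iters=100_000):
    """Generic simulated annealing (maximizing `score`): occasionally accept
    worse solutions so the search can escape local optima."""
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    for i in range(iters):
        # Exponentially cool the temperature from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / iters)
        candidate = neighbor(current)
        cand_score = score(candidate)
        # Accept improvements always; accept regressions with Boltzmann probability.
        if cand_score >= current_score or random.random() < math.exp((cand_score - current_score) / t):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best
```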
We used inference-time scaling with GPT-5.2 and Gemini 3 Pro Preview to make this happen. The agent ran parallel code-generation loops to iteratively refine the algorithm, costing about $1,300 in total compute for the 4-hour event. We published the full logs showing the agent's analysis and code evolution at the link in the post.
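As a rough illustration of the pattern (hypothetical helper names, not our actual agent architecture), a parallel refinement loop can be as simple as:

```python
from concurrent.futures import ThreadPoolExecutor

def generate_candidate(best_so_far: str, model: str) -> str: ...  # hypothetical LLM call
def evaluate(solution_code: str) -> float: ...                    # score on local test cases

def refine(seed: str, models=("model-a", "model-b"), rounds=10, width=8) -> str:
    """Best-of-N refinement: each round, spawn parallel candidates that mutate the
    current best solution, score them locally, and keep the winner."""
    best, best_score = seed, evaluate(seed)
    for _ in range(rounds):
        with ThreadPoolExecutor(max_workers=width) as pool:
            candidates = list(pool.map(
                lambda m: generate_candidate(best, m),
                [models[i % len(models)] for i in range(width)],
            ))
        for cand in candidates:
            s = evaluate(cand)
            if s > best_score:
                best, best_score = cand, s
    return best
```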
Happy to answer any questions about the architecture!
If "it" is F'23, then none. GNU Fortran has had the "new" degree-unit trig functions for a while, but no compiler, FOSS or otherwise, has the newly invented features of this revision.
Fortran doesn't prototype features with real implementations (or test suites) before standardizing them, which has led to more than one problem over the years: ambiguities, contradictions, and omissions in the standard aren't discovered until years later, when compiler developers eventually try to make sense of them, leaving lots of incomplete and incompatible implementations. I've written demonstrations of many such cases and published them at https://github.com/klausler/fortran-wringer-tests/tree/main.