Makes me wonder if the hardware engineers look at software engineers and shrug, “they don’t know how their software really works.”
Makes me wonder if C programmers look at JS programmers and shrug, “they don’t understand what their programs are actually doing.”
I’m not trying to be disingenuous, but I also don’t see a fundamental difference here. AI lets programmers express intent at a higher level of abstraction than ever before. So high, apparently, that it becomes debatable whether it is programming at all, or whether it takes any skill, or requires education or engineering knowledge any longer.
Yeah, the online Gemini app is not good for long-lived conversations that build up a body of decisions. The context window gets too large and things get dropped.
What I’ve learned is that once you reach that point you’ve got to break that problem down into smaller pieces that the AI can work productively with.
If you’re about to start with Gemini-cli I recommend you look up https://github.com/github/spec-kit. It’s a project out of Microsoft/GitHub that encodes a rigorous spec-then-implement multi-pass workflow. It gets the AI to produce specs, double-check the specs for holes and ambiguity, plan out the implementation, translate that into small tasks, then check them off as it goes. I don’t use spec-kit all the time, but it taught me what explicit multi-pass prompting can do when the context is held in files on disk, often markdown that I can go in and change as needed. I think it basically comes down to enforcing enough structure in the form of codified processes, self-checks and/or tests for your code.
Pro tip: tell spec-kit to do TDD in your constitution and the tests will keep it on the rails as you progress. I suspect “vibe coding” can get a bad rap due to lack of testing. With AI coding I think test coverage becomes more important.
See https://abseil.io/tips/ for some idea of the kinds of guidance these kinds of teams work to provide, at least at Google. I worked on the “C++ library team” at Google for a number of years.
These roles don’t really have standard titles in the industry, as far as I’m aware. At Google we were part of the larger language/library/toolchain infrastructure org.
Much of what we did was quasi-political … basically coaxing and convincing people to adopt best practices, after first deciding what those practices are. Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Speaking to the original question, no, there were no teams just manually reading code and looking for mistakes. If buggy code could be detected in an automated way, then we’d do that and attempt to fix it everywhere. Otherwise we’d attempt to educate and get everyone to level up their code review skills.
> Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Are you aware how those engineers established their recommendations? Did they maybe perform case studies? Or was it more just a distillation of lived experience type of deal?
I wasn’t in C++ style land, but my recollection is that distilled experience would be backed up by extensive mailing list discussions. In cases of contention the discussion might extend into case studies or other quantitative techniques atop google3. It’s difficult for me personally to describe the (outsized) impact of a super-resourced monorepo for this kind of thing. Also, as GP mentioned, it was sometimes possible to automate changes to comply with updated guidelines.
Some popular streamers have dabbled in OCaml this year, sometimes calling it "the Go of functional programming", which probably set off a small wave of people tinkering with the language. OCaml has also gotten gradually better in recent years in terms of tooling, documentation, standard library, etc.
I think they were saying that Gleam was the Go of functional programming? OCaml may be like Go compared to Haskell, but IMHO Gleam really embraces simplicity and pragmatism.
I would say some other reasons OCaml is similar to Go are that the runtime is very simple, performance is on par, and compilation times are very fast. It also markets itself as a GC'd systems language, similar to Go. I think a seasoned OCaml programmer would be able to guess the generated assembler code.
I suspect that Gleam is quite different in that regard.
In my experience learning a bit of OCaml after Rust, and then looking at Haskell, the three aren't all that different in terms of the basics of how ADTs are declared and used, especially for the simpler cases.
Another way of phrasing my query is that given these are all basically ML-style constructs, why would the examples not be ML? And I was assuming the answer to that is "the sorts of people reading these blogs in 2024 are more familiar with Rust"
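To make the similarity concrete, here’s a toy example of my own (not from any of the posts in question). The Rust spelling below maps almost one-to-one onto the OCaml and Haskell versions noted in the comments:

    // The same sum type reads nearly identically across the three languages:
    //   OCaml:   type shape = Circle of float | Rect of float * float
    //   Haskell: data Shape = Circle Double | Rect Double Double
    enum Shape {
        Circle(f64),
        Rect(f64, f64),
    }

    fn area(s: &Shape) -> f64 {
        match s {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect(w, h) => w * h,
        }
    }

    fn main() {
        println!("{}", area(&Shape::Rect(2.0, 3.0)));
    }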
I think a second reason might be that translating OCaml/Haskell concepts to Python has that academic connotation to it. Rust also (thanks to PyO3) has more affinity to Python than the ML languages. I guess it isn't a surprise that this post has Python, C++, and Rust, all "commonly" used for Python libraries.
> Bad example. Google docs doesn’t use CRDTs but uses OT instead. CRDTs may handle your scenario just fine depending on how they decide to handle this scenario.
The CRDT may pick one or the other replacement word, but who is to say that either choice is correct? Perhaps including both words is correct.
> Then there’s not even a merge conflict...
Agree, this is what CRDTs are all about.
> ...to really worry about.
I think it is important to make clear that CRDTs do not "solve" the merging problem, they merely make it possible to solve in a deterministic way across replicas.
Often, CRDTs do not capture higher level schema invariants, and so a "conflict free" CRDT merge can produce an invalid state for a particular application.
There is also the example above, where at the application level, one particular merge outcome may be preferred over another.
So, it isn't as simple as having nothing to worry about. When using CRDTs, often, there are some pretty subtle things that must be worried about. :-)
I don't agree that a missing "framework" is the whole of the problem. It just isn't that simple.
Sure, people need to use resiliency skills to cope with the stresses of life. Oftentimes, this is an important part of what therapy for depressed people is trying to achieve.
But this isn't to say that there isn't a constellation of causes in recent decades and years that cause the world to be particularly stressful, especially for young people. It also isn't to say that we should dismiss what is occurring in the world today as "the same old stuff" without acknowledging that it may actually have unique properties worth understanding. Off the top of my head: world population is at an all-time high, global warming is becoming increasingly understood, it is increasingly acknowledged that we can no longer simply extract unlimited resources from the earth to solve all problems, the Internet has changed the way the world works that seems to speed everything up: communication, changes within social groups, larger societal shifts, economic change, etc.
I must agree that it is not that simple. That would be highly unlikely.
But how does one measure the impact of recent changes, such as the rise of the internet? Did the invention of the crossbow, the invention of money, of language, of the wheel, not also impact our lives in dramatic ways?
World population has almost constantly been at an all-time high, because it is mostly increasing.
It sure may feel different this time, but if you read the Book of Revelation, or consider 14th century pandemics, our current situation looks like child's play to me.
> It's much more a matter of whether you want to do something small scale and fun, or whether you want to suck all the joy out of it by applying the same soul crushing constraints we already get paid to do in our day jobs. Bleh.
Amen. And further, what better prepares a programmer to assess the relative costs of implementing a thing vs using a library providing that thing than having attempted an implementation?
Learning by doing is a valid approach, and this can even be called fun.
Definitely interested in how you achieved another 2-10x over the btree approach. I wasn’t surprised that btree was as effective as it was, but I’d be curious to know how you squeezed a bit more out of it.
The btree works great, and has barely changed. I made it faster with two tricks:
1. I made my own rope library (jumprope) using skip lists. Jumprope is about 2x faster than ropey on its own. And I have a wrapper around the skip list (called “JumpropeBuf” in code) which buffers a single incoming write before touching the skip list. This improves raw replay performance over ropey by 10-20x iirc. (There’s a rough sketch of the buffering idea at the end of this comment.)
2. Text (“sequence”) CRDTs replicate a list / tree of fancy “crdt items” (items with origin left / origin right / etc). This special data structure needs to be available both to parse incoming edits and generate local edits.
Turns out that’s not the only way you can build systems like this. Diamond types now just stores the list of original edits. [(Edit X: insert “X” position 12, parent versions Y, Z), …]. Then we recompute just enough of the crdt structure on the fly when merging changes.
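If it helps to picture it, an “original edit” record along those lines might look something like this (names and fields made up for illustration; these aren’t diamond types’ actual types):

    // The log stores what the user did, plus which versions the edit was
    // made on top of - not the crdt items derived from it.
    enum EditKind {
        Insert { pos: usize, content: String },
        Delete { pos: usize, len: usize },
    }

    struct OriginalEdit {
        id: usize,           // this edit's version
        parents: Vec<usize>, // the versions it was made on top of ("Y, Z" above)
        kind: EditKind,
    }

    fn main() {
        // "Edit X: insert 'X' at position 12, parent versions Y, Z"
        let _edit = OriginalEdit {
            id: 42,              // hypothetical version numbers, just for the example
            parents: vec![7, 9],
            kind: EditKind::Insert { pos: 12, content: "X".to_string() },
        };
    }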
This has a bunch of benefits - it makes it possible to prune old changes, and it lowers memory usage (you can just stream writes to disk). The network and disk formats aren’t dependent on some weird crdt structure that might change next week. (Yjs? RGA? Fugue?). File size is also smaller.
And the best bit: linear traces don’t need the btree step at all. Linear traces go as fast as the rope. Which, as I said above, is really, really fast. Even when there are some concurrent edits and the btree is created, any time the document state converges on all peers we can discard all the crdt items we generated so far and start again. Btrees are O(log n). This change essentially keeps resetting n, which gives a constant size performance improvement.
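Here’s the shape of that fast path, reusing the hypothetical OriginalEdit type from the sketch above (again, made-up names rather than the real diamond types code):

    // If an incoming edit was made on top of exactly our current version,
    // there is no concurrency to resolve, so it applies straight to the rope.
    fn apply_edit(doc: &mut ropey::Rope, frontier: &mut Vec<usize>, edit: &OriginalEdit) {
        if edit.parents == *frontier {
            // Fast path: rope speed, no btree of crdt items at all.
            if let EditKind::Insert { pos, content } = &edit.kind {
                doc.insert(*pos, content);
            }
            *frontier = vec![edit.id];
        } else {
            // Concurrent edit: this is where crdt items get materialized and
            // merged (the slow path), and they can all be thrown away again
            // once every peer has seen every edit.
        }
    }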
The downside is that the code to merge changes is more complex now. And it’s slower for super complex traces (think dozens of concurrent branches in git).
I’m writing a paper at the moment about the algorithm. Should be up in a month or two.
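And going back to trick 1, here’s a rough sketch of the write-buffering idea, heavily simplified (not the actual jumprope/JumpropeBuf code, just the shape of it): typing produces lots of single-character inserts at adjacent positions, so a run of them can be accumulated and the underlying rope only touched once.

    // Buffers inserts only; deletes and the flush-on-read paths are omitted.
    // Uses ropey here as a stand-in for the skip-list rope.
    struct BufferedRope {
        rope: ropey::Rope,
        buf: String,    // pending run of inserted text
        buf_pos: usize, // char position where the pending run starts
    }

    impl BufferedRope {
        fn new() -> Self {
            BufferedRope { rope: ropey::Rope::new(), buf: String::new(), buf_pos: 0 }
        }

        fn insert(&mut self, pos: usize, text: &str) {
            if pos == self.buf_pos + self.buf.chars().count() {
                // Edit lands right at the end of the pending run: just append.
                self.buf.push_str(text);
            } else {
                // Non-adjacent edit: flush the run, then start a new one.
                self.flush();
                self.buf_pos = pos;
                self.buf.push_str(text);
            }
        }

        fn flush(&mut self) {
            if !self.buf.is_empty() {
                self.rope.insert(self.buf_pos, &self.buf);
                self.buf.clear();
            }
        }
    }

A long typing run then costs one rope insert instead of hundreds.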