On your site you make the claim that: "Our thesis is that there is 100 years of physics and math research that has gone unnoticed by the CS/ML communities and we intend to rectify that."
Extraordinary claims require extraordinary evidence, especially considering that a decent fraction of the CS/ML researchers I know have solid physics and math backgrounds. Just off the top of my head: Marcus Hutter, David MacKay, Bernhard Schölkopf, Alex Smola, Max Welling, Christopher Bishop, etc. are/were prominent researchers with strong math and physics backgrounds. More recently, Jared Kaplan and Dario Amodei at Anthropic also have physics backgrounds, as do plenty of people at DeepMind.
To claim that you have noticed something in "100 years of physics and math research" that all of those people (and more) missed, but you caught, is pure hubris.
Cliche phrase is cliche. And yeah, no shit, we are working on it.
Re: your other points: cool, yeah, there are people in ML who studied physics. But do you feel much of physics has actually made it into ML? Do we have scalable energy-based models? If not, why not?
Is it concerning to anyone else that the "Simple & Reliable" and "Reliable on Longer Tasks" diagrams look kind of like the much maligned waterfall design process?
My opinion is that "agents" are a bad paradigm for working with LLMs, and I am mostly worried that I am wrong about that.
I have been using LLMs since I got my first OpenAI API key, and I think "human in the loop" is what makes them special.
I have massively increased my fun, and significantly increased my productivity using just the raw chat interface.
It seems to me that building agents to do work that I am responsible for is the opposite of fun, and a productivity sink: I have to check for, and correct, the rare but bananas mistakes these agents inevitably make.
The thing is, the same agent that made the bananas mistake is also quite good at catching that mistake (if called again with fresh context). This results in convergence on working, non-bananas solutions.
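That fresh-context convergence idea can be sketched as a small control loop. This is a toy illustration, not anyone's actual product: `generate` and `review` stand in for separate LLM calls (stubbed here with made-up functions so the loop runs on its own), and the key detail is that the reviewer sees only the task and the candidate, never the history that produced the mistake.

```python
def review_loop(task, generate, review, max_rounds=3):
    """Generate a solution, then re-check it in a fresh context.

    `generate` and `review` are placeholders for separate model calls;
    the reviewer gets only the task and candidate, no chat history.
    """
    candidate = generate(task, feedback=None)
    for _ in range(max_rounds):
        verdict = review(task, candidate)  # fresh context each time
        if verdict == "ok":
            return candidate
        candidate = generate(task, feedback=verdict)
    return candidate

# Toy stand-ins: the first draft is bananas, the reviewer catches it.
def toy_generate(task, feedback=None):
    return "fixed solution" if feedback else "bananas solution"

def toy_review(task, candidate):
    return "ok" if candidate == "fixed solution" else "this is bananas: redo it"

print(review_loop("add two numbers", toy_generate, toy_review))
# prints "fixed solution"
```

Whether the loop actually converges in practice depends on the reviewer catching more mistakes than it introduces, which is exactly the point under debate.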
Look up "There Was an Old Lady Who Swallowed a Fly". Or "The King, the Mice and the Cheese".
What you propose makes things worse, not better
LLMs are magnificent tools, but there needs to be a human hand holding them.
Nothing I have seen anywhere yet challenges my view that "agents" will not be a good idea until we have better technology than LLMs, of which there is no sign so far (?).
Just to be clear, I wasn't claiming that "communicating clearly" is a new idea in software engineering, I'm mainly commenting on how effective embracing it can be.
When doing math, pretty much every term is "load-bearing" in that arguments will make use of specific aspects of a concept and how it relates to other concepts.
If you look at most graduate-level math textbooks or papers, they typically start with a whole bunch of numbered definitions that reference each other, followed by some simple lemmas or propositions that establish simple relationships between them before diving into more complex theorems and proofs.
The best software projects I've seen follow a roughly similar pattern: there are several "core" functions or libraries with a streamlined API, good docs, and solid testing; on top of that there are more complex processes that treat these as black-boxes and can rely on their behavior being well-defined and consistent.
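The layered pattern described above can be shown with a deliberately tiny example (names are made up for illustration): a "core" primitive with a documented contract, and a higher-level routine that treats it as a black box and relies only on that contract.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Core primitive: return x limited to the interval [lo, hi].

    Contract: requires lo <= hi; the result always lies in [lo, hi].
    """
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    return max(lo, min(x, hi))

def normalize_scores(scores):
    """Higher-level process: depends only on clamp's documented contract,
    not on how it is implemented."""
    return [clamp(s, 0.0, 1.0) for s in scores]

print(normalize_scores([-0.5, 0.25, 2.0]))  # [0.0, 0.25, 1.0]
```

The point is the dependency direction: `normalize_scores` can be reasoned about using only `clamp`'s one-line contract, the same way a theorem leans on a numbered definition.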
Probably the common thread between math and programming is that both lean heavily on abstraction as a core principle.
As someone who taught myself 68000 assembler as a kid in order to render Mandelbrot and Julia sets quickly, it still blows my mind a little that fairly hi-res versions of these can be rendered basically instantaneously in a browser using an interpreted language.
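Part of why those browser renders are feasible is that the escape-time algorithm at the core is tiny; here is a toy, unoptimized Python sketch of the per-pixel computation:

```python
def mandelbrot_iterations(c: complex, max_iter: int = 100) -> int:
    """Count iterations before z -> z**2 + c escapes |z| > 2."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter  # never escaped: treat as inside the set

# Points inside the set never escape; points well outside escape fast.
print(mandelbrot_iterations(0j))      # 100 (in the set)
print(mandelbrot_iterations(1 + 1j))  # escapes after 1 iteration
```

A full renderer just runs this over a grid of `c` values and maps the iteration count to a colour, which is why even an interpreted language gets away with it on modern hardware.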
Similar(ish) although I only really got as far as BASIC on a 80286 running DOS 3.something!
I did manage to get something in C to compile and work with hard-coded co-ordinates, but it took me ages and didn't float my boat, though it was rather faster 8) I suppose I'll always be a scripter.
I had a copy of the "Beauty of Fractals" and the next one too (can't remember the name). I worked in a books warehouse as a holiday job before Poly (UK Polytechnic - Plymouth) and I think I persuaded my parents to buy me the first and the second may have fallen off a shelf and ended up in the rejects bin. I got several text books for Civil Engineering too, without even needing to cough drop them myself.
One of the books had pseudo code functions throughout which even I could manage to turn into BASIC code. I remember first seeing a fern leaf being generated by a less-than-one-screen (VGA) program which used an Iterated Function System (IFS) and, I think, a starter matrix with carefully chosen parameters.
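That fern program was almost certainly the Barnsley fern: four affine maps applied at random with fixed probabilities, using the classic published coefficients. A sketch really does fit in well under a screen:

```python
import random

# Barnsley fern IFS: (probability, affine map) pairs, classic coefficients.
MAPS = [
    (0.01, lambda x, y: (0.0, 0.16 * y)),                                    # stem
    (0.85, lambda x, y: (0.85 * x + 0.04 * y, -0.04 * x + 0.85 * y + 1.6)),  # main frond
    (0.07, lambda x, y: (0.20 * x - 0.26 * y, 0.23 * x + 0.22 * y + 1.6)),   # left leaflet
    (0.07, lambda x, y: (-0.15 * x + 0.28 * y, 0.26 * x + 0.24 * y + 0.44)), # right leaflet
]

def barnsley_points(n: int, seed: int = 0):
    """Iterate the IFS from (0, 0); the points trace out the fern."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    pts = []
    for _ in range(n):
        r, cumulative = rng.random(), 0.0
        for p, f in MAPS:
            cumulative += p
            if r <= cumulative:
                x, y = f(x, y)
                break
        pts.append((x, y))
    return pts

pts = barnsley_points(10_000)  # plot these and the fern appears
```

Plotting `pts` with any graphics library reproduces the on-screen fern; the shape is entirely encoded in those sixteen-odd coefficients.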
I also had to convince my parents to buy me books about fractals. My prized possession as a 15 year old was a copy of Mandelbrot's "Fractal Geometry of Nature". A lot of it went over my head but it had some gorgeous colour plates and interesting sections. I still have it at home some 35 years later.
That also inspired me to write IFS code for ferns, Sierpinski gaskets, and Menger sponges in 68k assembler (after realizing AmigaBASIC was too slow).
I spent many hours experimenting with Fractint, trying to get the inner and outer coloring just right, along with the zoom magnification that I could handle walking away from the computer for long enough to get something interesting. The worst was zooming somewhere that looked interesting, and coming back many hours later to find out you had nothing of value.
I spent my early teen years addicted to Fractint, before I could even really show off my creations except in person to my friends. I still look back at those days as more interesting with computers than now. Maybe I need to go back and write my own software to render fractals (or work on existing fractal software and see if I can improve it). In the mid-2000s, I was using GnoFract4D to render fractals, and the results were far more impressive. A change in GNOME or Ubuntu created an issue with the render window for me, and I ended up abandoning it.
A lot of those skills come from thinking about development in a team as a system and asking: where do things frequently go wrong or take too long?
Practice clearly and concisely expressing what you understand the problem to be. This could be a problem with some code, some missing knowledge, or a bad process.
Check to see whether everyone understands and agrees. If not, try to target the root of the misunderstanding and try again. Sometimes you’ll need to write a short document to make things clear. Once there is a shared understanding, people can start talking about solutions. Once everyone agrees on a solution, someone can go implement it.
Like any skill, if you practice this loop often enough and take time to reflect on what worked and what didn’t, you slowly find that you develop a facility for it.
These visualizations are much nicer than mine though.
Curious fact: the Bregman divergences are a different class of divergences from the f-divergences, and the two classes intersect at the KL divergence. That is, KL is (essentially) the only divergence that is both an f-divergence and a Bregman divergence. This is basically because log turns a ratio into a difference.
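Concretely, using the standard definitions (with the usual choices $f(t) = t \log t$ and negative-entropy generator $\varphi$):

```latex
% KL as an f-divergence, with f(t) = t log t:
D_f(P \,\|\, Q) = \sum_i q_i \, f\!\left(\tfrac{p_i}{q_i}\right)
  \;\xrightarrow{\; f(t) = t \log t \;}\; \sum_i p_i \log \tfrac{p_i}{q_i}

% KL as a Bregman divergence, with generator phi(p) = sum_i p_i log p_i
% (negative entropy), evaluated on the probability simplex:
D_\varphi(p, q)
  = \varphi(p) - \varphi(q) - \langle \nabla \varphi(q),\, p - q \rangle
  = \sum_i p_i \log \tfrac{p_i}{q_i}
```

The second equality uses $\sum_i p_i = \sum_i q_i = 1$; off the simplex an extra $\sum_i q_i - \sum_i p_i$ term appears.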
I always found this one a little poignant: