tinygrad’s small set of operations and laziness made it easy to implement. Though my overall sense is that neural network verification is currently more of a research interest than something practical.
I wonder if the authors have tried incorporating error feedback from Lean into their models.
Work from 2023 [1] showed that general-purpose models did better when they were able to incorporate error feedback; humans incorporate error feedback too, but none of the SOTA models on miniF2F seem to.
In a way, DeepSeek Prover's subgoal decomposition is a partial step toward error/proof-state feedback. (DS Prover breaks down a proof into subgoals and attacks each subgoal separately with batched sampling, then puts the pieces back together.)
This is distinct from the approach of the previous SOTA for an open-weights model (Kimina Prover) which generated at the full-proof level.
While it was very impressive to see Kimina's ability to generate medium-length proofs (think AIME-level problems) without subgoals or feedback at intermediate steps, it's likely that at least subgoal decomposition will be required for longer proofs (think IMO-level problems).
I certainly agree that where and how error/proof state feedback is best incorporated (training data synthesis / reward function / CoT during inference / etc.) is a fascinating area of research. (It's rumored that GDM's AlphaProof does use proof state / lean feedback already.)
There’s something compelling about these, especially w.r.t. their ability to generalize. But what is the vision here? What might these be able to do in the future? Or, philosophically speaking, what do these teach us about the world? We know a 1D cellular automaton can be Turing complete, so, at least from one perspective, NCAs aren’t terribly surprising.
these are going to be the dominant lifeforms on earth exceeding bacteria, plants and humans in terms of energy consumption
cellular automata that interact with their environment, ones that interact with low-level systems and high-level institutions. to some approximation we humans are just individual cells interacting in these networks.
the future of intelligence aint llms, but systems of automata with metabolic aspects. automata that co-evolve, consume energy and produce value. ones that compete, ones that model each other.
we're not being replaced, we're just participants in a transformation where boundaries between technological and cellular systems blur and eventually dissolve. i'm very thankful to be here to witness it
I'll have what this guy is smoking. Those visualizations are pretty, though.
I can imagine this being useful for implementing classifiers and little baby GenAI-adjacent tech on an extremely tiny scale, on the order of several hundred or several thousand transistors.
Example: right now, a lot of the leading-edge biosensors have to pull data from their PPG/ECG/etc chips and run it through big fp32 matrices to get heart rate. That's hideously inefficient when you consider that your data is usually coming in as an int16 and resolution any better than 1bpm isn't necessary. But, fp32 is what the MCU can do in hardware so it's what you gotta do. Training one of these things to take incoming int16 data and spit out a heart rate could reduce the software complexity and cost of development for those products by several orders of magnitude, assuming someone like Maxim could shove it into their existing COTS biosensor chips.
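For flavor, here's a minimal sketch of the fixed-point idea in Python/NumPy. The layer shape, the weight quantization scheme, and the `shift` parameter are all invented for illustration, not taken from any real biosensor SDK:

```python
import numpy as np

def fixed_point_dense(x_int16, w_q, shift):
    """Quantized dense layer: int16 input, integer weights, int32 accumulate,
    then an arithmetic right shift in place of a float rescale.
    Hypothetical scheme: real weight w is approximated by w_q * 2**-shift."""
    acc = x_int16.astype(np.int32) @ w_q.astype(np.int32)  # exact in int32
    return (acc >> shift).astype(np.int16)

# int16 samples in, int16 outputs out; no fp32 anywhere.
x = np.array([100, -200, 300], dtype=np.int16)
w_q = np.array([[64], [32], [16]], dtype=np.int8)  # i.e. weights 1.0, 0.5, 0.25
print(fixed_point_dense(x, w_q, shift=6))  # [75], matching the float result
```

Whether a vendor could actually bake something like this into a COTS sensor chip is a separate question, but the arithmetic itself needs nothing beyond an integer MAC and a shifter.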
yes absolutely: current systems are wildly inefficient. the future is one of extreme energy efficiency.
re smoking: sorry let me clarify my statement. these things will be the dominant life forms on earth in terms of metabolism, exceeding the energy consumption of biological systems, over 1k petawatt hours per year, dwarfing everything else
the lines between us may blur metaphorically; we'll be connected to them the way we're connected to ecosystems of plants and bacteria. these systems will join and merge the same way we've merged with smartphones -- but on a much deeper level
Okay so another way to put it is that these are gonna be the software we run on lots of computers in the future. Why this particular model of intelligence and not some other one?
If "we" all got together and reduced permitting costs, "our" homes would be much more valuable because a developer can build a highrise where the house is.
This is better known as the Table Maker's Dilemma.
Say you have some computable number p, that means you can compute a (rational) approximation p' to p within any given tolerance eps > 0 (i.e. you know |p - p'| < eps). To determine whether p > 0, p = 0, or p < 0, you compute an approximation p' to a certain tolerance eps. If p' > eps then you know p > 0, if p < -eps then you know p < 0, otherwise you need a better approximation... Without further knowledge about p, there is no point where you can assert p = 0.
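The refinement loop above can be sketched in Python. Here `approx(n)` is an assumed interface returning a rational within 2^-n of the true value; the procedure halts exactly when p != 0:

```python
from fractions import Fraction

def sign(approx):
    """Semidecide the sign of a computable real p.

    approx(n) is assumed to return a rational p' with |p - p'| < 2**-n.
    Halts with +1 or -1 whenever p != 0; loops forever when p == 0, since
    no finite precision can rule out a nonzero digit further out."""
    n = 0
    while True:
        eps = Fraction(1, 2 ** n)
        p_approx = approx(n)
        if p_approx > eps:     # then p > p_approx - eps > 0
            return 1
        if p_approx < -eps:    # then p < p_approx + eps < 0
            return -1
        n += 1                 # inconclusive: refine and try again

print(sign(lambda n: Fraction(1, 3)))  # 1 (halts at n = 2, where 1/3 > 1/4)
```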
Only x=0 is undecidable. It's because you have to check an infinite number of digits past the decimal point to see if all of them are zero, and that will not halt.
If x and y are the same number, then a Turing machine that compares the digits of x and y to determine whether x < y will never terminate. Doesn’t this mean that x < y is not decidable?
> A Turing machine has an infinitely long tape, and even if it takes 10 times the time to the heat death of the universe, x<y will halt.
> Finite time is not practical time.
Sorry for my daftness but if x and y are both a transcendental number, for example pi, then x and y have infinitely many digits so the Turing machine that determines whether x < y is true will not halt, even with an infinitely long tape, or will it? What am I misunderstanding?
If you could recommend a source that explains the basics of these arguments, I’d appreciate it.
For an introduction to real analysis there is an old book, Principles of Mathematical Analysis by Walter Rudin, typically called 'Baby Rudin'.
It is tiny and expensive, as most college math textbooks are once you get past a certain level. No idea if there is something better now; it is from the 1950s, and there are used copies in most used book shops for cheap (or there were). It has no definition of 'equals' but does have a concept of equivalence relations.
Dummit and Foote's Abstract Algebra is a great reference, especially given the problem of AI pollution. Coming from a General Relativity background, it still took me two years to work through on my own.
I don't know any suggestions on a more modern approach of starting with type or category theory.
I am not really interested in constructivist math, it is a problem to get around for me, so maybe someone who is can chime in on a better path.
While not rigorous, this paper does a good job at explaining just how muddy the concept of 'equality' is.
Haven't seen the paper, but I'd guess the argument is something like "if the domain for x includes 0, then [edit: whenever x = 0] the Turing machine will get into an infinite loop looking for the first inverted bit[1]. _None_ of the three states will ever be provably true or false. If the domain _excludes_ 0 then you can prove that the machine will hit a non-zero bit in finite time, thus terminate, and either x<0 or x>0 will be shown true. But if you exclude 0 then you've also shown that the =0 state will never be true."
[1] multiple zero bits and a one bit is >0, multiple 1 bits and a zero bit is <0. Infinite zero bits (even if followed by a one bit) is +0, infinite 1 bits (even if followed by a zero bit) is -0.
An exact comparison is undecidable. You can only say if the numbers sufficiently differ. This is equivalent to saying that x > 0 + ε or x < 0 - ε, where ε > 0. If you represent numbers in binary, ε = 2^(-n), where n is the maximum number of bits you allocate to represent a number. Obviously you have n < ∞ in any finite machine.
Nope. You're thinking about this mathematically, not algorithmically. The question is essentially to decide whether the output of a TM is all zeros or whether it will eventually output a 1. If it outputs a 1 you will know that in a finite amount of time by running the program, but if it never outputs a 1 then you can't tell that by running the program, and it's impossible in general to tell any other way because that would be tantamount to solving the halting problem.
Here is a concrete example: write a program that systematically tests all numbers for counterexamples to the Goldbach conjecture. Every time a number fails to be a counterexample, output a zero; otherwise output a 1. Consider the output of that program to be a real number expressed as a binary decimal. If that number is not zero you will know in a finite amount of time, but if it is zero you will not know that until someone proves the Goldbach conjecture.
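Here's a runnable sketch of that program, truncated to a finite prefix of the bit stream (the real one never stops; the trial-division primality test is just the simplest thing that works):

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_holds(n):
    """True if the even number n is a sum of two primes."""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

# One output bit per even number >= 4: 0 = not a counterexample, 1 = counterexample.
# Read as a binary decimal, the infinite stream is zero iff the conjecture
# is true -- and running the program can never confirm that.
bits = [0 if goldbach_holds(n) else 1 for n in range(4, 100, 2)]
print(bits.count(1))  # 0: no counterexample below 100
```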
The order relation on computable reals is undecidable - this is well known, and the parent's argument is correct. If both x<y and x>y were decidable then you could easily decide equality too by just checking them and seeing if the answer to both was "no".
Tracing back through this thread, the term "decidable" was introduced in a very unfortunate way:
ekez: why is it impossible to decide whether x<0, x=0, or x>0 as in Example 1?
lisper: Only x=0 is undecidable. It's because you have to check an infinite number of digits past the decimal point to see if all of them are zero, and that will not halt.
teraflop: That cannot be true, because if both x<0 and x>0 are decidable then x=0 is also decidable.
Both lisper (me) and teraflop are correct. The problem is that inequality is not decidable, it's semi-decidable. It was teraflop who introduced the word "decidable" into the discussion, not me, but I didn't pay enough attention to that at the time.
Right, I agree that they are semidecidable, thanks for clarifying. My interpretation of "only x=0 is undecidable" was that you were claiming the other two were decidable (which is a reasonable reading imo). But it sounds like we don't actually disagree here.
> Nope. You're thinking about this mathematically, not algorithmically.
I don't know what you mean by this, and I'm not sure how it relates to what I said. I'm using "decidable" in the strict, computer-science sense of the word.
The statement in example 1 of the paper, which we're discussing, is about computable real numbers. A computable real number is one for which there is a Turing machine that can calculate it to any desired finite precision in finite time.
A semidecidable problem is one for which there is a program that halts if the answer is "yes", but may not halt if the answer is "no". The halting problem is in this category.
Given a computable real number x, asking whether x<0 or x>0 are semidecidable but not decidable problems.
x > 0 and x < 0 are both semidecidable. Which is to say, each can be confirmed when it's true, but not refuted when it's false. So if either of them is true, running both in parallel and halting once one halts will terminate, but if neither of them is true, it may not halt.
If you remove 0 from possible inputs, you no longer need to worry about that possibility, and the problem is then decidable.
Edit distance seems like a strange way to test whether the model understands arithmetic ([1], Figure 3). I think `1+3=3` would count as equally correct as `1+1=9`?
Why not consider how far off the model is `abs(actual-expected)`? I wonder if there is an inflection point with that metric.
It depends on how you do arithmetic. If you're a human and you do column addition, 12345+35791=58136 is just as big of a mistake as 48146 (the actual result is 48136). It's just one mistaken column in both. Binary half-adders work the same way.
We don't really know how LLMs do arithmetic. Maybe token edit distance would be interesting, but either way it doesn't really change the claim of the paper.
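For what it's worth, the two metrics do come apart on the column-addition example above. A quick check with plain Levenshtein distance on the digit strings (which may differ from the paper's exact token-level metric):

```python
def edit_distance(a, b):
    """Plain Levenshtein distance via the classic one-row DP."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        # 'prev' threads the diagonal value through the row update
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,      # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[len(b)]

true_sum = str(12345 + 35791)  # "48136"
for guess in ("58136", "48146"):
    print(guess, edit_distance(guess, true_sum), abs(int(guess) - int(true_sum)))
# both guesses are one edit away, but their absolute errors are 10000 vs 10
```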
Yeah, and as an aside I wonder how hard it would be to train an LLM to do addition, multiplication, etc, human style? Presumably it should be possible at least in step-by-step style (as substitute for short-term memory), the same way that we do it.
Without using an algorithmic approach, it seems an LLM can only learn a bunch of partially correct heuristics, and attempt to generalize over examples.
I've played with this a bit in the past, and came to the conclusion that GPT-3 seems to have learnt to compare the size of numbers (whether accurately or via heuristics), and would get the approximate size of an answer right (depending on the task), even if not the actual value right. I seem to recall it also doing this for tasks like asking for a prime number greater than a particular value.
I mean, is it efficient to teach them addition human style instead of heuristics of when to call the right function?
Imagine you could say 'calc' in your brain, and some separate subcomponent of your brain that is far more power efficient could return an answer almost instantly. You would not focus on understanding addition/subtraction; instead you would focus on when to use addition/subtraction/multiplication or whatever.
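That division of labor is basically function calling. A toy sketch, where the routing rule and the `model` fallback are invented for illustration:

```python
import re

def answer(prompt, model):
    """Toy router: plain integer arithmetic goes to an exact 'calc'
    subcomponent; everything else falls through to the (hypothetical) model."""
    m = re.fullmatch(r"\s*(-?\d+)\s*([+*-])\s*(-?\d+)\s*", prompt)
    if m:
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        return {"+": a + b, "-": a - b, "*": a * b}[op]
    return model(prompt)

print(answer("12345 + 35791", None))            # 48136, without touching the model
print(answer("what is love", lambda p: "..."))  # falls through to the model
```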
> is it efficient to teach them addition human style
Not if math is your only goal, but there'd be value in making the models more powerful so that they could learn to do simple things like this (and not so simple things) by themselves. You can't have a tool for everything, and hopefully future AGI can itself do things that are more than just a mashup of existing tool capabilities.
> ChatGPT specifically seems to absolutely adore the phrase [complex and multifaceted], using it at every opportunity to explain higher level concepts.
I'm curious why this is. Before the echo chamber, did the LLM learn some ontology of speech?
The title is an allusion to the experiment which samples the output of the 12th layer of LLaMA-7b. Surprisingly, there is a fair bit of structure in the principal components of those samples.
Usually, tagged memory at the hardware level goes along with support for tagged operations in the processor.
In the LISP machines, for example, you had an add instruction which would happily work correctly on pointers, floats, and integers, depending on the data type, at the machine-code level. That offers safety and also makes the compiler simpler.
But where this really shines is in things like, well, lists: since the tags can distinguish atoms from pairs and values like nil, fairly complex list-walking operations can be done in hardware, and pretty quickly too. It also makes hardware implementation of garbage collection possible.
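Roughly how such a generic add might dispatch on tags, modeled in Python with low-bit tags (the tag layout here is invented; real Lisp machines used wider tag fields and did the check in parallel hardware):

```python
# Toy model of hardware tag dispatch: the low two bits of each word are a type
# tag, and a single generic ADD checks tags before picking an operation.
TAG_INT, TAG_NIL, TAG_PAIR = 0b00, 0b01, 0b10

def make_int(n):
    return (n << 2) | TAG_INT   # fixnum: payload shifted past the tag bits

def tag_of(word):
    return word & 0b11

def generic_add(a, b):
    if tag_of(a) == TAG_INT and tag_of(b) == TAG_INT:
        return make_int((a >> 2) + (b >> 2))
    # on real hardware this would trap to a handler for floats, bignums, etc.
    raise TypeError("ADD trap: operands are not both fixnums")

print(generic_add(make_int(2), make_int(3)) == make_int(5))  # True
```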
This is just my intuition, but I suspect, these days, it all works out to about the same thing in the end. You use some of the cache for code that implements pointer tagging, or you can sacrifice some die area away from cache for hardwired logic doing the same thing. It probably is in the same ballpark of complexity and speed.
Speed, as is usual with hardware vs. software implementations of things. But the difference these days is smaller than it used to be, because processors spend most of their time stalled on all kinds of things, and the ALU processing capacity is underused most of the time.
An interesting question for a modern ISA design would be to figure out how to make tagged words and memory work well with SIMD.
> An interesting question for a modern ISA design would be to figure out how to make tagged words and memory work well with SIMD.
Not sure what the issue might be. Let’s say you’re doing a multiply-add: you’d call the one for the data type you want and if any operand were of the wrong type you’d get a fault. Am I missing something?
Consider that your SIMD instruction might itself take a tag mask, so the ALU need only do an equality check on the tag field. In fact it could do that in parallel with the ALU op; on a mismatch you could simply discard the current state and abort the operation. Realistically, though, you'd want the same set of SIMD instructions as an untagged architecture anyway.
Also I expect any compiler would assume that the contents of an array subject to SIMD computation would be homogeneous anyway, perhaps trying to enforce it elsewhere.
In any case this doesn’t seem like a big deal to me…but I could be wrong!
I guess to really get into it a good start would be to work with existing SIMD and take a quantitative approach to what is actually the most hot part of it. I wonder if any existing language implementations (eg Common Lisp) attempt to do these kinds of things in the first place.
https://github.com/0xekez/tinyLIRPA