This is lovely. I’ve had a vague notion that I wanted something like this… but n...

surprisetalk · on April 28, 2023

This is one great question I've wrestled with over the years.

Right now, all three programs would result in a different hash. I decided that the scrapyard publication process should do as little "magic" as possible.

As the tooling evolves, we can provide a canonicalizer scrap that renames variables and simplifies the code. I think eventually we would make this the default process to encourage more reuse.

alephaleph · on April 28, 2023

You might want to look into the way Unison does things: [1], they've got a lot of similar ideas and IIRC the way their hash system works avoids that ambiguity.

[1]: https://www.unison-lang.org/learn/the-big-idea/

surprisetalk · on April 28, 2023

Wow, that's a great guide!

I didn't read it in-depth yet, but I think I had a previous implementation that worked very similarly to theirs.

The problem with variable names is that they actually matter for some of the collaborative editor features I hope to implement in the future. There are certain assumptions I can't make quite yet, and so am keeping my options open :)

pcl · on April 28, 2023

Have you thought about using type aliases or higher order types of some sort for that instead of relying on the variable name?

I could define that a familyname and a givenname are strings, and then a function argument of type familyname would be differentiated from a function argument of type givenname, but still the argument names themselves could be dropped from the hash.

sagebird · on April 28, 2023

Thinking a bit more:

A-If a program is a thing that produces a result for a given input, these programs are the same.

B-You can’t generally know if two programs are the same without solving halting problems, reducing polynomial algos to linear time or having infinite time and space computing resources, etc.

C-The best you can do is ask if two compilers produce the same output, or a linting/refactoring/canonicalization/formatting operation results in same source.

D-If a program is a thing that interactive debuggers use to create explorations next to source code while computing, these programs are different.

I agree with your approach of starting with these programs as being different, and allowing custom scraps to become a higher layer hash resolver. The reason is that compilers change over time, and it is only under the lens of a compiler(or linter, etc) that comparisons can be made. To bake a holy compiler in today would be arbitrary and quickly outdated.

xigoi · on April 28, 2023

Due to the halting problem, it's not possible to tell in general if two programs are equivalent.

amelius · on April 28, 2023

As far as I know, nobody ever proved that some programs for which halting cannot be determined are smaller in size than the number of atoms in the universe. Or in other words, that the theorem applies to practical programs. It's purely theoretical.

xigoi · on April 28, 2023

Here is a Python program which is definitely smaller than the number of atoms in the universe. Can you tell me whether it halts?

  n = 1
  while sum(d for d in range(1,n) if n % d == 0) != n:
      n += 2

scythmic_waves · on April 29, 2023

Oh this is a good one. It halts if there’s an odd perfect number, right?

xigoi · on April 29, 2023

amelius · on April 28, 2023

If I can't then that's no proof that the halting theorem is true for practical programs.

xigoi · on April 29, 2023

If it's so hard even for such a simple program to determine whether it halts, why do you expect it to not be a problem for much more complex programs?

amelius · on April 29, 2023

It absolutely is a problem. But don't say that there is no solution to it unless you have a proof.

I'd be impressed if you gave me a program for which it provably cannot be proved that (edit: if) it halts. Or a bound on its size.

xigoi · on April 29, 2023

> I'd be impressed if you gave me a program for which it provably cannot be proved that it halts. Or a bound on its size.

For every program that halts, it's obviously possible to prove that it halts. The point is that there is no algorithm for deciding if an arbitrary given program halts.

amelius · on April 29, 2023

> there is no algorithm for deciding if an arbitrary given program halts.

But can you prove that statement if I change it into:

there is no algorithm for deciding if an arbitrary given program with size less than N halts; for some suitable definition of size.

xigoi · on April 29, 2023

Since we're considering practicality, let's change the statement to: “There is no algorithm for deciding if an arbitrary given program with size less than 1000 bytes halts which can be implemented without solving several famous open mathematical problems”. I think my program pretty clearly shows that this is true.

dragonwriter · on April 29, 2023

[never mind, misread the conditiom]

xigoi · on April 29, 2023

Why? This gives sum([]) != 1, which is true. Have you actually tried running the program?