I've been following Unison for a long time, congrats on the release!
Unison is among the first languages to ship algebraic effects (aka Abilities [1]) as a major feature. In early talks and blog posts, as I recall, you were still a bit unsure about how it would land. So how did it turn out? Are you happy with how effects interact with the rest of the language? Do you like the syntax? Can you share any interesting details about how it's implemented under the hood?
> Unison is among the first languages to ship algebraic effects (aka Abilities [1]) as a major feature. In early talks and blog posts, as I recall, you were still a bit unsure about how it would land. So how did it turn out?
No regrets! This Abilities system is really straightforward and flexible. You find yourself saying that you don't miss monads, if you were already of the FP affiliation. But were glad that it means that you don't have to understand why a monad is like a burrito to do FP.
> Do you like the syntax?
So this one is very loaded. Yes we LOVE the syntax and it is very natural to us, but that is because most of us that are working on the language had either already been fluent in haskell, or at least had gotten to at least a basic understanding to the point of of "I need to be able to read these slides". However we recognize that the current syntax of the language is NOT natural to the bulk of who we would like to be our audience.
But here's the super cool thing about our language! Since we don't store your code in a text/source code representation, and instead as a typechecked AST, we have the freedom to change the surface syntax of the language very easily, which is something we've done several times in the past. We have this unique possibility that other languages don't have, in that we could have more than one "surface syntax" for the language. We could have our current syntax, but also a javascript-like syntax, or a python-like syntax.
And so we have had lots of serious discussions recently about changing the surface syntax to something that would be less "weird" to newcomers. The most obvious one being changing function application from the haskell style "function arg1 arg2" style to the more familier "c?" like style of "function(arg1, arg2)". The difficulties for us will be trying to figure out how to map some of our more unique features like "what abilities are available during function application" onto a more familiar syntax.
So changing the syntax is something that we are seriously considering, but don't yet have a short term plan for.
What is the data you actually store when caching a successful test run? Do you store the hash of the expression which is the test, and a value with a semantics of "passed". Or do you have a way to hash all values (not expressions/AST!) that Unison can produce?
I am asking because if you also have a way to cache all values, this might allow to carry some of Unison's nice properties a little further. Say I implement a compiler in Unison, I end up with an expression that has a free variable, which carries the source code of the program I am compiling.
Now, I could take the hash of the expression, the hash of the term that represents the source code, i.e., what the variable in my compiler binds to, and the hash of the output. Would be very neat for reproducibility, similar to content-addressed derivations in Nix, and extensible to distributed reproducibility like Trustix.
I guess you'll be inclined to say that this is out of scope for your caching, because your caching would only cache results of expressions where all variables are bound (at the top level, evaluating down). And you would be right. But the point is to bridge to the outside of Unison, at runtime, and make this just easy to do with Unison.
Feel free to just point me at material to read, I am completely new to this language and it might be obvious to you...
Yes, we have a way of hashing literally all values in the language, including arbitrary data types, functions, continuations, etc. For instance, here, I'm hashing a lambda function:[1]
> crypto.hash Sha3_256 (x -> x + 1)
⧩
0xs704e9cc41e9aa0beb70432cff0038753d07ebb7f5b4de236a7a0a53eec3fdbb5
The test result cache is basically keyed by the hash of the expression, and then the test result itself (passed or failed, with text detail).
We only do this caching for pure tests (which are deterministic and don't need to be re-run over and over), enforced by the type system. You can have regular I/O tests as well, and these are run every time. Projects typically have a mix of both kinds of tests.
It is true that you can only hash things which are "closed" / have no free variables. You might instead hash a function which takes its free variables as parameters.
Overall I think Unison would be a nice implementation language for really anything that needs to make interesting use of hashing, since it's just there and always available.
Thank you! (And thanks for following along for all the years!)
I'll speak a bit to the language audience, and others might weigh in as they see fit. The target is pretty broad: Unison is a general-purpose functional language for devs or teams who want to build applications with a minimal amount of ceremony around writing and shipping applications.
Part of the challenge of talking about that (the above might sound specious and bland) is that the difference isn't necessarily a one-shot answer: everything from diffing branches to deploying code is built atop a different foundation. For example, in the small: I upgraded our standard lib in some of my projects and because it is a relatively stable library; it was a single command. In the large: right now we're working on a workflow orchestration engine; it uses our own Cloud (typed, provisioned in Unison code, tested locally, etc) and works by serializing, storing, and later resuming the continuation of a program. That kind of framework would be more onerous to build, deploy, and maintain in many other languages.
Really cool project. To be honest, I think I don't fully understand the concept of a content addressed language. Initially I thought this was another BEAM language, but it seems to run on its own VM. How does Unison compare to BEAM languages when it comes to fault tolerance? What do you think is a use case that Unison shines that Erlang maybe falls short?
Erlang is great and was one inspiration for Unison. And a long time ago, I got a chance to show Joe Armstrong an early version of Unison. He liked the idea and was very encouraging. I remember that meant a lot to me at the time since he's a hero of mine. He had actually had the same idea of identifying individual functions via hashes and had pondered if a future version of Erlang could make use of that. We had a fun chat and he told me many old war stories from the early days of Erlang. I was really grateful for that. RIP, Joe.
Re: distributed computing, the main thing that the content-adressed code buys you is the ability to move computations around at runtime, deploying any missing dependencies on the fly. I can send you the expression `factorial 4` and what I'm actually sending is a bytecode tree with a hash of the factorial function. You then look this up in your local code cache - if you already have it, then you're good to go, if not, you ask me to send the code for that hash and I send it and you cache it for next time.
The upshot of this is that you can have programs that just transparently deploy themselves as they execute across a cluster of machines, with no setup needed in advance. This is a really powerful building block for creating distributed systems.
In Erlang, you can send a message to a remote actor, but it's not really advisable to send a message that is or contains a function since you don't know if the recipient has that function's implementation. Of course, you can set up an Erlang cluster so everyone has the same implementation (analogous to setting up a Spark cluster to have the same version of all dependencies everywhere), but this involves setup in advance and it can get pretty fragile as you start thinking about how these dependencies will evolve over time.
A lot of Erlang's ideas around fault tolerance carry over to Unison as well, though they play out differently due to differences in the core language and libraries.
Mobile or browser clients talk to a Unison backend services over HTTP, similar to any other language. Nothing fancy there.[1]
> sending code over the network to be executed elsewhere feels like a security risk to me?
I left out many details in my explanation and was just describing the core code syncing capability the language gives you. You can take a look at [2] to see what the core language primitives are - you can serialize values and code, ask their dependencies, deserialize them, and load them dynamically.
To turn that into a more industrial strength distributed computing platform, there are more pieces to it. For instance, you don't want to accept computations from anyone on the internet, only people who are authenticated. And you want sandboxing that lets you restrict the set of operations that dynamically loaded computations can use.
Within an app backend / deployed service, it is very useful to be able to fork computations onto other nodes and have that just work. But you likely won't directly expose this capability to the outside world, you instead expose services with a more limited API and which can only be used in safe ways.
[1] Though we might support Unison compiling to the browser and there have already been efforts in that direction - https://share.unison-lang.org/@dfreeman/warp This would allow a Unison front end and back end to talk very seamlessly, without manual serialization or networking
Not a dumb question at all! Unison's type system uses Abilities (algebraic effects) for functional effect management. On a type level, that means we can prevent effects like "run arbitrary IO" on a distributed runtime. Things that run on shared infrastructure can be "sandboxed" and prevented with type safety.
The browser or mobile apps cannot execute arbitrary code on the server. Those would typically call regular Unison services in a standard API.
I'm curious about how the persistence primitives (OrderedTable, Table, etc) are implemented under the hood. Is it calling out to some other database service? Is it implemented in Unison itself? Seems like a really interesting composable set of primitives, together with the Database abstraction, but having a bit of a hard time wrapping my head around it!
Hey there! Apologies for not getting to you sooner. The `Table` is a storage primitive implemented on top of DynamoDB (it's a lower-level storage building block - as you've rightfully identified; these entities were made to be composable, so other storage types can be made with them). Our `OrderedTable` docs might be of interest to you: they talk about their own implementation a bit more (BTrees); and `OrderedTable` is one of our most ergonomic storage types: https://share.unison-lang.org/@unison/cloud/code/releases/23...
The Database abstraction helps scope and namespace (potentially many) tables. It is especially important in scoping transactions, since one of the things we wanted to support with our storage primitives is transactionality across multiple storage types.
Congrats on 1.0! I've been interested in Unison for a while now, since I saw it pop up years ago.
As an Elixir/Erlang programmer, the thing that caught my eye about it was how it seemed to really actually be exploring some high level ideas Joe Armstrong had talked about. I'm thinking of, I think, [0] and [1], around essentially a content-addressable functions. Was he at all an influence on the language, or was it kind of an independent discovery of the same ideas?
I am not 100% sure the origin of the idea but I do remember being influenced by git and Nix. Basically "what if we took git but gave individual definitions a hash, rather than whole working trees?" Then later I learned that Joe Armstrong had thought about the same thing - I met Joe and talked with him about Unison a while back - https://news.ycombinator.com/item?id=46050943
Independent of the distributed systems stuff, I think it's a good idea. For instance, one can imagine build tools and a language-agnostic version of Unison Share that use this per-definition hashing idea to achieve perfect incremental compilation and hyperlinked code, instant find usages, search by type, etc. It feels like every language could benefit from this.
Hi there, and congrats on the launch. I've been following the project from the sidelines, as it has always seemed interesting.
Since everything in software engineering has tradeoffs, I have to ask: what are Unison's?
I've read about the potential benefits of its distributed approach, but surely there must be drawbacks that are worth considering. Does pulling these micro-dependencies or hashing every block of code introduce latency at runtime? Are there caching concerns w.r.t. staleness, invalidation, poisoning, etc.? I'm imagining different scenarios, and maybe these specific ones are not a concern, but I'd appreciate an honest answer about ones that are.
There are indeed tradeoffs; as an example, one thing that trips folks up in the "save typed values without encoders" world is that a stored value of a type won't update when your codebase's version of the type updates. On its face, that should be a self-evident concern (solvable with versioning your records); but you'd be surprised how easy it is to `Table.write personV1` and later update the type in place without thinking about your already written records. I mention this because sometimes the lack of friction around working with one part of Unison introduces confusion where it juts against different mental models.
Other general tradeoffs, of course, include a team's tolerance for newness and experimentation. Our workflow has stabilized over the years, but it is still off the beaten path, and I know that can take time to adjust to.
I hope others who've used Unison will chime in with their tradeoffs.
These seem to be mostly related to difficulties around adapting to a new programming model. Which is understandable, but do you have examples of more concrete tradeoffs?
For example, I don't think many would argue that for all the upsides a functional language with immutable state offers, performance can take a significant hit. And if can make certain classes of problems trickier, while simplifying others.
Surely with a model this unique, with the payoffs come costs.
Unison does diverge a bit from the mainstream in terms of its design. There's a class of problems around deploying and serializing code that involve incidental complexity and repetitive work for many dev teams (IDLs at service boundaries and at storage boundaries, provisioning resources for cloud infrastructure) and a few "everyday programming" pain points that Unison does away with completely (non-semantic merge conflicts, dependency management resolution).
To what Unison is actually compiled to and how is it ran? Gemini LLM says it compiles down to Scheme with is then ran Chez Scheme, how much of an LLM hallucination is that?
That information is a bit out of date, though correct at the time. We've put the Chez Scheme interpreter on ice, and we focused on runtime improvements to the Haskell interpreter. So, currently, Unison compiles to Haskell.
I know how fraught performance/micro-benchmarks are. But do you have any data on how performant it is? Should someone expect it to perform similar to Haskell?
By and large, our CEO did, but the website content is open source and has been iterated on by many hands over the years. If you have suggestions, feel free to drop a note in a ticket: https://share.unison-lang.org/@unison/website
aha yeah! good question! We have two different types of type declarations, and each has its own keyword: "structural" and "unique". So you can define two different types as as
structural type Optional a = Some a | None
structural type Maybe a = Just a | Nothing
and these two types would get the same hash, and the types and constructors could be used interchangeably. If you used the "unique" type instead:
unique type Optional a = Some a | None
uniqte type Maybe a = Just a | Nothing
Then these would be totally separate types with separate constructors, which I believe corresponds to the `BRANDED` keyword in Modula 3.
Originally, if you omitted both and just said:
type Optional a = Some a | None
The default was "structural". We switched that a couple of years ago so that now the default is "unique". We are interestingly uniquely able to do something like this, since we don't store source code, we store the syntax tree, so it doesn't matter which way you specified it before we made the change, we can just change the language and pretty print your source in the new format the next time you need it.
How does the implementation of unique types works? It seems you need to add some salt to the hashes of unique type data, but where does the entropy come from?
There's an algorithm for it. The thing that actually gets assigned a hash IS a mutually recursive cycle of functions. Most cycles are size 1 in practice, but some are 2+ like in your question, and that's also fine.
Does that algorithm detects arbitrary subgraphs with a cyclic component, or just regular cycles? (Not that it would matter in practice, I don't think many people write convoluted mutually recursive mess because it would be a maintenance nightmare, just curious on the algorithmic side of things).
I don’t totally understand the question (what’s a regular cycle?), but the only sort of cycle that matters is a strongly connected component (SCC) in the dependency graph, and these are what get hashed as a single unit. Each distinct element within the component gets a subindex identifier. It does the thing you would want :)