I appreciate Ladybird's initiative. But if they work with Servo, Ladybird can build the browser and Servo can focus on the engine. We could also avoid the C++ nightmare. Everybody wins.
Ladybird seems to have more momentum and to be further along in development, based on my testing of visiting random websites. This may or may not have something to do with the developer velocity of each language; genuinely, I don't know, but I think it's worth considering.
Regardless, from what I've gathered, Ladybird is going to ship-of-theseus their way into memory safety. They haven't announced what the C++ replacement language will be, but they are working towards that.
In what way? Rendering pages? CSS compatibility? I tried Servo on Windows and it worked, not so much for Ladybird - granted, I wasn't feeling up to the task of compiling it for Windows.
Ladybird doesn't support Windows yet because most developers use Linux or macOS. Ladybird has been progressing faster when it comes to CSS rendering and JavaScript support.
One of the crimes of modern imperative programming languages is not having ADTs built in (except maybe Rust). It is such a basic mental model of how humans think and solve problems. But instead we got inheritance and enums, which are, in practice, very primitive.
Moreover, they were already proposed by John McCarthy in October 1964, 60 years ago, for inclusion in the successor to ALGOL 60, which makes the lack of widespread support even weirder.
(And in fact Algol 68 had a better implementation than most later languages, but Algol 68 completely lacked any documentation suitable for newbies, like tutorials and programming examples, and it was not promoted by any hardware vendor, like IBM or DEC, so it was doomed.)
>The more I ponder the principles of language design, and the techniques which put them into practice, the more is my amazement and admiration of ALGOL 60. Here is a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors.
While this is a step up from C, it is still a long way from the full power and generality of algebraic data types. The key word here is algebraic. In a language with ADTs, such as Haskell, you can pattern match on arbitrarily complex types, not just the outermost tag. A contrived example (from [1]):
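Something along these lines (a quick Haskell sketch in the spirit of [1], not its exact example - the types and names here are made up):

    data Shape = Circle Double | Rect Double Double

    describe :: Maybe Shape -> String
    describe (Just (Circle r)) = "circle of radius " ++ show r
    describe (Just (Rect w h))
      | w == h                 = "square of side " ++ show w
      | otherwise              = "rectangle " ++ show w ++ " by " ++ show h
    describe Nothing           = "no shape at all"

The second and third clauses match two constructors deep (the Just wrapping a Shape) in a single pattern.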
This is more a matter of syntax sugar than of power and generality, since nested pattern matching can be mechanically translated into "top-level" matching (e.g., see [1] and [2]).
[1] L. Augustsson. Compiling Pattern Matching. In Functional Programming Languages and Computer Architecture, pages 368–381, 1985.
[2] P. Wadler. Efficient Compilation of Pattern Matching. In S.L. Peyton Jones, editor, The Implementation of Functional Programming Languages, pages 78–103. Prentice Hall, 1987.
This is where I like to cite Dijkstra's "Go To Statement Considered Harmful".
What a lot of people miss about that paper is that he wasn't just talking about goto statements. He was also making a more general observation about how more powerful and general programming language features are not necessarily desirable, because they tend to adversely impact developer productivity.
The reason I, as a user, prefer structured control flow statements over goto is not that I believe they are powerful. It's precisely because they are less powerful. The resulting constraints on how the program can be structured make it easier for me to read and reason about existing code. That makes maintaining code easier. It also makes optimization and static analysis easier. And it makes writing tests easier, too.
I have similar feelings about ADTs. The reason I prefer them to other ways of doing composite data types is not that I think they're more powerful. It's that they create constraints that tend to reduce the semantic complexity of the domain models people create in programming languages that use them. And that makes my job easier.
The corollary to that, though, is that I'm not actually all that hype about adding ADTs to existing languages. For reasons that are similar to how the mere availability of structured, reentrant function calls is small consolation in a codebase that's already riddled with goto statements. The real win doesn't come from using ADTs, it comes from not having to worry about all those other confusing overpowered things that aren't ADTs.
That's exactly where I am. Pattern matched sum types "feel great" to code in to an expert because they are a concise and reasonably tight way to express the otherwise boring "enumerate over possibilities" code.
But they're hard to read for anyone who isn't an expert on not just the language but the type in question (cf. Rust's Option idioms all look like line noise to newbies, etc...). And that's a bad trade.
In essence, this stuff is just Perl all over again. It's a language feature that prioritizes concision over comprehension. And I say that as someone who really likes coding in Perl. But "people" don't like Perl, and the community moved on, and the reasons are... the same reasons that uptake of ADTs is lagging behind where the experts want it to be.
You've got to distinguish syntax from semantics, though. I agree, it's easy to turn a big semantic win into a net readability loss if you choose to represent it with an overly terse syntax that promotes code golf.
Compilers are really smart. It's easy enough in the modern world to demand that a "type" field be checked against all enumerants, preserving the "semantic win". A plain old C switch statement with gcc -Wall will do this today, in fact.
Pattern matching on the "top level branch" of the ADT, whatever you call it, is pretty darned useful.
Pattern matching on the next level down is a power tool to be used with care.
Having used some pattern matching languages for quite some time, I find anything much deeper than that is a code smell at best and pathological at worst. Pattern matching creates coupling proportional to the depth/complexity/precision of the pattern match. The top-level coupling is often unavoidable; if you're pattern matching at all, you certainly care which "branch" you are on and there is likely no refactoring that away. But the danger rises rapidly the deeper in you go. It's just so easy to pattern match on a third-level part of the complex object when you really ought to be wrapping that behind a function somewhere, possibly itself emitting a sum type value.
... but if all you really need is that "top level" match, a lot of pattern matching syntax and features are not really necessary (if not positively dangerous).
> Pattern matching on the "top level branch" of the ADT, whatever you call it, is pretty darned useful. Pattern matching on the next level down is a power tool to be used with care.
Which is exactly how Perl apologia arguments went.
This argument is the most common fallacy I see in programming language discussions. I might as well give it a name right here: "Turing equivalence fallacy" or perhaps "syntax sugar fallacy."
All Turing Complete programming languages are Turing equivalent to one another. Programs written in one language can be mechanically transformed into those written in another. This is irrelevant to the discussion of programming languages. The whole point of creating different programming languages is to explore different ways to express the same program!
In programming language design, we tend to distinguish between global and local analysis. While type checking and elaboration are examples of global analysis, desugaring is inherently local to some piece of code. Therefore, "power" or "expressiveness" usually means that something cannot be syntactically "expanded"; e.g., while type classes elaborate into explicit dictionaries, they still require information from the type checker, and are therefore considered a "real" feature of a programming language. On the other hand, nested pattern matching can be formulated as a local syntax transformation, and therefore it doesn't bring anything fundamentally new to the type system or dynamic semantics.
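A minimal sketch of that local expansion (my own example, in Haskell):

    -- nested pattern:
    f :: Maybe (Either Int Bool) -> Int
    f (Just (Left n)) = n
    f _               = 0

    -- the same function after desugaring into "top-level" matches only:
    f' :: Maybe (Either Int Bool) -> Int
    f' x = case x of
      Just y  -> case y of
                   Left n  -> n
                   Right _ -> 0
      Nothing -> 0

The transformation only looks at the function body; no information from the type checker is needed.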
There's also a great talk on the matter [1], if somebody is interested in formalities.
You're approaching this from a PL design standpoint where the distinction is important, but from a user perspective it doesn't matter whether it's just "syntax sugar" or whether it's super complicated to implement; all that matters is whether the feature is available or not.
Typing features affect the way we design APIs. Libraries written in languages with type classes and without them can have completely different designs. If nested pattern matching is not available, this will not affect the APIs, only the function bodies -- because desugaring is local by definition.
That doesn't matter in practice. If two programming languages have the same underlying feature but one has syntactic sugar to make it very easy to use and the other does not (so is quite cumbersome to use) then you'll find that the library ecosystem for the former language will see the feature in widespread use whereas the ecosystem of the latter will tend to shun the feature.
This is one of the social factors of programming language design and it's one of the main reasons successful programming languages work so hard to establish a coherent philosophy and a set of best practices or idioms within the language. For similar reasons, I believe this is why "anything goes" languages such as LISP have struggled to gain widespread adoption: with no philosophy every programmer becomes an island unto themselves.
> "anything goes" languages such as LISP have struggled to gain widespread adoption: with no philosophy every programmer becomes an island unto themselves.
There are already two misconceptions.
First: "Lisp has no programming philosophies" and styles.
Not every program starts from zero. Since Lisp has existed since the late 1950s, it has seen quite a lot of programming styles over the years and it may contain traces of several. Generally it may support more than one programming paradigm. For example, during the Common Lisp standardization there was a wish to have a standardized object system. So instead of the multiple possible approaches (actors, message passing, prototype-based, ...), Common Lisp has just one: CLOS, the Common Lisp Object System. So, much of the object-oriented code written in CL is implemented in one particular object system: CLOS. Object Lisp, Flavors, LOOPS, Common Objects, and a bunch of others have thus been replaced by one standard.
CLOS also defines a bunch of user-level macros: DEFCLASS, DEFMETHOD, DEFGENERIC, ... Everyone using CL & CLOS will use those macros.
Second: "every programmer becomes an island unto themselves". If we look at the way CLOS was designed: there was a core group of six people from three companies. Around that there was a mailing-list based communication with a large group of interested people. Early on a prototype was implemented as a portable implementation of CLOS. This was widely distributed among interested parties: implementors, companies, research groups, ... Then reports about the language extension and its layers were published, books were published, application & library code was published.
One of the famous books coming out of this effort: "The Art of the Meta-Object Protocol". It also contained a toy implementation of CLOS in Common Lisp. The book and the implementations of CLOS (both the larger prototype and the toy implementation) showed, in excellent quality, how to write object-oriented Lisp code.
First: "Lisp has no programming philosophies" and styles
You misquoted me. I said no philosophy, singular. In the programming language context, a philosophy is a convention or a standard. Just as many standards imply that there is no standard, many philosophies imply no philosophy.
Everything else you said is evidence for my premise. Hire 3 different programmers, one from each of the communities, and you might as well have 3 different programming languages. That’s not a standard. That’s not a philosophy. That’s anything goes!
Most of the Common Lisp code that is accessible via public repositories conforms to conventions and is understandable.
Lisp programmers are highly motivated toward encouraging collaboration, since there aren't that many people in Lisp; you can't afford to be turning people away toward projects that are easier to get into.
Also, you can easily hire 3 developers and get 3 different languages in, oh, Java or JavaScript. One person is doing Kotlin, another one Scala, ...
Three C++ programmers in the same room could also not understand each other. The greybeard speaking only C++98 with a bit of 2003 doesn't grok the words coming out of the C++20 girl's mouth and so it goes.
> You misquoted me. I said no philosophy, singular. In the programming language context, a philosophy is a convention or a standard. Just as many standards implies that there is no standard, many philosophies implies no philosophy.
That makes no sense.
> Hire 3 different programmers, one from each of the communities, and you might as well have 3 different programming languages.
Maybe not. They build on the same foundation, a language with a large standard, which has been largely unchanged for three decades. A language which can be incrementally extended, without invalidating the rest. A language where extensions can be embedded, without invalidating the rest of the language or its tools.
Many projects use SBCL (now itself 25 years old and only incrementally grown) and a bunch of core libraries for it.
> That’s not a standard. That’s not a philosophy. That’s anything goes!
Most languages support widely different software development practices. Take JavaScript: it includes imperative, object-oriented and functional elements (similar to Lisp). It has huge amounts of competing frameworks (many more than any Lisp), where many of them have been superseded and many of them are built on a multitude of other libraries. The developer can pick and choose. Each project will be different from other projects, depending on which libraries and programming frameworks it uses - and which of those the developer re-invents.
Any half-way powerful language (C++ -> templates, Ruby -> meta objects, Java -> objects & byte code & reflection & class loader, Smalltalk -> meta objects, Rust -> macros, C++ -> language interpreters, Java -> external configuration languages, C -> macro processor, ...) has ways to adapt the language to a certain style & domain.
Any large Java framework defines new standards, new configuration mechanisms, new ways to use new Java features (lambdas, streams, ...). See the large list of features added to the Java language over time: https://en.wikipedia.org/wiki/Java_version_history For many of them there were competing proposals. Each Java code base will use some subset/superset of these features, depending on what is needed and what the personal preferences are. And then the Java architect in the project will not be satisfied and will invent yet another configuration system, this time not using XML or JSON for the syntax, but developing a new embedded scripting language for the JVM and integrating that with his software. I have seen Java architects who eventually didn't understand their own configuration system any more.
If you think a language like Common Lisp is special here, then in reality it is not. It's just slightly different in that it has extensibility as one of its philosophies and provides defined interfaces for that. There is one macro mechanism, which is powerful enough that for decades it has not been replaced. Every syntactic extension mechanism will use this macro system, which is documented, stable and widely used.
> I believe this is why "anything goes" languages such as LISP
Why do you think that Lisp is an "anything goes" language? What's your baseline? I think that C is no less an "anything goes" language, but with a much less pleasant UI.
> with no philosophy every programmer becomes an island unto themselves
Some people actually think that Lispers tend to be too philosophical
Lisp is anything goes because of macros. Hire ten different Lisp programmers and you’ll get ten different domain specific languages and no one will understand what everyone else has done. If there is a unifying philosophy of Lisp, it’s probably “don’t use macros” and yet so many ignore it!
For all its faults, C is quite easy to read and understand, even by beginner C programmers. Yes, C also has macros but their clumsiness helps to discourage their use.
I'm a Common Lisp programmer. I don't really know how to respond to that except to say, what you describe is not the case at all. Reading others' Common Lisp code is a joy compared to reading others' code in any other language.
Yes, features that are easy to use will be more often used, while inconvenient features will be less used. I don't quite see any controversy with my comment.
The point is that in both cases the underlying feature is present, so APIs will be compatible. However, the lack of syntactic sugar in the one case will make any API that uses the feature cumbersome to use in that language, so in practice it will be avoided.
Abstractly this is true, but software development is a human practice, so it matters not what's technically possible but what people actually do.
That's why the most important difference between C++ and Rust isn't some technicality even though the technical differences are huge, it's cultural. Rust has a Safety Culture and everything else is subservient to that difference.
Sugar matters. Rust's familiar-looking loops are just sugar; it only "really" has a single way to do loops, the loop construct, an infinite loop you can break out of. But despite that, people deliberately write the other loops - and the linter strongly recommends that they write them, because the programs aren't just for machines to compile, they're for other humans to read, and a while let loop is an intuitive thing to read, for example, as is the traditional for-each style iterator loop.
Of course the syntax sugar is a good thing if it makes it easier to write the code, but if the question is about "expressive power of the type system", it's not really relevant: Zig's type system can properly express a sum type. In addition: pattern matching is orthogonal to ADTs; you can have pattern matching in languages both with and without algebraic types. Neither one implies the other.
Turing completeness has nothing to do with static type checking. Dynamically typed PLs can't express any type (except Any) yet are still Turing complete.
I once explored the Epigram dependently typed programming language. It used to preclude many types of free-form pattern matches, due to heavy dependence on a structured editor (you specify a type, it generates the pattern matching, you cannot change the structure), so it was almost completely unusable for many, many tasks.
So, while you are formally right, the need for shortcuts in pattern matching is undeniable to me.
TS narrows union type cases based on conditionals like "if" (called discriminated unions in the docs in the past), and supports exhaustiveness checks. How do they differ in functionality from sum types?
This often comes up when writing a function which returns a wrapper over a generic type (like Option<T>). If your Option type is T | null, then there's no way to distinguish between a null returned by the function and a null that is part of T.
As a concrete example, consider a map with a method get(key: K) -> Option<V>. How do you tell the difference between a missing key and a key which contains `null` as a value?
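For what it's worth, the sum-type side of that is easy to see in Haskell (a sketch using Data.Map; the keys here are made up):

    import qualified Data.Map as Map

    m :: Map.Map String (Maybe Int)
    m = Map.fromList [("present-but-null", Nothing), ("present", Just 1)]

    -- Map.lookup returns Maybe (Maybe Int), so the cases stay distinct:
    missingKey, storedNull :: Maybe (Maybe Int)
    missingKey = Map.lookup "absent" m            -- Nothing      : no such key
    storedNull = Map.lookup "present-but-null" m  -- Just Nothing : key exists, value is "null"

With a union type, an Option that is V | null for V = number | null collapses both cases into a plain null.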
If you have a function that will normally return a string, but can sometimes fail due to reasons, you may wish to yield an error message in the latter case. So you're going to be returning a string, or a string.
It's not what the content of the data is; it's how you're supposed to interpret it. You have two cases, success and failure, and control will flow differently depending on that, not based strictly on the type of data at hand. We just model those cases in a type.
Why would you return `string | string` here? Wouldn't you explicitly mark the error, and return `string | error`? (substitute error with whatever type you want there - null, Error, or your own thing)
> substitute error with whatever type you want there - null, Error, or your own thing
And if I want `error` to be `string` so I can just provide an error message? Proper disjoint sum types let me keep the two sides disjoint, even if they happen to be the same type. Union types will collapse on any overlap, so if I happen to instantiate `error` at `string` then I can no longer tell my error case from my data case.
No disrespect, but that still sounds entirely useless to me. I would never model something as `String | String` as that makes zero sense. You should use a `Result` or `Either` type for that like everyone does.
> You should use a `Result` or `Either` type for that like everyone does.
You have missed the thread context, which is whether `Either a a` (also written `a + a`) has any merits over simply `a` (which is identical to `a | a`). If you're on the `Either` train already, we are arguing over imaginary beef.
> No disrespect, but that still sounds entirely useless to me.
It is disrespectful to say something "makes zero sense", regardless of anything you might say to the contrary. You've misrepresented my point: nobody wants to model something as `String | String`.
If you have, say, an `Int | Bool`, and you pass both sides through some kind of stringification function, you're naturally going to get `String | String` without ever having written that type down yourself. You wouldn't necessarily want to collapse that to `String`, however, because you may -- for instance -- want to give strings and ints different decorations around them before finally flattening them. You might write such a function as something like
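In Haskell terms (a sketch - `mapBoth` here is just `bimap` specialised to `Either`, and the tag strings are invented):

    import Data.Bifunctor (bimap)

    mapBoth :: (a -> c) -> (b -> d) -> Either a b -> Either c d
    mapBoth = bimap

    -- Either Int Bool plays the role of `Int | Bool`, but as a proper sum type
    render :: Either Int Bool -> String
    render = finish . mapBoth show show
      where
        -- the intermediate value has type Either String String (`String + String`),
        -- so we still know which side we came from and which tag to prepend
        finish (Left  s) = "int: "  ++ s
        finish (Right s) = "bool: " ++ s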
You couldn't run this if the result of the `mapBoth` had type `String | String`: that type is indistinguishable from `String`, and since you can't tell which case you're in, you wouldn't know which tag to prepend.
You could write the same function without passing through `String | String`:
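Continuing the same sketch, with the decoration fused into each branch so no `String | String`-shaped intermediate ever appears:

    render' :: Either Int Bool -> String
    render' (Left  n) = "int: "  ++ show n
    render' (Right b) = "bool: " ++ show b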
And yes, in this especially contrived example, perhaps you may find that to make more sense anyway. But in longer pipelines, sometimes separated over multiple functions, often involving generic code, it becomes much harder to simply always fuse steps in this manner. This is why we say sum types (e.g. `String + String`) compose better: they don't behave any differently depending on whether the two sides are the same or not. You have explicit control over when the two sides rejoin.
It supports exhaustiveness checks only if you explicitly opt in (by coding to a pattern where you use a helper function that accepts the `never` type). "Discriminated Union Types"/"Sum Types" feel very hacky there, at least syntax-wise, because TS is constrained by being a "JS + types" language. It's remarkable what TypeScript can do, but having native discriminated unions in JS (and hence in TS too) would be much more ergonomic and powerful.
Most of the common languages today have product types.
Java[1], Rust, Haskell, etc. have sum types.
I think it gets a bit more esoteric beyond that though - I don't doubt that there's probably some Haskell extension for quotient types[2] or some other category theory hijinks.
Do Java's sealed classes enable something like exhaustive pattern matching? (A form of pattern matching that will fail at compile time if you add a new class that extends the sealed class.)
sealed interface BinaryTree {
    record Leaf(int value) implements BinaryTree {}
    record Node(BinaryTree lhs,
                BinaryTree rhs,
                int value) implements BinaryTree {}
}

public class Hello {
    static int sum(BinaryTree tree) {
        return switch (tree) {
            case BinaryTree.Leaf(var value) -> value;
            case BinaryTree.Node(var lhs, var rhs, var value) -> sum(lhs) + value + sum(rhs);
        };
    }

    public static void main(String... args) {
        var tree = new BinaryTree.Node(
            new BinaryTree.Leaf(1),
            new BinaryTree.Node(
                new BinaryTree.Leaf(2),
                new BinaryTree.Leaf(3),
                4),
            5);
        System.out.println(tree);
        System.out.println("Sum: " + sum(tree));
    }
}
If you added a new subtype to BinaryTree you would need to fix the switch.
EDIT: I didn't handle the `null` case above... so it would be a NullPointerException if someone passed null... apparently, Java decided to make handling `null` optional. More information: https://www.baeldung.com/java-lts-21-new-features
It absolutely does. Here is a (modified) snippet of my Java code from yesterday.
final boolean hasUncollectedSecret =
    switch (each)
    {
        case Wall() -> false;
        case Goal() -> false;
        case Player p -> false;
        case BasicCell(Underneath(_, var collectible), _) ->
            switch (collectible)
            {
                case NONE, KEY -> false;
                case SECRET -> true;
            };
        case Lock() -> false;
    };
Java's sealed classes are still somewhat more limited than Rust's or Haskell's sum types, in that each instance of the superclass holds a fixed variant (i.e., subclass), so you can't change the variant without creating a new instance. Clearly, this limitation is necessary for references to stay intact, but I've personally run into this issue when trying to represent a sum type in an ORM.
#[derive(Debug)]
pub enum Example {
Foo(i32),
Bar(&'static str),
}
let mut ex: Example = Example::Foo(42);
println!("{ex:?}"); // Foo(42)
let ex_ref: &mut Example = &mut ex;
*ex_ref = Example::Bar("hello");
println!("{ex:?}"); // Bar("hello")
Given a mutable reference to a value of enum type, you can replace it with another variant. Or you can swap it out with any other value of the same type, even if the variants are different. This is most commonly used for Option<T>, where you can insert or remove the contained value as long as you have a reference.
The limitation here is that for as long as the mutable reference is live, no other code can access the value. So when you do change the variant, you can't have any other references sitting around that point to the removed subfields.
This doesn't have anything to do with ADTs. It's because Rust has both in-place variables and references. You aren't changing the variant of an existing ADT, you're replacing the entire ADT value with a new one. That replacement is visible in multiple places because the language allows you to take references to variables (the `&mut ex` expression).
You can accomplish the same thing in C and C++ because they also have in-place value semantics and allow you to take the address of any variable. You can't do that in Java only because Java doesn't let you take references to variables.
> You aren't changing the variant of an existing ADT, you're replacing the entire ADT value with a new one.
'Values' in Rust have no identity, except for their address. They're just a bunch of bytes in a row. What could it mean for a value to exist, except for it to be present at a set place in memory? If I have an instance of a Java class, and I change all the fields, I'd hardly say the instance has been replaced with a new instance.
If you insist, I'd say "Java's sealed classes are more limited in that when you have a bunch of references to the same thing, you can change the values in the fields of that thing (and have it be reflected in other references), but you can't change which variant it is." Call that thing a 'value' or a 'variable', it doesn't change the visible outcome compared to enums in Rust, or discriminated unions in C/C++.
int x = 42;
printf("%d\n", x); // 42
int* x_ref = &x;
*x_ref = 53;
printf("%d\n", x); // 53
In these examples, we haven't mutated the number 42 into the number 53. We've simply stored an entirely new value in the location of `x`. In your Rust example, you're doing the exact same thing with an ADT. The variant case isn't being changed. You're creating a new value and storing it in an existing storage location. Every pointer pointing to that storage location sees the update because they all point to the same storage.
I don't really think it's useful to do that though?? Can you give an example?
By the way, I would claim Java's sum types are less limited than Rust because in Rust, variants don't have their own type. The consequence is that you can't have functions that only accept some variant, as far as I know (I remember having this problem once), or add "methods" only to one variant... while in Java, because variants are just normal types, you can do both, and doing that is pretty damn useful.
> I don't really think it's useful to do that though?? Can you give an example?
For an exercise, I had to write a system with Admins and Customers, with the ability to upgrade a Customer into an Admin, or vice versa.
My thought was to put them as two subclasses under a User superclass, so that I could put them under a single Users table, and not have to link and unlink things over the conversion. Hibernate ORM supports storing subclasses by adding an implicit discriminator field.
However, its object model specifies that a single row always corresponds to a particular instance, so it has no support for changing the subclass of a row. Ultimately, I ended up with a hacky solution of creating a new record with the primary key copied over.
> By the way, I would claim Java's sum types are less limited than Rust because in Rust, variants don't have their own type. The consequence is that you can't have functions that only accept some variant, as far as I know (I remember having this problem once), or add "methods" only to one variant... while in Java, because variants are just normal types, you can do both, and doing that is pretty damn useful.
At least in Rust, you can simulate this pretty trivially by having each variant store a struct value with all the data and methods you want. See proc_macro::TokenTree [0] for an example of this. Of course, it's not ideal in how verbose it is (though not much worse than Java!), but it can be workable on the consumer's side if you add some extra From impls.
NAND is a universal circuit primitive because it can be used to create all of the other circuit primitives. But if you think about it, this is more an argument about manufacturing than about comprehensibility. Only needing to manufacture NAND gates is convenient, but if you could only build your circuit this way, then you would have an unmaintainable mess.
You can do the same thing with boolean logic and just have not-and, but thankfully we have and, or, not, xor. Similarly, you don't need greater-than-or-equal because you can just write 'x > y || x == y'.
Comprehension is linked to how closely you can express the idea of what you're doing in the object language that you have to look at. It might be convenient to compile everything down to SK combinators so that your optimizer and evaluator can be simpler, but people should never look at that level (at least not until you suspect a compiler defect).
So we get to object oriented programming. Where our data expression has an AND property (a class has an INT field AND a STRING field), an existential property (interfaces: there exists some object with these methods), and inheritance (a truly bizarre feature where we duct-tape subtyping to a method and field grouping mechanism with a bunch of hooks).
With interfaces and inheritance you can simulate both a universal property (generic) and an OR property. But because it's not a direct expression, we leave this giant gap for what people intended to happen to diverge from what actually happens. Especially after time passes, defects are found, and requirements change. [For example, when using interfaces to simulate an OR property, there really isn't any mechanism to let everyone know that this construct is closed. So if something erroneously gets added, you won't know to check the entire code base. And if requirement change and you need to add a new case, then you have to check the entire code base. Completeness checking of ADTs give you this for free in your pattern matches.]
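To make that bracketed point concrete, here is a minimal Haskell sketch (names are mine) of the completeness checking you get for free from a closed sum type:

    {-# OPTIONS_GHC -Wall #-}

    data Payment = Cash | Card   -- later someone adds:  | Voucher

    describe :: Payment -> String
    describe Cash = "cash"
    describe Card = "card"
    -- Once Voucher is added, -Wincomplete-patterns (enabled by -Wall) flags every
    -- match like this one as non-exhaustive, pointing at exactly the code to update.

With the interface-based simulation of an OR property there is no equivalent signal; you have to search the code base by hand.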
Too many non-trivial architectural messes that I've encountered in my career have been due to either someone trying to solve all of their problems with interfaces or the same with inheritance* when a simple OR data structure would have made everything simple, clear, and correct.
[*] - Inheritance being more problematic when someone tries to create a non-trivially sized category hierarchy, which ruins the day when requirements change and suddenly the tree needs to be reorganized, but doing so would invalidate entire swaths of the code base already accepting types with a different assumed (and undocumented) hierarchical tree structure. Thankfully most people have gotten the memo and switched to interfaces.
It's a term that is commonly used in computer science education to refer to any data type independent of its concrete implementation (so, basically, its interface.) I don't think it's just restricted to Modula-2?
Indeed, however CLU and Modula-2 were the ones used mostly for practical teaching purposes, until ML became widespread enough for ADT to gain yet another meaning.
> [Algebraic Data Types are] such a basic mental model of how humans think and solve problems
I think that's actually wrong for "Sum types". Product types, sure. The idea of storing a bunch of fields in a single thing matches the way we've been organizing information since we started writing things down.
But I genuinely don't think I've seen an attempt at a sum/union/enumerant/whatever syntax in a programming language that wasn't horrifyingly confusing.
Where by extension: class-based inheritance is actually pretty simple to understand. The classic "IS-A" relationship isn't as simple as "fields in a struct", but it's not hard to understand (c.f. all the animal analogies), and the syntax for expressing it is pretty clean in most languages.
Is it the "best" way to solve a problem? Maybe not. Neither are ADT sum types. But I think there's a major baby-in-the-bathwater problem with trying to be different. I really don't think, for the case of typical coders writing typical code, that ADTs are bringing as much to the table as the experts think.
> But I genuinely don't think I've seen an attempt at a sum/union/enumerant/whatever syntax in a programming language that wasn't horrifyingly confusing.
Syntaxes like: `A | B(Int) | C(String)`. That means A, B, or C.
> Where by extension: class-based inheritance is actually pretty simple to understand. The classic "IS-A" relationship isn't as simple as "fields in a struct", but it's not hard to understand
This value is either `A`, an Int (`B(Int)`) or a String (`C(String)`). Or: this knapsack either contains an A, B, or C. Difficult?
> (c.f. all the animal analogies),
Reminds me of the fact that software isn’t typically as static as animal taxonomies.
> and the syntax for expressing it is pretty clean in most languages.
I’m most used to Java, where you spread `extends` over N files. (Sealed classes in Java are an exception.)
It’s fine. I don’t understand how it is particularly clean.
> Is it the "best" way to solve a problem? Maybe not. Neither are ADT sum types.
Is this an argument? ’Cause I don’t see it.
> But I think there's a major baby-in-the-bathwater problem
Inheritance is something that needs to have some concrete implementation affordance. Baby in the bathwater? I don’t see how you bolt this onto the struct model in a way that gets out of the way for people who don’t want to use it (the zero-cost-abstraction (in the if you don’t use it sense) is important to some low-level languages).[1]
Maybe the designers of hypothetical language X thinks that algebraic data types is enough. What baby are you missing then?
[1] For algebraic data types: structs are straightforward enough. While the “sum type” can be implemented by leaving enough space for the largest variant. That one-size-fits-all strategy isn’t perfect for all use-cases but it seems to have been good enough for Rust which has a lot, a lot of design discussions over minutiae.
> trying to be different.
With a technology from the ’70s. I also saw your “one oddball new idea is a revolution” snark. You’re clearly being very honest.
FWIW, I'm talking about the pattern matching syntax required to enforce coverage of all the sum types. Obviously the idea of the representation as a set of alternatives is easy enough to understand. The idea of "Now You Have To Show The Compiler You Know What To Do In All Circumstances" is very much not, cf. how much basically everyone struggles with Option in Rust.
Inheritance addresses similar issues by delegating the responsibility, something that is easy to understand. Does it have downsides? Sure. But I'm not sold that ADT has the advantage here at all.
> But I genuinely don't think I've seen an attempt at a sum/union/enumerant/whatever syntax in a programming language that wasn't horrifyingly confusing.
That's a bit much. Type inheritance has been a core abstraction in software development since before most working developers were born. We as a society know how to do this. The idea that one oddball new idea is a revolution that turns a "nightmare" into sunshine is way too hyperbolized.
Sum typing might be better! But frankly the jury is still out, and the impact is clearly going to be smaller than what you're imagining.
The new lowest-level PLs (Zig, Rust) have ditched class-based inheritance. Higher-level PLs are going more functional, including JS, where entire frameworks are encouraging a functional style (not to mention how everyone complains about the opacity of trying to use inheritance in place of declarative configuration, e.g. Amazon CDK).
Yeah, when I first learned Haskell a million years ago, and Erlang slightly less than a million years ago, the pattern matching was so plainly obviously the "correct" way to do things; it just felt like it was exactly how I thought about problems, and all the constructs with if/switch/enums had been an attempt to force my brain thinking into something that executes.
It honestly does annoy me that a lot of mainstream languages still haven't really adopted ADTs; when Java 8 added a lot of (well-needed) new syntax, it felt like that was an ideal opportunity to add ADTs and pattern matching (though I'm sure that was easier said than done).
> when Java 8 added a lot of (well-needed) new syntax, it felt like that was an ideal opportunity to add ADTs and pattern matching
Well at least Java does now (as of Java 21) have pattern matching (including nested record destructuring) and sealed classes, which let you have decent sum types.
The one issue is that everything is nullable, but that's a wider Java issue.
Yeah, but the annoying part of Java is that people stick with old versions for a long time. Java 21 looks pretty great, but most companies are still on Java 17, or even Java 11.
Some have even older Java versions in 'prod'. 6 is still alive in some places out there, because management refuses to pay for either upgrade, replacement or dismantling.
I have a mid-sized application I built on 17 that's used to deliver a particular project, really looking forward to finish the project so I get to move to 21 and refactor it with these new features and make it more general.
It is. We run an old version of our application on Java 6. Apparently multiple people have tried to upgrade it in the past but it proved too difficult because it's using a bunch of completely obsolete technology with little available documentation that seems to randomly break when you upgrade even minor things.
The plan has been to gradually replace it with the new version of the software (which runs on Java 17), unfortunately this plan has been ongoing for 10 years and it's not gonna be done anytime soon.
Such are the sad realities of working on legacy code sometimes.
Absolutely, so you try to seal it in hermetically, only talk to it over a special message bus or something.
Sometimes the upgrade path from 6 onwards isn't as nice as it usually is from 8 up, especially if you built with some old undead libraries that require an heroic effort to understand well enough to reimplement. Takes a very special organisation to divert some person-years to pull it off, and as some middle or other manager it's highly unlikely to catch you a new, better job even if it goes well.
It makes me sad, but I totally understand why someone would stay with Java 6, for the same reason that there's still COBOL stuff hanging around: the stuff "works" as is, upgrading is expensive, and some business person decided that the cost of the upgrade isn't worth it for the value they'd receive.
I do worry that there are "Y2K-esque" bugs hiding in Java 6 programs somewhere. I don't think java.util.Date uses 32-bit integers for time, so the 2038 problem probably won't be an issue, but I do wonder if there's other stuff hiding.
There might be some limit or bug like that in Java 6, but I think time stuff has been backported, so it'd likely be something else. I don't know enough about the JVM internals to speculate on whether the old ones were simpler and less likely to be defective, or not.
By now most types of issues ought to have popped up though, since it was so widely used for such a long time.
"Haskell is the best imperative language," (C) various software engineers.
Also, algebraic data types can be seen as a hierarchy consisting of an abstract base class and several final child classes. So it is an inheritance model, just a restricted one.
I'm a huge proponent of ADTs being a more comprehensible way to write code than some of the alternatives. But I do have to agree with you that there isn't really evidence that this is a basic mental model.
However
What we do see is a bunch of mathematical disciplines that end up creating properties like: AND, OR, Universal, Existential, Implication, (and a few others). They end up in places like: set theory, type theory, category theory, various logics, lattice theory, etc.
Now, maybe they're only copying one another and this is more of a memetic phenomenon. Or maybe they've hit upon something that's important for human comprehensibility.
That would be the 'evidence' of the positive effect of ADTs (scare quotes because it might just be math memes and not fundamental). But we can also think about what I feel is legit evidence for the negative effect of lacking ADTs.
Consider what happens if instead of having the standard boolean logic operators and, or, not, xor, we only have the universal not-and operator. Now a straightforward statement like: A && B || C becomes (((A !& B) !& (A !& B)) !& ((A !& B) !& (A !& B))) !& (C !& C) [I think...]. It's more complicated to tell what's actually supposed to be going on AND the '&&' simulation can get intertwined with the '||' simulation. The result being that requirements changes or defect fixes end up modifying the object level expression in a way where there is no longer any mapping back to standard boolean logic. Comprehensibility approaches zero.
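That expansion can be checked mechanically; a small Haskell sketch (my own, just brute-forcing all eight inputs):

    nand :: Bool -> Bool -> Bool
    nand x y = not (x && y)

    encoded :: Bool -> Bool -> Bool -> Bool
    encoded a b c =
      let ab = (a `nand` b) `nand` (a `nand` b)   -- encodes a && b
      in  (ab `nand` ab) `nand` (c `nand` c)      -- encodes (a && b) || c

    check :: Bool
    check = and [ encoded a b c == ((a && b) || c)
                | a <- [False, True], b <- [False, True], c <- [False, True] ]
    -- check evaluates to True

Which rather proves the point: you need a program to convince yourself the translation is even right.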
And we've seen this happen with interfaces and inheritance being used to implement what would otherwise be a relatively simple OR property (with the added benefit that pattern matching ADTs often comes with totality checking; not something you can do with interfaces which can always have another instance even up to and including objects loaded at runtime).
Appearing in symbolic reasoning tools we have invented doesn't really support them being how brains work, though? This is akin to saying that gears are how nature works because gears are everywhere in how we build things. I could maybe buy that with "friction" being a fundamental thing, but feels like a stretch for the other.
Now, I should add that I did not mean my question to be a criticism of them! I'm genuinely curious about evidence that they are a basic building block. Feels safe to say they are a good building block, and those aren't the same thing.
As an easy example for them not being basic building blocks, I can't remember ever seeing anything like them in any assembly instructions for things. Put together a batting net for the kids. Lots of instructions, but nothing algebraic, in this sense. Looking at recipes for food. Nothing algebraic, really? Maybe I can squint and see some, but it would be hard. Exercise plans? Music lessons? Playbooks for a sport?
Again, though, I /do not/ intend this as a criticism of them. Genuinely curious on any investigation into them.
I think you're right to point out that it's too strong a claim to say that sum types are a basic building block of thought, although I believe they are very useful in coding regardless of that claim.
There is the still the ongoing debate about how much human perception and human reason are shaped by cultural forces vs. universal forces (where the latter asserts humans reason in the same/similar ways).
There's evidence that certain optical illusions don't work across cultures for example (I seem to remember those in Western countries have a tendency to mentally group things in rectangular boxes). The exact balance between cultural and universal forces isn't known and I doubt we could say anything about sum types in that regard.
Verdex's explanation is detailed but too long IMO. The short version is that ADTs/sum types formally correspond to the OR-logical connective, and records/product types formally correspond to AND-logical connective. I think you'd be hard-pressed to argue that people don't think in terms of "X AND Y OR Z". These are core primitives of any kind of reasoning.
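In Haskell-ish terms, the correspondence is almost a pun (a minimal sketch, names mine):

    -- a value of Sum a b is "an a OR a b": you hold one of the two
    data Sum a b  = InL a | InR b     -- sum type     ~ logical OR
    -- a value of Prod a b is "an a AND a b": you hold both
    data Prod a b = Prod a b          -- product type ~ logical AND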
I can easily argue that people don't think in terms of boolean logic. For one, the evidence seems as strong that people generally think backwards from the answer far more than they do forwards from the ingredients. This is often why new things are so long to be discovered. It isn't that people couldn't have gotten there, but they didn't know to go for it.
For two, addition is a wildly disparate thing everywhere we use it. We like to joke that computers made that hard, but literally half of intro chemistry is learning how to get things to add together in a meaningful way, no? Balancing a chemical equation is a thing.
> For one, the evidence seems as strong that people generally think backwards from the answer far more than they do forwards from the ingredients.
Logic doesn't really have a direction, it works backwards or forwards. Even if you're solving a system "backwards", whatever that means, you still have to satisfy all of the necessary AND and OR constraints for a solution to be valid, so you're effectively still building ADTs or records just using a different evaluation order.
And this logic is how folks convince themselves that ball players are doing trigonometry when playing. It is just wrong.
You can /model/ it that way. But you are making a symbolic model to justify how a solution is reached.
Now, it can be frustrating to consider that this model could produce an agent that is better at the ball game than the players. But it is silly to think that means you have mirrored them.
> And this logic is how folks convince themselves that ball players are doing trigonometry when playing. It is just wrong.
You're attempting a sleight of hand here by saying "they're" not "doing trig". Clearly they are not doing anything like that consciously, but equally clearly some part of their brain is triangulating objects and predicting trajectories based on gradients, meaning that part is "doing trig and calculus" subconsciously. What else does it mean to "do something" if not "process X is isomorphic to process Y"?
> You can /model/ it that way. But you are making a symbolic model to justify how a solution is reached.
I really don't understand what you think people are doing when they're compiling a grocery list. They're clearly thinking, "I need x AND y OR I can substitute z".
Or if they're planning to paint their fence, they're thinking, "I need paint AND brushes AND I have to start before lunch OR I won't finish before dinner".
No. You are confusing the model for the reality. Again, our model may lead to a more impressive reality, but ball players are not doing trig. They are almost certainly simulating things, but that does not require trig. Indeed, it only loosely requires math. Models can be very informal with loose approximations.
You seem to think I have to prove your model false to show others don't do that. But I am specifically not claiming your model is false. I'm saying folks don't think that way, necessarily. For example, many build lists for shopping that they were taught. Not that they reasoned.
The ability to build lists requires reasoning. It requires enumeration, inclusion/exclusion conditions, and stop conditions, which necessarily requires logic. This argument is pointless, you go on believing that some people can't use basic logical connectives, I'm sure next you'll say they can't even count.
By that reasoning, though, we can claim that the earth counts to 365 as it orbits the sun. Or that women's bodies count roughly to 9 as they make a baby. These are ways that we model those things, but that is not what is happening.
Most people learn to build lists through mimicry. They literally mimic the lists they were taught to build to go to the store. With enough experience, many of us learn to build our own lists from other principles, but it almost all starts with mimicry and simulation. It's why "going shopping" is such a fun game for kids. They are learning.
None of which is to say that you can't make your thinking better with logic. You almost certainly can. But you are begging the question with your examples and explanations. Heavily.
> Logic doesn't really have a direction, it works backwards or forwards.
Implication is one of the primitives in logic, and gives us several of the classic logical fallacies: affirming the consequent, denying the antecedent, fallacy of the converse, and fallacy of the inverse.
All of which are examples of trying to work logic as though it doesn't have a direction.
Implication is not primitive; "x->y" is reducible to "not(x) or y". None of those fallacies derive from a lack of direction; they are simply invalid logical deductions.
But Pascal's variant records (1970-11) had very ugly design errors in comparison with the unions of Algol 68 (1968-12), which made them either useless or annoying for most applications.
Niklaus Wirth is well known as a critic of Algol 68 (before the design of Algol 68 was finalized), but in the case of his variant records he has completely failed to create something competitive.
Doesn't look too bad, particularly the tagged version with the match at the end:
MyType = (Scalar4, Real4, NullTerminatedStringC);

MyUntaggedRecType =
  RECORD
    CASE MyType OF
      Scalar4: (longC: ARRAY[1..150] OF longint);
      Real4: (floatC: ARRAY[1..150] OF real);
  END;

MyTaggedRecType =
  RECORD
    CASE tag: MyType OF
      Scalar4: (longC: ARRAY[1..150] OF longint);
      Real4: (floatC: ARRAY[1..150] OF real);
  END;

...
{ set all to 0.0 without running through the MC68881 }
FOR j := 1 TO 150 DO
  longC[j] := 0;
...
CASE tag OF
  Scalar4: percentReal := longC[1];
  Real4: percentReal := floatC[1] * 100;
ELSE
  percentReal := 0.0 / 0.0
END;
edit: don't have a pascal compiler handy, but that's the idea
Algol 68 was a failure mainly due to its inappropriate documentation, not due to the quality of the language.
It included many innovations that appeared again in other programming languages only decades later.
Niklaus Wirth was a good teacher and writer and the success of his languages is due mostly to his books and due to his languages being used for teaching in many universities, not due to their technical qualities.
Algol 68 may have been a failure because it was ahead of its time, which meant it was hard to write a compiler for it. For example, Algol 68 had closures (apparently Knuth snuck them in).
People used Pascal because the compiler was blazing fast, it was easier to implement and learn, and the features it lacked compared to Algol did not really matter most of the time (at the time).
More features doesn't automatically mean "better".
Also Pascal had quite strong technical qualities, not very common among other contemporary languages
edit: can I ask the reason for the downvote? I would really like to hear an opinion on what Pascal did wrong, having used it extensively in the late 80s until the end of the 90s and why my comment was awarded with a negative score.
Extended variants of Pascal, like Turbo Pascal, should not be confused with the Pascal language as designed by Niklaus Wirth.
Wirth's Pascal was a language designed for the purpose of teaching programming and it was adequate for that, but it was completely inappropriate for any serious work.
It had no means for writing big programs that must be divided into multiple source files, and it had a lot of design errors, like including the size of an array in its data type (which made it impossible to write any linear algebra library) or the way of handling variant records (which was insecure and verbose).
The well known paper "Why Pascal Is Not My Favorite Programming Language" by Brian W. Kernighan contains valid arguments against Pascal (against Wirth's Pascal, there have been many extended Pascals, especially following Turbo Pascal, which have corrected some of the most egregious defects of the original Pascal).
People keep playing this tired song, while clearly ignoring that until ISO C89, even C was full of dialects, a mix of K&R C and whatever the compiler writer felt like, filled with tons of Assembly.
Small-C, RatC,....
Additionally forgetting that Modula-2 came out in 1978, as a means to sort out all those issues - a language designed for systems programming, instead of the "Python" from 1970, designed for teaching.
With features that C is yet to have, 50 years later, while HN is hyping having them in a C-like syntax delivered in a Zig package.
I know that paper, and personally I think some of the critiques are actually qualities of Pascal, making it, while certainly not a completely refined language, a more modern language than C.
- no escape (AKA no casting): good!
- no default clause in case: good idea, not so good implementation (undefined behaviour)
- no break outside for loops: inconvenient, but that's how FP works. It is still debated today whether breaking out of loops is a good or a bad practice
- no separate compilation: I will quote Kernighan on this: "Theoretically, there is no need for separate compilation - if one's compiler is very fast." The Pascal compiler was fast, maybe not very fast, but speed was one of the primary goals for Wirth.
many other issues were similar in other languages and in C
Pascal had obviously its faults, but every language back then had some
Pascal was simple enough to make it easy to compile and implement. That's what Wirth taught - he's the author of Compiler Construction, after all. It wasn't like learning Python today as a data scientist.
Make the language more complex (and more useful/interesting) and you're stuck with a very slow or very buggy compiler that very few people would know how to implement.
I think there's a reason why we had Turbo Pascal and not Turbo Algol 68
In practice you need to couple it with an enum, and your visitation mechanism is a switch statement. But C doesn't impose that on you and lets you do it as you see fit.
You're confusing semantics with implementation. The point of union and discriminated union types (not what C calls union) is to enable compiler-checked pattern matching, which tagged unions in C plus a switch statement do not get you.
Tagged unions + pattern matching is what gp wants. You can always encode whatever model you want using any programming language, but language features/ergonomics matter.
That you can sort of simulate the skeleton of algebraic data types does not mean that C has algebraic data types. The whole point of the algebra part is that the syntax has a compositional semantics which is completely absent in C, unless you go to great lengths as with this macro header.
"The COW filesystem for Linux that won't eat your data".
This is, IMHO, a dig at Btrfs, which is famous for recovery problems. Btrfs doesn't have a working, reliable `fsck` replacement or a working `df` command that gives reliable results.
So, bcachefs aims to be better than Btrfs and that's a low bar to clear.
It's "better" than ZFS because it's GPL and designed to do the same as ZFS but be integrated into the Linux kernel, which ZFS can't be because its license doesn't allow it.
For this it merely needs to equal ZFS' features and functions, not exceed them.
Using OnlyOffice promotes the Microsoft document formats even further. Microsoft will again change their spec and OnlyOffice won't know how to render it correctly.
For the long term, it's better to choose LibreOffice, which uses open standards and does not purposefully change specs to break others.
It doesn't have a file creation dialog (nor global setting for this behavior), hence "promotes". New files are docx/xlsx/pptx. You have to manually use a "Save as" dialog to choose a corresponding LibreOffice format.
I deleted my Reddit account long ago, after they blocked VPNs. I usually used old.reddit.com from DDG or Google search results. Now this is fucked up. I am not able to access Reddit through either VPN or Tor. They are dead to me now.
I just want a way to completely remove Reddit links from appearing in search results. Does anyone have a solution?
You may be able to find an extension for your browser of choice to do just that! I have one for Firefox but I’m on mobile right now and can’t find it. I can send it later if you’d like. But they do exist!
I know. I was blaming Oracle, and Sun before them. Why can't they just relicense the entire thing as GPL or some other license that's actually open source and GPL-compatible such as MIT? They own the copyright, don't they? Surely they have the power to do that.
ZFS is too important to not be in Linux. Yet it's 2024 and it's still not in there. Because of licensing copyright nonsense. The blame falls solely on the copyright owners.
I'm very skeptical of installing apps to access Email or VPN. If the company turned evil tomorrow, apps can turn into malware.
Always use standard protocols using free software to transfer data over Internet. Use Thunderbird or Wireguard/OpenVPN client. Worst case you will give out data specific to that service. Don't encourage "apps" over protocols.
> flatpak is like a cross-distro snap that actually works
You are right. Besides the proprietary app store, Snap lacks even basic sandboxing on non-Debian-based distros. I can't believe that Canonical engineers thought this was a good idea when designing a cross-platform packaging format.
Flatpak got everything right from its initial design: multiple repositories, sandboxing that actually works on any Linux distro, OSTree to reduce disk space usage, etc.
We fucked up bad. Real bad. As software engineers, we have produced nothing but garbage in the last two decades.