I like this. It very much falls into the "make bad states unrepresentable" camp.
The issue I see with this approach is when developers stop at this first level of type implementation: everything is a type, nothing works well together, tons of types seem to be subtle permutations of each other, things get hard to reason about, etc.
In systems like that I would actually rather be writing a weakly typed dynamic language like JS or a strongly typed dynamic language like Elixir. However, if the developers continue pushing logic into type-controlled flows, e.g. move conditional logic into union types with pattern matching, leverage delegation, etc., the experience becomes pleasant again. Just as an example (probably not the actual best solution), the "DewPoint" function could just take either type and just work.
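Something like this minimal Rust sketch, say (the names and the dew point formula are stand-ins, not a worked-out design):

    // One union type over both units, so dew_point "just takes either type
    // and works". The conditional logic lives in a single match instead of
    // at every call site.
    enum Temperature {
        Celsius(f64),
        Fahrenheit(f64),
    }

    impl Temperature {
        fn to_celsius(&self) -> f64 {
            match self {
                Temperature::Celsius(c) => *c,
                Temperature::Fahrenheit(f) => (f - 32.0) * 5.0 / 9.0,
            }
        }
    }

    // Magnus approximation; the exact formula isn't the point here.
    fn dew_point(t: Temperature, relative_humidity: f64) -> f64 {
        let c = t.to_celsius();
        let gamma = (relative_humidity / 100.0).ln() + (17.62 * c) / (243.12 + c);
        (243.12 * gamma) / (17.62 - gamma)
    }

    fn main() {
        println!("{:.1}", dew_point(Temperature::Celsius(28.0), 65.0));
        println!("{:.1}", dew_point(Temperature::Fahrenheit(82.4), 65.0));
    }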
Yep. For this reason, I wish more languages supported bounded integers. Eg, rather than saying x: u32, I want to be able to use the type system to constrain x to the range [0, 10).
This would allow for some nice properties. It would also enable a bunch of small optimisations in our languages that we can't have today. Eg, I could make an integer that must fall within my array bounds. Then I don't need to do bounds checking when I index into my array. It would also allow a lot more peephole optimisations to be made with Option.
Weirdly, rust already kinda supports this within a function thanks to LLVM magic. But it doesn't support it for variables passed between functions.
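Rust does already hardcode one instance of the Option optimisation via its NonZero integer types, which hints at what a general bounded-integer feature could unlock:

    use std::mem::size_of;
    use std::num::NonZeroU32;

    fn main() {
        // The compiler uses the impossible value 0 as the None tag, so the
        // Option wrapper is free.
        assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
        // No such niche exists for a plain u32, so the tag costs space.
        assert_eq!(size_of::<Option<u32>>(), 8);
        // A general "integer in [0, 10)" type would hand the compiler a huge
        // niche for layouts like this, plus bounds-check elision.
    }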
Academic language designers do! But it takes a while for academic features to trickle down to practical languages—especially because expressive-enough refinement typing on even the integers leads to an undecidable theory.
I think the reasons are predominantly social, not theoretical.
For every engineer out there that gets excited when I say the words "refinement types" there are twenty that either give me a blank stare or scoff at the thought, since they a priori consider any idea that isn't already in their favorite (primitivistic) language either too complicated or too useless.
Then they go and reinvent it as a static analysis layer on top of the language and give it their own name and pat themselves on the back for "inventing" such a great check. They don't read computer science papers.
Hello! I was curious if you would happen to have any advice or particular comp sci papers you would point an aspiring compiler developer towards.
I think I'm sort of who you're talking about. I have no formal education and I am excited to have my compiler up to the point I can run a basic web server. I think it's a fairly traditional approach with a lexer, recursive descent parser, static analysis, then codegen. I'm going for a balance between languages like Ruby and Rust to get the best of both worlds.
You'll probably find it funny that I don't know the name for the technique I'm using for dynamic dispatch. The idea is that as long as a collection doesn't have mixed types then the compiler statically knows the type, even in loops and such. Only for mixed-type collections, or maybe trait functions, will the compiler be forced to fall back to runtime dynamic dispatch. I find this cool because experts can write fast static code, but beginners won't be blocked by the compiler complaining about things they shouldn't have to care about yet. But syntax highlighting or something may hint there are improvements to be made. If there is a name for this, or if it's too small a piece to deserve one, I would be very curious to know!
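In Rust terms, the two behaviours being chosen between look roughly like this (an illustration of the general idea, not of the compiler described above):

    trait Speak {
        fn speak(&self) -> &'static str;
    }

    struct Dog;
    struct Cat;
    impl Speak for Dog { fn speak(&self) -> &'static str { "woof" } }
    impl Speak for Cat { fn speak(&self) -> &'static str { "meow" } }

    fn main() {
        // Homogeneous collection: the element type is statically known, so
        // every call resolves (and can inline) at compile time.
        let dogs = vec![Dog, Dog];
        for d in &dogs { println!("{}", d.speak()); }

        // Mixed collection: calls fall back to runtime dispatch through a
        // vtable pointer.
        let mixed: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Cat)];
        for a in &mixed { println!("{}", a.speak()); }
    }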
On Refinement Types, I'm not sure they are a good idea for general-purpose languages and would love to be challenged on this. Succinctly, I think they're a leaky abstraction. To elaborate: having something like a `OneThroughTen` type seems helpful at first, but in reality it spreads behaviour potentially all over the app, as opposed to having a single function with the desired behaviour. If a developer is generating a number in multiple spots and one spot is missing a check and causes a bug, then hopefully a lesson was learned not to do that and to have a single spot for that logic instead. The heavy-handed complexity of Refinement Types is not worth it to solve this situation.
If there are any thoughts out there they would be greatly appreciated!
Aren't most type systems in widely used languages Turing complete and (consequently) undecidable? TypeScript and Python are two examples that come to mind.
But yeah, maybe expressive-enough refinement typing leads to hard-to-write and slow type inference engines.
Range checks in Ada are basically assignment guards with some cute arithmetic attached. Ada still does most of the useful checking at runtime, so you're really just introducing more "index out of bounds". Consider this example:
    procedure Sum_Demo is
       subtype Index is Integer range 0 .. 10;
       subtype Small is Integer range 0 .. 10;
       Arr : array(Index) of Integer := (others => 0);
       X : Small := 0;
       I : Integer := Integer'Value(Integer'Image(X)); -- runtime evaluation
    begin
       for J in 1 .. 11 loop
          I := I + 1;
       end loop;
       Arr(I) := 42; -- possible out-of-bounds access if I = 11
    end Sum_Demo;
This compiles, and the compiler will tell you: "warning: Constraint_Error will be raised at run time".
It's a stupid example for sure, but run it and you get exactly that:

    raised CONSTRAINT_ERROR : sum_demo.adb:13 index check failed
It's a cute feature, but it's useless for anything complex.
The comment to which I replied is about two concepts: Defining subtypes and an optimization made possible by them.
I get a lot of value out of compile time enforcement of subtypes (e.g., defining a function as requiring a parameter be Index instead of Integer) finding errors in my thinking. It is more than "cute" to me.
As for the possible optimization of the array bounds check, and it happening at runtime instead of compile time: isn't that a failure of GNAT and not Ada? Can't a sufficiently smart compiler detect the problem in your example?
Regardless, I am not convinced that getting around the compile-time type safety by casting with Integer'Value(Integer'Image(X)) proves anything. I will still prefer having subtypes over not having them.
I proposed a primitive for this in TypeScript a couple of years ago [1].
While I'm not entirely convinced myself whether it is worth the effort, it offers the ability to express "a number greater than 0". Using type narrowing and intersection types, open/closed intervals emerge naturally from that. Just check `if (a > 0 && a < 1)` and its type becomes `(>0)&(<1)`, so the interval (0, 1).
My specific use case is pattern matching http status codes to an expected response type, and today I'm able to work around it with this kind of construct https://github.com/mnahkies/openapi-code-generator/blob/main... - but it's esoteric, and feels likely to be less efficient to check than what you propose / a range type.
There's runtime checking as well in my implementation, but it's a priority for me to provide good errors at build time
Nim has range types built in:

    type
      Foo = range[1 .. 10]
      Bar = range[0.0 .. 1.0] # float works too

    var f: Foo = 42 # Error: cannot convert 42 to Foo = range 1..10(int)
    var p = Positive 22 # Positive and Natural types are pre-defined
This can be done in TypeScript. It's not super well known because of TypeScript's association with frontend and JavaScript. But TypeScript is a language with one of the most powerful type systems ever.
Among the popular languages like Go, Rust or Python, TypeScript has the most powerful type system.
How about a type with a number constrained between 0 and 10? You can already do this in TypeScript.
You can even programmatically define functions at the type level, so you can create a function that outputs a type between 0 and N.
    type Range<N extends number, A extends number[] = []> =
      A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;
The issue here is that it's a bit awkward: you want these types to compose, right? If I add two constrained numbers, say one with max value of 3 and another with max value of 2, the result should have max value of 5. TypeScript doesn't support this by default with default addition, but you can create a function that does this.
    // Build a tuple of length L
    type BuildTuple<L extends number, T extends unknown[] = []> =
      T['length'] extends L ? T : BuildTuple<L, [...T, unknown]>;

    // Add two numbers by concatenating their tuples
    type Add<A extends number, B extends number> =
      [...BuildTuple<A>, ...BuildTuple<B>]['length'];

    // Create a union: 0 | 1 | 2 | ... | N-1
    type Range<N extends number, A extends number[] = []> =
      A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;

    function addRanges<A extends number, B extends number>(
      a: Range<A>,
      b: Range<B>
    ): Range<Add<A, B>> {
      // The body can't be checked against the type-level arithmetic, so it
      // has to be asserted (see the caveat below).
      return (Number(a) + Number(b)) as unknown as Range<Add<A, B>>;
    }
The issue is that to create these functions you have to use tuples to do addition at the type level, and you need recursion as well. TypeScript recursion stops at 100, so there are limits.
Additionally, it's not intrinsic to the type system: you'd need Peano numbers built into the number system, and into the entire language by default, for this to work perfectly. That means the code inside the function is not type checked, but if you assume that code is correct, then the function type checks when composed with the other primitives of your program.
Complexity is bad in software. I think this kind of thing does more harm than good.
I get an error that I can't assign something that seems to me assignable, and to figure out why I need to study functions at type level using tuples and recursion. The cure is worse than the disease.
It can work. It depends on context. Say these types are from a well-renowned library, or one that's been used by the codebase for a long time.
If you trust the type, then it's fine: the code is safer, and in the world of the code itself things are easier.
Of course, like what you're complaining about, this opens up the possibility of more bugs in the world of types, and debugging that can be a pain. Trade-offs.
In practice people usually don't go crazy with type-level functions. They can do small stuff, but usually nothing super crazy. So TypeScript by design sort of fits the complexity dynamic you're looking for: yes, you can write type-level functions that are super complex, but the language is not designed around it and doesn't promote that style either. But you CAN go a little deeper with types than in a language with a less powerful type system, like say Rust.
TypeScript's type system is Turing complete, so you can do basically anything with it if this sort of thing is fun to you. Which is pretty much my problem with it: this sort of thing can be fun, feels intellectually stimulating. But the added power doesn't make coding easier or make the code more sound. I've heard this sort of thing called the "type puzzle trap" and I agree with that.
I'll take a modern Hindley-Milner variant any day. Sophisticated enough to model nearly any type information you'll have need of, without blurring the lines or admitting the temptation of encoding complex logic in it.
>Which is pretty much my problem with it: this sort of thing can be fun, feels intellectually stimulating. But the added power doesn't make coding easier or make the code more sound.
In practice nobody goes too crazy with it. You have a problem with a feature almost nobody uses. It's there, and Range<N> is like the upper bound of complexity I've seen in production, but even that is literally extremely rare.
There is no "temptation" of coding complex logic in it at all, as the language doesn't promote these features at all. It's just available if needed. It's not well known, but TypeScript types can easily be used one-to-one with any Hindley-Milner variant. It's the reputational baggage of JS and frontend that keeps this fact from being well known.
In short: TypeScript is more powerful than Hindley-Milner, a subset of it has one-to-one parity with it, and the parts that are more powerful than Hindley-Milner aren't popular or widely used, nor does the flow of the language itself promote their usage. The feature is just there if you need it.
If you want a language where you do this stuff in practice, take a look at Idris. That language has these features built into the language AND it's an ML-style language like Haskell.
I have definitely worked in TS codebases with overly gnarly types, and seen more experienced devs spend an entire workday "refactoring" a set of interrelated types, producing an even gnarlier one that more closely modeled some real-world system but was in no way easier to reason about or work with in code. The advantage of HM is the inference means there is no incentive to do this; it feels foolish from the beginning.
CoffeeScript went obsolete because many of its features were directly folded into JS with ES6.
If you think TypeScript will go the way of CoffeeScript: well, first off, CoffeeScript was waaay less popular than TypeScript. Second, if it does, it just means types will get folded directly into JS.
yes … I find it as useful as I used to find Perl (spent 4 years as a Perl coder back in the day) and use it for ad hoc scripts … data migration, PDF/CSV scraping, some LLM prompt engineering in a business setting; wrote a module to manage/install WordPress on EC2, for example
But wouldn't that also require code execution? For example, even though the compiler already knows the size of an array and could do a bounds check on direct assignment (arr[1] = 1), in some wild nested loop you could exceed the bounds in ways the compiler can't see.
Otherwise you could have type-level asserts more generally. Why stop at a range check when you could check a regex too? This makes the difficulty more clear.
For the simplest range case (pure assignment) you could just use an enum?
You can do this quite easily in Rust, but you have to overload operators to make your type make sense. That's also possible: you just need to define what type you get after dividing your type by a regular number (and vice versa, a regular number by your type), or what should happen if, when adding two of your types, the sum is higher than the maximum value. This is quite verbose, but it can be done with generics or macros.
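A hedged sketch of that approach (the Bounded name, the 0..=10 range, and returning Option on overflow are all choices made up for the example):

    use std::ops::Add;

    #[derive(Debug, Clone, Copy, PartialEq)]
    struct Bounded(u8); // invariant: value <= 10

    impl Bounded {
        fn new(v: u8) -> Option<Bounded> {
            (v <= 10).then_some(Bounded(v))
        }
    }

    impl Add<u8> for Bounded {
        // You have to decide what addition means when the sum can leave the
        // range; here it's None rather than a panic or saturation.
        type Output = Option<Bounded>;
        fn add(self, rhs: u8) -> Option<Bounded> {
            self.0.checked_add(rhs).and_then(Bounded::new)
        }
    }

    fn main() {
        assert_eq!(Bounded::new(9).unwrap() + 1, Bounded::new(10));
        assert_eq!(Bounded::new(10).unwrap() + 1, None);
    }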
You can do it at runtime quite easily in Rust. But the Rust compiler doesn't understand what you're doing, so it can't make use of that information for peephole optimisations or to elide array bounds checks when using your custom type. And you get runtime errors instead of compile-time errors if you try to assign the wrong value into your type.
ATS does this. Works quite well since multiplication by known factors and addition of type variables + inequalities is decidable (and in fact quadratic).
What the GP described could be achieved with dependent types, but could also be achieved with a less powerful type system, and the reduced power can sometimes lead to enormous benefits in terms of how pleasant it actually is to use. Check out "refinement types" (implemented in Liquid Haskell for example). Many constraints can be encoded in the type system, and an SMT solver runs at compile time to check if these constrains are guaranteed to be satisfied by your code. The result is that you can start with a number that's known to be in [0..10), then double it and add five, and then you can pass that to a function that expects a number in [10..20). Dependent types would typically require some annoying boilerplate to prove that your argument to the function would fall within that range, but an SMT solver can chew through that without any problem.
The full-blown version that guarantees no bounds-check errors at runtime requires dependent types (and consequently requires programmers to work with a proof assistant, which is why it's not very popular). You could have a more lightweight version that instead just crashes the program at runtime if an out-of-range assignment is attempted, and optionally requires such fallible assignments to be marked as such in the code. Rust can do this today with const generics, though it's rather clunky as there's very little syntactic sugar and no implicit widening.
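For reference, the clunky const-generics version might look something like this (a sketch under those caveats; Ranged and its API are invented for illustration):

    #[derive(Debug, Clone, Copy)]
    struct Ranged<const MIN: i64, const MAX: i64>(i64); // half-open [MIN, MAX)

    impl<const MIN: i64, const MAX: i64> Ranged<MIN, MAX> {
        // The fallible assignment is explicit in the code...
        fn new(v: i64) -> Result<Self, i64> {
            if (MIN..MAX).contains(&v) { Ok(Self(v)) } else { Err(v) }
        }
        fn get(self) -> i64 { self.0 }
    }

    fn takes_digit(d: Ranged<0, 10>) -> i64 {
        d.get() * 2
    }

    fn main() {
        let d = Ranged::<0, 10>::new(7).unwrap();
        println!("{}", takes_digit(d));
        // ...and out-of-range values are only caught at runtime:
        assert!(Ranged::<0, 10>::new(12).is_err());
        // No implicit widening either: a Ranged<0, 5> isn't a Ranged<0, 10>
        // without writing the conversion yourself.
    }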
AIUI WUFFS doesn't need a full blown proof assistant because instead of attempting the difficult problem "Can we prove this code is safe?" it has the programmer provide elements of such a proof as they write their program so it can merely ask "Is this a proof that the program is safe?" instead.
This is also approximately true of Idris. The thing that really helps Wuffs is that it's a pretty simple language without a lot of language features (e.g., no memory allocation and only very limited pointers) that complicate proofs. Also, nobody is particularly tempted to use it and then finds it unexpectedly forbidding, because most programmers don't ever have to write high-performance codecs; Wuffs's audience is people who are already experts.
Also, Wuffs doesn't let you prove arbitrary correctness properties, it aims only to prove the absence of memory corruption. That reduces how expressive the proof system has to be.
There are refinement types, which are less general than dependent types, but sufficient to provide ranges, and simpler to implement because the type only needs to be associated with a predicate.
Spec and Malli in Clojure feel like effing magic: so expressive, so simple. The "data as code" philosophy means validation schemas are first-class citizens that can be manipulated, stored, transmitted, and reasoned about like any other data, and that's just beautiful. Some type checks you can express in them are nearly impossible to achieve in TS, Rust or even Haskell. Clojure makes the complex feel simple because the language itself is designed around data transformation.
That power of course does come with a price: there does not exist a static analyzer that automatically checks things, even though you can pretty much generate beautiful tests based on specs. I think e.g. Rust teams can have more junior devs safely contribute because the compiler enforces discipline and allows less variability in code quality. Clojure teams need higher baseline discipline but can move incredibly fast when everyone's aligned.
It's saddening to see when Clojure gets outright dismissed for being "untyped", even though it absolutely can change one's perspective about type systems.
    > 1 + "1"
    (irb):1:in 'Integer#+': String can't be coerced into Integer (TypeError)
        from (irb):1:in '<main>'
        from <internal:kernel>:168:in 'Kernel#loop'
        from /Users/george/.rvm/rubies/ruby-3.4.2/lib/ruby/gems/3.4.0/gems/irb-1.14.3/exe/irb:9:in '<top (required)>'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in 'Kernel#load'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in '<main>'
Some people mistakenly call dynamic typing "weak typing" because they don't know what those words mean. PSA:
Static typing / dynamic typing refers to whether types are checked at compile time or runtime. "Static" = compile time (eg C, C++, Rust). "Dynamic" = runtime (eg Javascript, Ruby, Excel)
Strong / weak typing refers to how "wibbly wobbly" the type system is. x86 assembly language is "weakly typed" because registers don't have types. You can do (more or less) any operation with the value in any register. Like, you can treat a register value as a float in one instruction and then as a pointer during the next instruction.
Ruby is strongly typed because all values in the system have types. Types affect what you can do. If you treat a number like it's an array in Ruby, you get an error. (But the error happens at runtime because Ruby is dynamically typed - thus typechecking only happens at runtime!)
It's strongly typed, but it's also duck typed. Also, in ruby everything is an object, even the class itself, so type checking there is weird.
Sure, it stops you from running into "'1' + 2" issues, but it won't stop you from yeeting VeryRawUnvalidatedResponseThatMightNotBeAuthorized to a function that takes TotalValidatedRequestCanUseDownstream. You won't even notice an issue until:
- you manually validate
- you call a method that is unavailable on the wrong object.
I recall a type theorist once defined the terms as follows (can't find the source): "A strongly typed language is one whose type system the speaker likes. A weakly typed language is one whose type system the speaker dislikes."
So yeah I think we should just give up these terms as a bad job. If people mean "static" or "dynamic" then they can say that, those terms have basically agreed-upon meanings, and if they mean things like "the type system prohibits [specific runtime behavior]" or "the type system allows [specific kind of coercion]" then it's best to say those things explicitly with the details filled in.
It would be weak if that was actually mutating the first “a”. That second declaration creates a new variable using the existing name “a”. Rust lets you do the same[1].
Rust lets you do the same because the static typing keeps you safe. In Rust, treating the second 'a' like a number would be an error. In ruby, it would crash.
These are two entirely different a's; you're just storing them under the same variable name. You can do the same in Rust (we agree it's statically and strongly typed, right?):

    let a = 1;
    let a = '1';

Strong typing means I can't do 1 + '1'. Variable names and rebinding have nothing to do with it being strongly typed.
In the dynamic world being able to redefine variables is a feature not a bug (unfortunately JS has broken this), even if they are strongly typed. The point of strong typing is that the language doesn't do implicit conversions and other shenanigans.
Well yeah, because variables in what you consider to be a strongly typed language are allocating the storage for those variables. When you say int x you're asking the compiler to give you an int-shaped box. When you say x = 1 in Ruby all you're doing is saying that in this scope the name x now refers to the box holding a 1. You can't actually store a string in the int box, you can only say that from now on the name x refers to the string box.
The “Stop at first level of type implementation” is where I see codebases fail at this. The example of “I’ll wrap this int as a struct and call it a UUID” is a really good start and pretty much always start there, but inevitably someone will circumvent the safety. They’ll see a function that takes a UUID and they have an int; so they blindly wrap their int in UUID and move on. There’s nothing stopping that UUID from not being actually universally unique so suddenly code which relies on that assumption breaks.
This is where the concept of “Correct by construction” comes in. If any of your code has a precondition that a UUID is actually unique then it should be as hard as possible to make one that isn’t. Be it by constructors throwing exceptions, inits returning Err or whatever the idiom is in your language of choice, the only way someone should be able to get a UUID without that invariant being proven is if they really *really* know what they’re doing.
(Sub UUID and the uniqueness invariant for whatever type/invariants you want, it still holds)
> This is where the concept of “Correct by construction” comes in.
This is one of the basic features of object-oriented programming that a lot of people tend to overlook these days in their repetitive rants about how horrible OOP is.
One of the key things OO gives you is constructors. You can't get an instance of a class without having gone through a constructor that the class itself defines. That gives you a way to bundle up some data and wrap it in a layer of validation that can't be circumvented. If you have an instance of Foo, you have a firm guarantee that the author of Foo was able to ensure the Foo you have is a meaningful one.
Of course, writing good constructors is hard because data validation is hard. And there are plenty of classes out there with shitty constructors that let you get your hands on broken objects.
But the language itself gives you direct mechanism to do a good job here if you care to take advantage of it.
Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention. But it's a pattern in those languages where it's a language feature in OO languages. (And as any functional programmer will happily tell you, a design pattern is just a sign of a missing language feature.)
I find regular OOP language constructors are too restrictive. You can't return something like Result<CorrectObject, ConstructorError> to handle the error gracefully, or return a specific subtype; you need a static factory method to do anything more than guaranteed-successful construction without exceptions.
Does this count as a missing language feature, by requiring a "factory pattern" to achieve that?
The natural solution for this is a private constructor with public static factory methods, so that the user can only obtain an instance (or the error result) by calling the factory methods. Constructors need to be constrained to return an instance of the class, otherwise they would just be normal methods.
Convention in OOP languages is (un?)fortunately to just throw an exception though.
In languages with generic types such as C++, you generally need free factory functions rather than static member functions so that type deduction can work.
> You can't return something like Result<CorrectObject,ConstructorError> to handle the error gracefully
Throwing an error is doing exactly that though; it's exactly the same thing in theory.
What you are asking for is just more syntactic sugar around error handling; otherwise all of that already exists in most languages. If you are talking about performance, that can easily be optimized at compile time for those short throw/catch syntactic-sugar blocks.
Java even forces you to handle those errors in code, so don't say that these are silent; there is no reason they need to be.
This is why constructors are dumb IMO and the Rust way is the right way.
Nothing stops you from returning Result<CorrectObject, ConstructorError> from a CorrectObject::new(..) function, because it's just a regular function; struct field visibility takes care of you not being able to construct an incorrect CorrectObject.
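Concretely, a minimal sketch of that shape (with a made-up "must be non-negative" invariant standing in for a real one):

    mod correct {
        pub struct CorrectObject {
            value: i32, // private field: no literal construction outside the module
        }

        #[derive(Debug)]
        pub struct ConstructorError;

        impl CorrectObject {
            // Just a regular function that happens to be the only way in.
            pub fn new(value: i32) -> Result<CorrectObject, ConstructorError> {
                if value >= 0 {
                    Ok(CorrectObject { value })
                } else {
                    Err(ConstructorError)
                }
            }

            pub fn value(&self) -> i32 {
                self.value
            }
        }
    }

    fn main() {
        assert!(correct::CorrectObject::new(-1).is_err());
        let ok = correct::CorrectObject::new(7).unwrap();
        println!("{}", ok.value());
        // correct::CorrectObject { value: -1 } // won't compile: `value` is private
    }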
I don't see this having much to do with OOP vs FP, but rather with the ease with which a language lets you create nominal types and functions that can nicely fail.
What sucks about OOP is that it also holds your hand into antipatterns you don't necessarily want, like adding behavior to what you really just wanted to be a simple data type because a class is an obvious junk drawer to put things.
And, like your example of a problem in FP, you have to be eternally vigilant with your own patterns to avoid antipatterns like when you accidentally create a system where you have to instantiate and collaborate multiple classes to do what would otherwise be a simple `transform(a: ThingA, b: ThingB, c: ThingC): ThingZ`.
Finally, as "correct by construction" goes, doesn't it all boil down to `createUUID(string): Maybe<UUID>`? Even in an OOP language you probably want `UUID.from(string): Maybe<UUID>`, not `new UUID(string)` that throws.
> Even in an OOP language you probably want `UUID.from(string): Maybe<UUID>`, not `new UUID(string)` that throws.
One way to think about exceptions is that they are a pattern matching feature that privileges one arm of the sum type with regards to control flow and the type system (with both pros and cons to that choice). In that sense, every constructor is `UUID.from(string): MaybeWithThrownNone<UUID>`.
The best way to think about exceptions is to consider the term literally (as in: unusual; not typical) while remembering that programmers have an incredibly overinflated sense of ability.
In other words, exceptions are for cases where the programmer screwed up. While programmers screwing up isn't unusual at all, programmers like to think that they don't make mistakes, and thus in their eyes it is unusual. That is what sets it apart from environmental failures, which are par for the course.
To put it another way, it is for signalling at runtime what would have been a compiler error if you had a more advanced compiler.
Unfortunately many languages treat exceptions as a primary control flow mechanism. That's part of why Rust calls its exceptions "panics" and provides the "panic=abort" compile-time option which aborts the program instead of unwinding the stack with the possibility of catching the unwind. As a library author you can never guarantee that `catch_unwind` will ever get used, so its main purpose of preventing unwinding across an FFI boundary is all it tends to get used for.
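A small demonstration of the unwinding half of that (the panic=abort behaviour is a build flag, so it's only described in the comments):

    use std::panic;

    fn main() {
        // Under the default panic=unwind, a panic can be caught:
        let result = panic::catch_unwind(|| {
            let v: Vec<i32> = vec![];
            v[0] // out-of-bounds index panics
        });
        assert!(result.is_err());
        println!("survived the panic");
        // Under -C panic=abort the process would have died before the
        // assert, which is why library code can't rely on catch_unwind.
    }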
Just Java (and Javascript by extension, as it was trying to copy Java at the time), really. You do have a point that Java programmers have infected other languages with their bad habits. For example, Ruby was staunchly in the "return errors as values and leave exception handling for exceptions" before Rails started attracting Java developers, but these days all bets are off. But the "purists" don't advocate for it.
> Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention
In Haskell:
1. Create a module with some datatype
2. Don't export the datatype's constructors
3. Export factory functions that guarantee invariants
How is that more complicated than creating a class and adding a custom constructor? Especially if you have multiple datatypes in the same module (which in e.g. Java would force you to add multiple files, and if there's any shared logic, well, that will have to go into another extra file - thankfully some more modern OOP languages are more pragmatic here).
(Most) OOP languages treat a module (an importable, namespaced subunit of a program) and a type as the same thing, but why is this necessary? Languages like Haskell break this correspondence.
Now, what I'm missing from Haskell-type languages is parameterised modules. In OOP, we can instantiate classes with dependencies (via dependency injection) and then call methods on that instance without passing all the dependencies around, which is very practical. In Haskell, you can simulate that with currying, I guess, but it's just not as nice.
Indeed, OOP and FP both allow and encourage attaching invariants to data structures.
In my book, that's the most important difference with C, Zig or Go-style languages, that consider that data structures are mostly descriptions of memory layout.
I've recently been following red-green-refactor but instead of with a failing test, I tighten the screws on the type system to make a production-reported bug cause the type checker to fail before making it green by fixing the bug.
I still follow TDD-with-a-test for all new features, all edge cases and all bugs that I can't trigger failure by changing the type system for.
However, red-green-refactor-with-the-type-system is usually quick and can be used to provide hard guarantees against entire classes of bug.
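As a sketch of what that tightening can look like (a hypothetical bug where unverified email addresses got mailed; all names invented):

    // Red: after the bug report, change send_welcome to demand a
    // VerifiedEmail. Every call site passing a raw Email stops compiling.
    struct Email(String);
    struct VerifiedEmail(String);

    fn verify(e: Email) -> Option<VerifiedEmail> {
        // Stand-in check; the real one would consult the verification store.
        if e.0.contains('@') { Some(VerifiedEmail(e.0)) } else { None }
    }

    fn send_welcome(to: &VerifiedEmail) {
        println!("sending welcome to {}", to.0);
    }

    fn main() {
        let e = Email("user@example.com".into());
        // send_welcome(&e); // the buggy call no longer type checks
        if let Some(v) = verify(e) {
            send_welcome(&v); // Green: the only way through is verification.
        }
    }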
I like this approach. There are often calls for increased testing on big systems, when what they really mean is increased rigor. Don't waste time testing what you can move into the compiler.
It is always great when something is so elegantly typed that I struggle to think of how to write a failing test.
What drives me nuts is when there are tests left around that basically test the compiler and never were "red" then "greened". It makes me wonder if there is some subtle edge case I am missing.
As you move more testing responsibilities to the compiler, it can be valuable to test the compiler’s responsibilities for those invariants though. Otherwise it can be very hard to notice when something previously guaranteed statically ceases to be.
I found myself following a similar trajectory, without realizing that’s what I was doing. For a while it felt like I was bypassing the discipline of TDD that I’d previously found really valuable, until I realized that I was getting a lot of the test-first benefits before writing or running any code at all.
Now I just think of types as the test suite’s first line of defense. Other commenters who mention the power of types for documentation and refactoring aren’t wrong, but I think that’s because types are tests… and good tests, at almost any level, enable those same powers.
I don't think tests and types are the same "thing" per se; they work vastly better in conjunction with each other than alone, and are weirdly symmetrical in the way that they're bad substitutes for each other.
However, I'm convinced that they're both part of the same class of thing, and that "TDD" or red/green/refactor or whatever you call it works on that class, not specifically just on tests.
Documentation is a funny one too: I use my types to generate API and other sorts of reference docs, and tests to generate how-to docs. There is a seemingly inextricable connection between types and reference docs, and between tests and how-to docs.
Types are a kind of test. Specifically they’re a way to assert certain characteristics about the interactions between different parts of the code. They’re frequently assertions you’d want to make another way, if you didn’t have the benefit of a compiler to run that set of assertions for you. And like all tests, they’re a means to gain or reinforce confidence in claims you could make about the code’s behavior. (Which is their symmetry with documentation.)
As you mentioned correctly: if you go for strong types in a library, you should go all the way.
And that means your strong type should provide clear functions for its conversion to certain other types. Some of which you nearly always need like conversion to a string or representation as a float/int.
The danger of that is of course that you provide a ladder over the wall you just built: instead of using the type's own conversion methods, people go the shortcut route via the numeric representation and may forget the conversion factor. In that case I'd argue it is best to always represent temperature as one unit (Kelvin or Celsius, depending on the math you need to do with it) and then just add a .display(Unit::Fahrenheit) method that returns a string. If you really want to convert to TemperatureF for a calculation, you would have to use a dedicated method that converts from one type to another.
One thing to consider as well is that you can mix up absolute values ("it is 28°C outside") and temperature deltas ("this is 2°C warmer than the last measurement"). If you're controlling high-energy heaters, mixing those up can ruin your day, which is why you could use different types for absolutes and deltas (or a flag within one type). Datetime libraries often do that as well (in Python, for example, you have datetime for absolute and timedelta for relative time).
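A sketch of that absolute-vs-delta split in Rust, datetime/timedelta style (the names and f64 representation are arbitrary choices):

    use std::ops::{Add, Sub};

    #[derive(Debug, Clone, Copy, PartialEq)]
    struct Celsius(f64); // absolute: "it is 28°C outside"

    #[derive(Debug, Clone, Copy, PartialEq)]
    struct CelsiusDelta(f64); // relative: "2°C warmer than last time"

    impl Add<CelsiusDelta> for Celsius {
        type Output = Celsius;
        fn add(self, d: CelsiusDelta) -> Celsius {
            Celsius(self.0 + d.0)
        }
    }

    impl Sub for Celsius {
        // Subtracting two absolute readings yields a delta, not an absolute.
        type Output = CelsiusDelta;
        fn sub(self, other: Celsius) -> CelsiusDelta {
            CelsiusDelta(self.0 - other.0)
        }
    }

    fn main() {
        let yesterday = Celsius(26.0);
        let today = yesterday + CelsiusDelta(2.0);
        assert_eq!(today - yesterday, CelsiusDelta(2.0));
        // yesterday + today // won't compile: adding two absolutes is meaningless
    }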
Union types!! If everything’s a type and nothing works together, start wrapping them in interfaces and define an über type that unions everything everywhere all at once.
Welcome to typescript. Where generics are at the heart of our generic generics that throw generics of some generic generic geriatric generic that Bob wrote 8 years ago.
Because they can’t reason about the architecture they built, they throw it at the type system to keep them in line. It works most of the time. Rust’s is beautiful at barking at you that you’re wrong. Ultimately it’s us failing to design flexibility amongst ever-increasing complexity.
Remember when “Components” were “Controls” and you only had like a dozen of them?
Remember when a NN was only a few hundred thousand parameters?
As complexity increases with computing power, so must our understanding of it in our mental model.
However you need to keep that mental model in check, use it. If it’s typing, do it. If it’s rigorous testing, write your tests. If it’s simulation, run it my friend. Ultimately, we all want better quality software that doesn’t break in unexpected ways.
Union types are great, but alone they are not sufficient for many cases. For example, try to define a data structure that captures a classical evaluation tree.
You might go with:
    type Expression = Value | Plus | Minus | Multiply | Divide;

    interface Value { type: "value"; value: number; }
    interface Plus { type: "plus"; left: Expression; right: Expression; }
    interface Minus { type: "minus"; left: Expression; right: Expression; }
    interface Multiply { type: "multiply"; left: Expression; right: Expression; }
    interface Divide { type: "divide"; left: Expression; right: Expression; }
And so on.
That looks nice, but when you try to pattern match on it and have your pattern matching return the types that are associated with the specific operation, it won't work. The reason is that TypeScript does not natively support GADTs. Libs like ts-pattern use some tricks to get close-ish at least.
And while this might not be very important for most application developers, it is very important for library authors, especially to make libraries interoperable with each other and extend them safely and typesafe.