Statically-typed error handling in Python using Mypy (beepb00p.xyz)
198 points by Agathos on Dec 8, 2019 | 126 comments


Here's my opinion, having written many thousands of lines of mypy code.

* Third party libraries are still not typed, which sucks

* Inference is weak. Sometimes an 'if' statement narrows accurately, sometimes it doesn't.

* Generics are extremely confusing, more so than in any other typed language I have used. Generics involving Any are even more confusing.

* Fundamentally, Python is structured such that the concept of an interface is coupled tightly to the implementation of a class, which is severely limiting. As an example, module A can declare a class, module B can declare an interface (Protocol), but module B cannot implement that interface for A's class without modifying A's code (see the sketch after this list).

* Errors from mypy are still very light. There are very few "Try XYZ to fix it" suggestions; it's usually just a single line of red text that amounts to an assertion.

* Still weak support for JSON (use Dict[str, Any])

* Slowly improving but still not amazing support for Self types
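
To make the Protocol point concrete, here is a minimal sketch (module and class names are invented): conformance is structural, so a foreign class either already matches the Protocol or you have to wrap it; there is no way to supply an implementation for someone else's class after the fact.

    # module_a.py (hypothetical third-party code)
    class Point:
        def __init__(self, x: float, y: float) -> None:
            self.x = x
            self.y = y

    # module_b.py (our code); Protocol is in typing on 3.8+,
    # in typing_extensions before that
    from typing import Protocol

    class Drawable(Protocol):
        def draw(self) -> str: ...

    def render(d: Drawable) -> None:
        print(d.draw())

    # render(Point(1, 2))  # error: Point has no draw(); short of
    # monkey-patching module_a, the only option is a wrapper:
    class DrawablePoint:
        def __init__(self, p: Point) -> None:
            self.p = p

        def draw(self) -> str:
            return f"({self.p.x}, {self.p.y})"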

I could go on.

Mypy is awesome, but it still feels very immature, as it should, since it's pre-1.0. It does not come close to matching the experience of other languages with static types.


* Third party libraries: I sometimes see them annotated, but what people often don't realize is that you need to include a 'py.typed' file with your package in order for the annotations to be discoverable (see the sketch after these bullets). Perhaps that's something setuptools could warn the developer about...

* Regarding errors: there are a few quite recent flags: '--pretty', '--show-error-context' and '--show-error-codes'; they make it a bit more pleasant. You can see me using them in this section: https://beepb00p.xyz/mypy-error-handling.html#container
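
The minimal setup.py sketch referenced above (PEP 561; 'mypkg' is a placeholder name):

    # setup.py: ship the PEP 561 marker so checkers pick up inline types
    from setuptools import setup, find_packages

    setup(
        name="mypkg",
        packages=find_packages(),
        package_data={"mypkg": ["py.typed"]},  # py.typed is an empty marker file
        zip_safe=False,  # mypy can't read types out of zipped installs
    )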


The 'py.typed' thing is extremely unintuitive; I package up Python libs and I'm still not sure I've done it right.

Thank you for letting me know about those flags, I had not heard of them before!


> Still weak support for JSON (use Dict[str, Any])

Python 3.8 added TypedDict, which finally helps with this particular shortcoming.
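
A quick sketch of what that buys you (field names invented):

    from typing import TypedDict  # 3.8+; mypy_extensions before that

    class Movie(TypedDict):
        title: str
        year: int

    m: Movie = {"title": "Blade Runner", "year": 1982}
    m["yeer"]  # mypy error: TypedDict "Movie" has no key 'yeer'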


I was surprised to find out that MyPy will throw an error if you use a variable to access a key (https://github.com/python/mypy/issues/7178).

Optional keys are also a nuisance.


How are optional keys a nuisance? I find that using the patterns:

    if "optional_key" in my_dict:
        do_something_with(my_dict["optional_key"])
Or:

    optional_value = my_dict.get("optional_key")
    if optional_value:
        do_something_with(optional_value)

work as expected. Nested optional keys can be a bit annoying, although:

    my_dict.get("optional_parent", {}).get("optional_child")
seems to work.


I was referring to defining optional keys.

If you have a dict with some keys which are optional, you need to create a separate subclass (with `total=False`) just for those optional keys.

With TypeScript, I can just use `key?: type`.

https://mypy.readthedocs.io/en/latest/more_types.html#mixing...
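
The pattern from those docs looks roughly like this (class names invented):

    from typing import TypedDict

    class _MovieBase(TypedDict):  # required keys
        title: str

    class Movie(_MovieBase, total=False):  # optional keys
        year: int

    m: Movie = {"title": "Alien"}  # ok: 'year' may be omitted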


Wonderful, that's great to hear, and I can move to 3.8 quite easily myself.

It's nice to see strong progress in such important areas. My post is probably overly critical, but I write Python almost every day, and maintaining a highly-typed library can be demoralizing at times.


Btw, TypedDict has been available in older versions of Python via `mypy_extensions`:

    from mypy_extensions import TypedDict

https://github.com/python/mypy_extensions


In my experience, TypedDict has its own shortcomings, most of which boil down to the fact that it's not truly structural subtyping, because 'extra' keys are treated as errors.


Having recently found myself switching between Python and Scala, I've come to value IntelliJ's type hinting and red squiggly prompts, not to mention compile-time errors.

While mypy and type-hinting support in PyCharm and VSCode is great, it's not as seamless an experience as with typed languages, and the lack of (optional) runtime typing still allows all sorts of bad things to happen. A "safe" mode for Python, where typing and perhaps some other things are checked at run time, might be an interesting advancement for those in the community building larger or mission-critical systems.


My opinion is that Python shouldn't be used for mission-critical systems anyway.


I'm on the other side: having gotten deeper into languages like C# and Java, I think those technologies are vastly overused, and great languages like Python and TypeScript are underused.


How do you figure? If it’s mission critical, it means failure is unacceptable. Python and other dynamic languages (TS is a middle ground here) excel when failure isn’t so bad and making and deploying changes is fast/easy/cheap. This is exactly the wrong tradeoff for mission critical systems.


>How do you figure? If it’s mission critical, it means failure is unacceptable. Python and other dynamic languages (TS is a middle ground here) excel when failure isn’t so bad

In practice I find that static typing as done by, say, Java and C# excels at uncovering very shallow bugs. Its benefit is most obvious when you do very little automated testing. The tech industry's dirty little secret is that automated testing is done infrequently and badly, so static typing looks pretty effective.

I find that if you try to compare the cost of building sophisticated tests in a high-productivity language (like Python) with the cost of writing code in a low-productivity/high-type-safety language like Haskell, there's a pretty blurry cost/benefit trade-off. If you take productivity out as a factor, Haskell looks much better.

Nonetheless, the idea that static types (and, to a lesser extent, formal proofs) are some sort of silver bullet persists. There is no silver bullet.


For sure static types are no silver bullet; no one in this thread is claiming otherwise and anyone who is making that claim is wrong. And of course various languages like Haskell and Java and C# have a productivity toll (although I’ve heard the latter two have improved considerably since I last used them), but I think almost none of that is due to static types intrinsically. Haskell’s productivity toll is due largely to its insistence on functional purity, bizarre syntax, inaccessible and jargon-laden culture and documentation, and general lack of attention to practical concerns. Java and C# suffer from an abundance of boilerplate, a large feature matrix, a bunch of baggage from an era where inheritance was shoehorned into everything, and similar “enterprise” cultural baggage (“IAbstractBeanFactory”).

A good counterpoint is Go. Despite my being far more familiar with Python (it's my professional/work language), Go's static types actually help me write correct code more quickly than I can with Python. Your mileage may vary, but I think anyone who has given Go an honest shake will find that it's at least in the same productivity ballpark as Python. Note that this isn't considering tooling or deployment, or performance, where Go specifically excels over Python (and often, again, due to static types).

Of course, with sufficient investment in testing, you could get the same confidence with Python that you get in static languages, but writing tests is a productivity cost as much as boilerplate or a pedantic compiler. And with static languages, I often find that I can write fewer tests for the same confidence (in fact, I often prototype in Go and backport to Python).


Static types do help eliminate errors and unwanted behavior without requiring you to write explicit tests to verify the same outcomes. A good static type system helps you reason about the code and therefore increases productivity.

On that note, why would you consider Haskell a low productivity language? People who are well versed in it seem to find it an exceptionally productive language to work with.


Types can help eliminate errors and unwanted behavior, yes. Just like tests.

Types also help document code. Just like tests.

The point is that both are investments and both have different payoff matrices. Sophisticated type systems are often better at preventing obscure logical bugs. They're also good at uncovering obscure not-bugs and preventing code from getting out until the compiler is satisfied. Bad unit tests also do this.

Haskell simply takes longer to write than other languages. Given two developers of equal skill and experience, anyway. I partly attribute the relative paucity of haskell software out there to this.


To be clear, there are many other statically typed languages besides Haskell, most of which are more amenable to a faster pace of development.

Further, while I agree about the “different payoff matrices”, I think static types are a very low investment with a respectable payout, not only in preventing bugs but also in documenting code and facilitating tooling (such as autocomplete or documentation generation); they also permit easier code changes than comprehensive unit testing does (even “good” unit testing). Of course, I'm not advocating that static types completely obviate tests, rather that they obviate some of the tests; however, there is no clear answer as to how many or which tests are obviated. It's very circumstantial.


From the paragraph following the "no silver bullet" paragraph:

> Skepticism is not pessimism, however. Although we see no startling breakthroughs, and indeed, believe such to be inconsistent with the nature of software, many encouraging innovations are under way. A disciplined, consistent effort to develop, propagate, and exploit them should indeed yield an order-of-magnitude improvement. There is no royal road, but there is a road.


I thought the emerging story was one of integrated testing and style conformity for the entire team. Who is working at a big software company for which this is _not_ true?


Millions?


Research doesn't support the claim that there is a strong correlation between static typing and fewer bugs. Is your opinion based on your gut feeling, or do you have data on the matter that you can provide?

https://web.cs.ucdavis.edu/~filkov/papers/lang_github.pdf


The type of bug (and related failure modes) is important though. Would you willingly board a plane if the avionics software and firmware had been written in a dynamically typed language?

Alternatively, usage of Rust may not be correlated with fewer bugs overall but I'd expect there to be fewer of the use-after-free variety.


I would prefer it to be written in something like Ada or Haskell, but I'd rather board a plane with avionics software written in Python than in Java.

I actually heard Airbus software is written in C++. If true, it should scare the shit out of anyone.


As someone who enjoys Python quite a bit, I think Java has a lot of capabilities that would be really important there and that Python is lacking, like real-time support, while Python offers no real benefit.


Space shuttle software is written in C, and lots of critical software is written in C++. Discipline, standards, and experience go a long way. Static verification is also nice.


I am seeing Ada popping up a lot more lately; is Ada still being used?


The mere fact that something is "research" does not mean that it is credible. The study you reference in particular has a lot of issues: https://buttondown.email/hillelwayne/archive/science-turf-wa...


Of course, but research is still the best we have when it comes to supporting a claim or trying to establish certain truths about the world.

If you don't believe the results of the study, please provide the counter evidence.


Measuring programmer productivity is very hard using standard research tools so they are not necessarily the best we have. This is not physics.

Besides, all I wanted to do was to point out that you shouldn’t derive any conclusions from that study. I don’t need any counter evidence for that (apart from the evidence that the study is flawed of course).


I would say experience is the best we have.


Research is actually just all over the place (lots of papers come to many, many different conclusions), because it's impossible to control for various things like developer experience.

Instead, common sense is actually the best substitute we have.


It's based on experience and a general consensus among the most experienced developers I know. The research isn't remotely conclusive; it's not worth anything until they control for much, much more.


Well, from what I've seen, if developers can't use a dynamic language, then they're highly prone to cheat: e.g. not validating on I/O and using things like the C# 'dynamic' type everywhere.

Point is, code culture matters much more than language.


My approach is to dial up strictness gradually as code proves its value. I'll start out building a project and not validating on I/O, but as the requirements get locked down and the code has proven itself, I'll clean up all the edge cases - which will often mean adding in progressively stricter validation on border code.

The advantage of this is that you end up not wasting too much time "building the wrong thing". Let's say that you took one form of I/O and built massively strict validation in, and then realized later that you should have taken an entirely different form of I/O for your subsystem. All that time building in validation on that useless part of the code was a pointless waste.

I don't have any stats, but my gut feel is that on average 40% of code can end up being tossed in this way (in some projects it's 100% =).

Prototyping speed is, additionally, not just useful in reducing the cost of building the right kind of code, it's useful in reducing the cost of building the right kind of test (a really underappreciated facet of building mission critical systems).

In my younger years I used to believe that for mission-critical systems "building the wrong thing" was somehow less of a problem because you could fix requirements and do architecture upfront with some sort of genius architect. Turns out this was wrong.


> using things like the c# dynamic type everywhere

I'm curious as to where you have observed this, because my experience has been exactly the opposite: even in circumstances where "dynamic" might lead to more readable code, C# developers are loath to use it, to the point where it's very hard to find it in idiomatic C# code.


Why do dynamic languages excel when failure is not so bad? What makes non-dynamic (compiled?) languages better when failure is bad?

I'm parsing your comment to mean "it's easier to write correct code in compiled languages", but this is not obvious to me, or to anyone who has, for instance, written any C at all.


C is the wrong comparison to make here - it's extremely low level. Java or perhaps C++ would be much closer.

* Ahead of time compiled languages reduce flexibility but allow the compiler to do more reasoning about the system.

* Statically typed languages allow the compiler to reason about the assignments you make and methods you call. If the language is also AOT compiled, you avoid crashing at runtime.

* Dynamically typed languages significantly reduce boilerplate and avoid the mental effort of expressing an idea in a rigorously typed manner. It's the "hold my beer" approach.


C is a good example precisely because it highlights the fact that explicit typecasts and memory safety are a LOT better at preventing bugs than static types.


> Why do dynamic languages excel when failure is not so bad?

There's a usual confusion of banking in this case, but if we're talking about popular dynamic/memory-managed languages, you can put an equivalent of:

    try:
        actual_code()
    except Exception:
        log.exception("actual code failed")  # log it and keep going

at a high level and be fairly sure things will be OK in the long run, even with failures along the way.

On the other hand, most static-typed systems handle failures explicitly.

I think C is a bad example here. We've come a long way since C. We know how to do better.

Unless you want to include the ecosystem as well. C + valgrind + PVS + clang-analyser do make it easier to create correct code.


Wait, but you can do that pattern in any language (Java or C++, both of which are static, and one of which is unmanaged).


You can't really do it in C++ in the same way: if you mess up, you will crash on an invalid memory reference. With Java, you'll catch a bad cast or a null-reference exception instead.


I think there is a spectrum here, from mission-critical systems (utilities, telcos, ATC, airline booking systems, embedded) through to scripting. Perhaps I shouldn't have used the term "mission-critical" here, rather "big, complex systems whose failure costs money". Python is used extensively in certain disciplines (1), such as data science and data engineering, and there are plenty of such systems that meet these criteria. Strong typing would wipe out an entire class of errors.

(1) Due to ecosystem richness, the need to collaborate cross-functionally, and other reasons.

EDIT: grammar gremlins.


"Strong typing" is a term so vague that it loses all usefulness in conversation. Wikipedia gives 5 different definitions.

https://en.wikipedia.org/wiki/Strong_and_weak_typing


Dynamic languages can be strongly typed, meaning a dynamic language can have better type safety than a static language.


The way to ensure minimal failure rates in code is a top notch test suite. Typed or untyped language doesn't matter.

Before test suites, I would have agreed with you.


Test suites are one leg of the stool. The others include memory safety, type safety and (for real-time applications) resource guarantees, as well as a formal development process that includes rigorous code inspections. Put everything together and you have "space shuttle computer" level investment required. As you give up legs, the stool becomes cheaper yet not as sturdy.


The problem with that is that the economics for many applications don't favor elaborate test suites. If you're building a SaaS product where you can detect a bug, fix it, and deploy in a matter of hours, and wherein developer velocity is among your most important metrics, an elaborate test suite is a liability (note that the sweet spot is not "no tests" but somewhere in the middle), as it makes changes slower. Dynamic languages work reasonably well for these categories of applications, although some static typing can help improve that iteration time.


Having done TDD for 10+ years, I've come to realize that the main value tests bring is not the error checking, nice as that is.

The biggest value to me is that you can easily do huge refactorings with a lot of confidence. That in turn means you can keep redesigning your code and frameworks long after they would ossify into legacy code no one dares change in a normal project.

That's not to say I disagree that much with your point.


If you gave someone $1000 for each bug they found, they would find tons of bugs. So in the end it's a matter of time/money.


Would you rather build safety-critical systems in Python (type-unsafe but memory-safe) or C++ (type-safe but memory-unsafe)?

While static typing does catch code errors, in my experience they are mostly errors that would be caught somewhere else in the testing process. Memory safety issues are rarer, but also have a penchant for showing up unexpectedly and for crashing the entire system in unrecoverable ways.

Obviously, the best would be to have both kinds of safety, but people program safety critical systems in C++ all the time, and deal with the consequences. At least you can catch a TypeError in Python and handle it.


I would rather build it in a type-safe and memory-safe language - but if that is not allowed I would go for C++ even though I have my issues with it.

Catching TypeError in python and handling it is not a substitute for static typing.


People are gonna dogpile you, but if you'd said this about JS they wouldn't have blinked.


Coz there's a difference between "dynamic type system" and "type system that's thrown together with bits of string and rubber bands":

https://www.destroyallsoftware.com/talks/wat


Python has its own 'wat' stuff, especially Python 2.

https://stackoverflow.com/questions/3270680/how-does-python-...

"CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address."


> objects of different types always compare unequal, and are ordered consistently but arbitrarily

Seems like a completely reasonable thing to do?


> When you order two incompatible types where neither is numeric, they are ordered by the alphabetical order of their typenames:

????????


That's entirely an implementation detail. The thing that matters for programmer experience is the specification of behavior: There's going to be an arbitrary order. You certainly can argue that you think it would be less surprising if trying this just failed, or that promising a stable order ties down implementations too much, but I don't think providing an order is that surprising.

None (as far as I remember) of the common "wat" examples about JS are interpreter implementation details, but specified behavior.

I'm mildly curious why CPython chooses to implement it this way, but if I had to guess, I'd assume it is to provide a stable order between objects of different classes with the same hash() value. The hash value is, I suspect, what the quoted answer misrepresents as "their address" (the address being, in CPython, the default fallback for objects that do not implement a hash() function), and it is a good candidate for establishing an arbitrary order.


Sorry, I'm not sure I understand at all the difference between this and the 'wat' stuff.


That's just GIGO. There's simply no good way to order "two incompatible types where neither is numeric".


There's always an option to just treat it as an error, especially given that 1+"2" already is.


None of those surprising behaviors exist in python 3.

I wouldn't say "100" < "2" really counts, since, alphabetically it makes sense (an equivalent would be "baa" < "c").


Yes, Python2 was called out explicitly. Reference equality is definitely 'wat' shit IMO, though it's really mainstream too.


Honestly, with JS Flow (https://flow.org) it's very manageable; I almost never have WAT sorts of issues.


I don't see any reason to use Flow over TS.


Maybe! But I had to implement some tools (chrome extensions) on Javascript with no prior JS experience, so was trying to stick to vanilla side.


Lack of strong typing is low on the very long list of problems with JS


That depends on the mission, wouldn't you say?


Care to elaborate? Which languages would you use for mission-critical systems? What specifically about Python is not conducive to such systems?

You realize that many mission-critical workloads run on Python with great success, right?


Why? It seems to be a well-proven language, just like PHP, ASP.NET, and other languages people like to hate on.

Or rather, how would you define mission critical?


Aside from static types, why?


One thing I dislike in Java is the string comparison.

If you write str1 == str2, it compiles but probably does not do what you want (object identity vs string comparison).

(edit: to clarify, this does not mean Java is not a good language overall, every language has its own set of shortcomings and as someone points out in the replies, this behavior is consistent with everything else in the language)


It’s been a while since I used Java, but what on earth does that string comparison do if not comparing the strings? Is it an address-only comparison?


Yes, and this is entirely consistent with the overall design of the language. The '==' operator does not have a special overload for String instances, which are reference types just like any other in the JVM. On reference types, '==' does a reference comparison, just as it should. If you want to do a logical comparison of instances, you use the equals() method, just as you would for anything else.

Frankly, if this is the most meaningful complaint you can come up with, it might be indicative that you haven’t learned enough about it to have an informed opinion.


> The '==' operator does not have a special overload for String instances

As a non-Java developer I have to ask why you think this is a reasonable design choice. It's a massive footgun for a casual polyglot Java developer. I can imagine just forgetting that I can't use == if I've just switched from being immersed in another language.


It's precisely the same behavior you would get with C, C++, or C# for references, with C# defaulting the '==' operator to short-circuit to ReferenceEquals() unless otherwise redefined.


> with C# defaulting the '==' operator

Which is surely the important bit. I can just use "==" in C#.


They have a magic overload for '+'. Consistency wasn't much of a priority.


It isn't at all consistent, because integer types compare correctly with == and do not have a .equals() method. And regardless of consistency, if it's a mistake people make frequently with the language and need to learn not to do the obvious thing, it's probably not an ideal bit of design.


In Java (and every other C-derived language that I know of) there's a hard and explicit distinction between reference and primitive types.

In fact, from a certain perspective the two flavors get treated identically: comparing two integers with '==' will be true if the contents of the variables are the same, and comparing two references will be true if the pointers they contain are the same value. It just so happens that for primitives, identity and logical equality are the same thing.


C, C++, Rust, Go, etc have no special distinction between reference types and primitive types but rather everything is a value and some values are pointers/references. In a language with operator overloading, I don’t see why you’d ever want to check string identity by default. Interestingly, Go doesn’t have operator overloading but it treats strings as a special case including doing a string comparison for ==. Not saying this is better or worse, although no one makes mistakes with string comparison as far as I’m aware, for whatever that is worth.


That special case isn't specifically for strings - you can also compare arrays, channels and structs that way (as long as the array/struct doesn't contain anything non-comparable). Unfortunately it's a rarer footgun though since it can panic when comparing interfaces :(


The distinction between String and int in Java is not explicit, certainly not to a beginner; maybe other classes are, because you have to create them with new, but it's not obvious that string literals result in a different kind of thing, with different rules, than integer literals.

Regardless, none of that changes the fact that it's a bad design decision, because people regularly get it wrong.


The flaw in Java design is that String is a reference type. Logically, it's a value, and should be treated as such (and not copying the underlying data should be an implementation detail).

The problem is that in Java, there are no user-defined value types, only primitives; and primitives can't have methods. So if it were a primitive, you'd have to write "String.length(s)" etc. Also, all other Java primitives are basically bit sequences that are interpreted in one way or another, but that wouldn't be the case for strings.

.NET/C#, though, doesn't get that excuse. It totally could have defined String as a struct with an internal char[] field, and then there would be no question of value/reference equality for its ==. But that would also mean that you couldn't use null for strings - and they didn't have nullable value types back then. I suspect that, plus the overall mindset carried over from Java, is what won the day.

Side note: there's no hard and explicit distinction between reference and primitive types in C itself, nor in C++. If Python is "everything is an object reference", and Java is "everything is an object reference except for primitives that are values", then C++ is "everything is a value, including object references". Thus, there's no ambiguity with == in C++ - it always compares values, it's just that sometimes those values are pointers.

I sometimes think that the implicitness of object references that is so common in OO languages, and that I believe was introduced by Smalltalk, is a mistake. It's interesting that Simula-67 didn't have it: although it was very Java-like otherwise, having only primitive values and references to objects (i.e. no objects as values, like C++ has), it distinguished the two consistently by using distinct equality operators (= vs ==), and even distinct assignments (:= vs :-).

Or, alternatively, treat everything as an object reference, but make == do implicit dereferencing as well, as Python and VB do; and provide a completely distinct operator for reference equality, such as "is". Python has a problem in that regard in that it has a default implementation of == for all classes that compares references, and so it ends up used as reference equality in practice sometimes. It would be better to have no default for == at all, just as there's no default for other comparison operators; this is what VB does.

It might also be best to stop talking about value and reference types altogether - what's actually important is the presence or absence of identity. Then everything can be an object reference, but not all object references can be compared for equality (and in practice, under the hood, the implementation can just skip the references and copy the data itself - or not, depending on mutability and size).


Yes, it's an object-identity comparison, not a logical one. Worse, you can sometimes get spurious true results due to optimization.
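
Python draws the same identity/equality distinction, just with == defaulting to a value comparison for strings; a quick sketch of the same "spurious true" effect, courtesy of CPython interning:

    a = "hello"
    b = "".join(["hel", "lo"])  # equal value, distinct object
    print(a == b)  # True: value comparison
    print(a is b)  # False: identity comparison
    c, d = "hi", "hi"
    print(c is d)  # True in CPython thanks to interning; don't rely on it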


I think in the long run Python, Ruby, JS, etc. are all bad ideas. You are solving high-level problems with these languages; correctness should be the main priority.

Realising that type checking is a good thing, then bolting it on, I feel is a crappy solution, when you should just use something designed better from the start.

I completely disagree that there is a "productivity toll". If you don't know how to speak a language, then obviously it seems difficult. I think you save time in the long run by not dealing with all the type errors.


The type systems of TypeScript, mypy and Julia are far better than the type systems of Java/C#/C++/Go. Null safety, type inference, saner generics, union types and structural typing come to mind. I think the reason is that these type systems have to catch up with the expressiveness of their host languages.


> The type systems of TypeScript [...] are far better than the type systems [of Java]

TypeScript's type system is far more powerful than Java's but I'm not sure that's a good thing. A lot of the concepts in TypeScript are necessary to paper over gratuitously dynamic APIs in JavaScript.

While Java certainly has warts in its type system (no generics over primitives, arrays), Java's type system is straightforward compared to TypeScript's. TypeScript stuffs a staggering amount of functionality into the type system. To name a few of the more esoteric features: literal types, const assertions, and the asserts modifier for type predicates.


I don't think literal types are esoteric at all; they are one of the most useful features of the type system, especially when combined with union and intersection types. That functionality alone is why I miss TS so much when I switch over to Java land.


>> “ I think you save time in the long run not dealing with all the type errors.”

Have you written and deployed production code in the languages you mention above? Did you encounter a substantial number of type errors?


There is no good definition of a type error. It certainly goes beyond TypeError(Exception).

Is a '.close()' method being called twice a type error? It is in a language that can express states in the type system.

Is SQL injection a type error? It is when you use refined types in your sql library interface.

The vast, vast majority of errors I run into are errors that, with effort and the right type system, I could turn into type errors.

The point being that asking "would those be type errors?" is a really big question that is, when answered simply, "yes".
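
As a crude illustration of the refined-types idea in mypy terms, with NewType standing in for a real refinement (the names and the escaping are toys, not something to use in production):

    from typing import NewType

    SafeSql = NewType("SafeSql", str)

    def escape(raw: str) -> SafeSql:
        return SafeSql(raw.replace("'", "''"))  # toy escaping

    def execute(query: SafeSql) -> None: ...

    execute(escape("O'Brien"))  # ok
    execute("O'Brien")          # mypy error: expected "SafeSql", got "str"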


There aren't any mainstream languages that have the ability to detect the ultra high-level errors you describe. Maybe typescript is getting better.

Pyflakes and unit tests will detect the great majority of potential errors. Mypy gets you closer to zero.


Rust can express state machines just fine, as well as refinement types (Python can do refinement types too), but yes, I agree that there's room for languages to build more ergonomic but advanced type systems.


There are very usable, well-supported languages out there right now that do have such type systems. Saying they're out of consideration because they're not mainstream is a circular argument.


>Saying they're out of consideration because they're not mainstream is a circular argument.

No, it's a pragmatic argument. There's no circularity in it.

If they're not mainstream, then they're neither "very usable" nor "well-supported" to the extent that a mainstream language would be.


Here's the circularity:

- There are no mainstream languages with powerful and convenient type systems

- But there are less mainstream ones

- But I won't use those

- There are no mainstream languages with powerful and convenient type systems


That's circularity alright, but not as in a "circular argument" though. It's a feedback loop.

That's however a totally legitimate engineering choice.

Engineers picking a language shouldn't bet on less mainstream, incomplete environments with less tooling and fewer libs, options, and devs, just so that they can raise those languages into the mainstream.

That might be a "tragedy of the commons" thing, but it's not an engineering obligation to be an early adopter.


They're not obligated to be early adopters, but then you hear comments like:

> There aren't any mainstream languages that have the ability to detect the ultra high-level errors you describe.

And it's like, well there are great languages that can do that, but if you restrict yourself to that tiny 'mainstream' subset, you'll never know it.

And moreover, I think engineers (or rather, companies) are way too conservative about this stuff: picking 'mainstream' tech can be a touch-and-go proposition at any time. Just because something is mainstream doesn't mean it's the right choice for your project.


close() is partially handled by any language with scoped resources. There are lots of them.

Injections are handled by any language with taint tracking, although that has become less popular; Perl had it. Anything with a typed ORM also has something similar (for example Esqueleto in Haskell).


Both of those errors can also be addressed by better code, e.g. you can use lambdas to make sure things get autoclosed if there's no with/using construct. And you can avoid SQL injection by using query string interpolation instead of string concatenation, or parameter bindings.


I don't really see your point. Types can prove the absence of those bugs. "Writing good code" is not... anything.


Neither of the examples given needs types if good interfaces to those functions are chosen. Since types are part of the interface, why wouldn't you just change the interface to not exhibit those bugs, rather than using types to prop up a bad one?


How would an interface guarantee that close is not called twice, or that you have escaped a string exactly once?


I have rarely needed to escape a SQL string since the days of PHP4; nowadays it is standard to use parameter binding such as "SELECT * FROM table WHERE row = ?" (it's also faster, since the query doesn't need to be recompiled every time). But if you really have a desire to escape SQL, you can write a string interpolation function that does it automatically, e.g. sql_format("SELECT * FROM table WHERE foo = %s", s). Indeed, JavaScript supports this via custom tagged template literals.
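
In Python, the binding style is what the stdlib drivers already do; for instance, with sqlite3:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (name TEXT)")
    conn.execute("INSERT INTO t VALUES (?)", ("O'Brien",))  # bound, never hand-escaped
    print(conn.execute("SELECT name FROM t WHERE name = ?", ("O'Brien",)).fetchall())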

As for closing resources, you can use a pattern such as using(open("file.txt"), (f) => { ... }) if your language doesn't already support such a construct.


Escaping multiple times would be unnecessary with a type based system, by definition.

Sure, no one should need to do that anymore, it's just an example. There are many other cases that are similar, however.

`using` is a language construct, not something that is part of an interface. It also does not prevent close from being called twice.


In my case: yes and yes. Wildly guessing, about 90% of the errors I find at runtime with dynamic languages would be caught by a sane type system at compile time. If I'm comparing against a completely mis-engineered statically typed language such as C++, the compilation overhead may be so bad that feedback on the code I'm writing right now can still be faster with the dynamically typed language. But if we're talking about interfacing with a few bits of unfamiliar and not-great dynamically typed code: well, I've literally spent several weeks on a project that would have taken me half a day with even bad static typing.


Pyflakes and a test or two will weed out trivial errors right away. Getting to zero errors takes more work, however, and that's where mypy has a place.


Unfortunately tests in dynamic languages often don't help. I've run into too many issues where dependencies were mocked with parameters the dev thought were expected. (Or were expected in previous version of some library) With time, it all falls apart without real interface checks.


Mypy can apply to tests as well. I don't have any noticeably buggy Python programs and have written more than I can count. All this tooling is a great help, but it is no substitute for a minimum of competence and attention.

It's when you want to add features to a codebase you don't understand that these features shine.


Hi, I'm the author! Happy to answer your questions here!


Hi, thanks for featuring my result library!

Regarding your criticism that `result.ok()` returns `None` if the value is not an `Ok` type: That's what `result.unwrap()` and `result.expect(msg)` are for :) (There's no unwrap_err and expect_err so far, but PRs are welcome.) The lib is strongly inspired by Rust (see https://doc.rust-lang.org/std/result/enum.Result.html), that's why `result.ok()` returns something option-ish.
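
For anyone following along, a rough sketch of that API (parse_int is a made-up example):

    from result import Ok, Err, Result

    def parse_int(s: str) -> Result[int, str]:
        try:
            return Ok(int(s))
        except ValueError:
            return Err(f"not an int: {s!r}")

    r = parse_int("42")
    print(r.ok())      # 42, or None if it were an Err
    print(r.unwrap())  # 42; raises on an Err instead of returning None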

I like your `isinstance` approach btw! Will have to think about it a bit more. Reminds me a bit of Typescript as well. Does mypy have type guards (https://www.typescriptlang.org/docs/handbook/advanced-types....)? Maybe that could be used to create helper functions that check for success/failure and which help mypy to derive the correct type.


> Does mypy have type guards?

Not yet, maybe in the future. It might be possible to emulate some use cases with Literal types. Tracking issue:

https://github.com/python/mypy/issues/5206
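
One case mypy can already narrow is the tagged-union pattern with Literal fields; a hedged sketch (class and field names invented):

    from dataclasses import dataclass
    from typing import Literal, Union  # Literal: 3.8+, or typing_extensions

    @dataclass
    class Success:
        tag: Literal["ok"]
        value: int

    @dataclass
    class Failure:
        tag: Literal["err"]
        message: str

    def handle(r: Union[Success, Failure]) -> None:
        if r.tag == "ok":
            print(r.value)    # mypy narrows r to Success here
        else:
            print(r.message)  # ...and to Failure here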


Oh, not sure how I missed these methods, thanks! I'll update the post to reflect it.


Thanks for writing it! I'm a Python programmer in my day job. I have sometimes remarked that the reason I play around with Haskell in my free time is that I don't want to see another runtime type error unless I'm being paid to do so. But I needed to do some spreadsheet-scraping and data-munging for a personal project and I figured it would be easier in Python. So I searched and found your article, and it was just what I needed.

Now to sneak it in at work...


I'm a fan of cool type systems and Haskell in particular too.

I also do lots of messing with data for personal stuff (e.g. https://beepb00p.xyz/mypkg.html#examples). But yep, at least for me, the reality is that if I need some quick yet useful code, Python happens to get me there very fast, just because of the sheer amount of code people have already written in Python.


I've written a similar Python library called `safetywrap` that implements both the Result and Option types from Rust, with almost all of the interfaces implemented, including unwrap, map, flat map, err_map, etc. It is fully typed and works well with mypy.

Feel free to give it a shot! https://pypi.org/project/safetywrap/


At first glance, this looks really solid! Maybe I can deprecate my result library now :D


I appreciate how in-depth this is, looking at all the possible ways of handling the issue. It makes the final result more interesting, knowing the journey it took to get there.


Has anyone evaluated libraries like Pydantic (https://pydantic-docs.helpmanual.io/) and Encode Typesystem (https://www.encode.io/typesystem/) versus mypy as a real-life thing?



These are completely different things: data validation libraries.

I think naming the latter "Encode Typesystem" is misleading (and your question is an indication that I am right).


Why is it misleading? I didn't understand. I just didn't want people mistaking "typesystem" for a word rather than a project name by Encode.

It's a very common name.


I maintain a library for doing exactly this in C# (https://github.com/mcintyre321/OneOf). I tried to replicate it in Python, as that's what I'm writing at the moment, but couldn't quite manage it. Can you have a multi-generic typed Union in Python?


Do you mean something like this?

  T1 = TypeVar("T1")
  T2 = TypeVar("T2")
  GenericUnion = Union[T1, T2]


Yes. I can't quite remember why I couldn't get it to work. I want to make something where this is valid, but couldn't manage it:

    def foo() -> OneOf[Bar, Baz]:
        if blah():
            return Bar()
        return Baz()

    def bar_to_blah(bar: Bar) -> Blah:
        ...

    def baz_to_blah(baz: Baz) -> Blah:
        ...

    one_of_bar_or_baz = foo()
    blah = one_of_bar_or_baz.match(bar_to_blah, baz_to_blah)
but the critical thing is to get a type error if you add another case to foo's return type, or change the generics to different types.
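
For what it's worth, one mypy idiom that gives exactly that "type error when a case is added" property is the NoReturn exhaustiveness trick; a sketch reusing the names above:

    from typing import NoReturn, Union

    class Bar: ...
    class Baz: ...

    def assert_never(x: NoReturn) -> NoReturn:
        raise AssertionError(f"unhandled case: {x!r}")

    def to_blah(v: Union[Bar, Baz]) -> str:
        if isinstance(v, Bar):
            return "bar"
        if isinstance(v, Baz):
            return "baz"
        assert_never(v)  # mypy errors here if the Union grows a new case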



