> This first slide is from a research paper where the researcher wrote his own language and make both a statically typed and dynamically typed version then got a bunch of people to solve programming problems in it. The results were that the people using the dynamic version of the language got stuff done much quicker.

Does this first plot control for notions of quality and extensibility of the different solutions? A faster-to-develop but sloppier solution in a dynamic language which requires more painful investment to refactor for future use cases should not necessarily be viewed as better. If you are only saving short term time at the expense of much more long-term time, then whether it is a net win for you depends on your discount function.

> What was most interesting was that he tracked how much time was spent debugging type errors. In other words errors that the statically typed language would have caught. What he found was it took less time to find those errors than it did to write the type safe code in the first place.

For which developers, and with what level of experience with static typing? This was true for me 3 months after I started learning Haskell. Now I have > 8 years of Python experience and less than 2 years of experience with Haskell, and the type system demonstrably speeds me up. It is way, way faster to use Haskell's type system up front than to rely on Python's type system and tracebacks to debug type errors later. (I still like and use Python a lot -- just sayin.)

> The guy giving the talk, Robert Smallshire, did his own research where he scanned github, 1.7 million repos, 3.6 million issue to get some data. What he found was that there were very few type error based issues for dynamic languages.

> So for example take python. Out of 670,000 issues only 3 percent were type errors (errors a static typed language would have caught)

This strikes me as one of the most problematic parts of the post. To me this just seems to be evidence that in Python, at least, TypeError is more common when you are using something interactively, and you can resolve the issue for yourself (because it generally directly means you are using it wrong, and it's not the library's fault).

This also resonates with my experience with Pandas on GitHub. Early on there was a lot of TypeError stuff with index-related issues, but once the bulk of that work became mature, index errors were then a signal of a novice user who needed to change the user code, and not at all an indication of a library problem worthy of opening a GitHub issue.

It seems totally reasonable to me to hypothesize that the types of problems worthy of becoming GitHub issues are not usually TypeError. But TypeError might still be a huge proportion of all of the errors encountered out in the wild.

Further, there are also some selection effects here for users who actually post things to GitHub. When I worked in quant finance, and everything was in Python, it was an hourly occurrence for hugely important parts of the system to hit type errors, and they were all incredibly painful to fix in the legacy code. This was just accepted as a way of life, and because the investment staff weren't incentivized to care much about code, they usually just hacked their own workarounds, and would never have dreamt of actually opening a GitHub issue about type errors (that would be way too slow of a dev cycle for them, which is why the state of the code was so poor in the first place!)

> His point there is that all that static boilerplate you write to make a statically typed language happy, all of it is only catching 2% of your bugs.

This is absolutely false and not a valid generalization of the presented data. For one, a major claim of static typing proponents is that by writing with static typing, it eliminates bugs from ever being introduced, and allows you to use a compiler workflow to verifiably remove entire classes of bugs. When you run some bit of Python and it does not produce a TypeError -- that doesn't mean the code is free of errors. It might just mean you got lucky that the data or the user selections or whatever didn't happen to hit the TypeError corner case. With a static language, you know that certain classes of errors are not even possible -- not just that they didn't happen to occur this one time, but that they cannot occur. This is very different.
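To make the "got lucky" point concrete, here is a minimal Python sketch. The function `describe_order` and its arguments are hypothetical, invented purely for illustration. The bug sits on a branch that most calls never reach, so a clean run says nothing about whether it is there.

```python
def describe_order(quantity, discount_code=None):
    """Hypothetical example: the bug lives only on the rarely used branch."""
    if discount_code is None:
        # Common path: works on every call that omits a discount code.
        return "Order of " + str(quantity) + " items"
    # Rare path: concatenating str and int raises TypeError at run time.
    return "Order of " + quantity + " items with code " + discount_code


# The common path succeeds, so a quick manual run looks fine.
print(describe_order(3))

# Only the rare path exposes the bug.
try:
    describe_order(3, "SPRING")
except TypeError as exc:
    print("TypeError:", exc)
```

In a statically typed language with `quantity` declared as an integer, the second `return` would be rejected at compile time, whether or not any caller ever passes a discount code.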

Further, another claim of static typing proponents is that the design process of code with static types also leads to fewer bugs, because the mandate for static types forces you to clarify befuddled design ideas before the program will work. The benefit of this is murkier, for sure, but it's still something that can't be addressed by this particular data.

> Some other study compared reliability across languages and found no significant differences. In other words neither static nor dynamic languages did better at reliability.

It's interesting to me that that chart doesn't include any functional languages. Let's try it again with a pure functional language and see, and then also compare, say, Clojure with Haskell. If it keeps on robustly bearing out the same trend, then I might start to question my current beliefs on defect rates in dynamic, imperative languages.

> Part of that was reflected in size of code. Dynamic languages need less code.

This again is relative to the ability of a developer and also relative to different types of tasks. However, it's not really fair to compare languages like C, where brevity of syntax was not too big of a language design priority, with a language like Python, where brevity of syntax is sometimes militant (just try talking with "Pythonistas" on Stack Overflow about why one-liner-ness is really not that useful). And also, at least part of the result is fixed for you: static typing at the very least requires the extra type annotations -- although here again you could try against something like Haskell where you have very powerful type inference. I would be extremely surprised if, for equivalently experienced developers, Haskell programs were not consistently shorter than Python programs.

> He points out for example when he’s in python he misses the auto completion and yet he’s still more productive in python than C#

Try Jedi in emacs (or whatever the equivalent must be in vim). Although, I for one hate IDEs (get off my lawn) and I also hate autocompletion and editor utilities that jump to function or class definitions. I've never noticed a significant speed up from these, except possibly when I am merely reading code from a large codebase that is brand new to me. But I have often experienced huge slowdowns from the features getting in my way.

> Another point he made is that writing static types is often gross and unmaintainable whereas writing unit tests not.

See Haskell. Also, writing unit tests can be a nightmare in OO and imperative settings, where you need some inscrutable cascade of mocked architecture to be able to test things. This is where something like Haskell's QuickCheck can make life a lot easier. I'm sure you could cook up something like that in Python too. But I strongly believe that writing unit tests in Python is way uglier and more frustrating than writing type annotations in Haskell.
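For a rough idea of what "something like that in Python" could look like, here is a hand-rolled sketch of the property-testing idea using only the standard library. It is not QuickCheck and has none of its shrinking or generator machinery; `check_property`, `random_int_list`, and both properties are hypothetical names made up for this example.

```python
import random


def check_property(prop, gen, trials=200, seed=0):
    """Run prop on random inputs; return the first counterexample, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def random_int_list(rng):
    """Generate a list of 0 to 10 small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]


def reverse_twice_is_identity(xs):
    # True for every list, so no counterexample should be found.
    return list(reversed(list(reversed(xs)))) == xs


def sorted_keeps_first_element(xs):
    # Deliberately false property: sorting usually moves the first element.
    return not xs or sorted(xs)[0] == xs[0]


print(check_property(reverse_twice_is_identity, random_int_list))
print(check_property(sorted_keeps_first_element, random_int_list))
```

The first call finds no counterexample; the second should report a list whose first element is not its minimum. The point is that the test author states a property once and the machine hunts for the inputs, instead of a human enumerating cases by hand.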




continued ...

> Static types are also anti-modular. You have some library that exports say a Person (name, age ..). Any code that uses that data needs to see the definition for Person. They’re now tightly coupled. I’m probably not explaining this point well. Watch the video around 48:20.

This seems just wrong to me. You can declare structs as static in C and provide public helper functions that internally create data types, apply other static functions to them, and then produce results from them. In Haskell, it's very common to avoid exporting value constructors for data types, and to instead provide helper functions that allow for the implementations to remain hidden from anyone using the module. Modularity really has nothing at all to do with the dynamic vs. static typing debate.

I'll also throw one more downside of dynamic typing into the ring -- you sometimes will see really poor attempts to use so-called "defensive programming." In Python this is an especially bad code smell -- you'll see a huge block of assert statements right at the top of a function definition, in which all kinds of type properties and invariants of the arguments are asserted, so that a type violation is reported immediately.

For one, in a dynamic typing setting, it's probably better if that stuff is the burden of the caller rather than the callee. In the spirit of a function "doing one thing and doing it well," it shouldn't also have to carry around all of its own type and invariant assertions. Notice that in a static language, though, this isn't a problem and even is a huge benefit because it doesn't require the huge, human-error-laden block of asserts to achieve it. Just a nice, simple static typing annotation and then the compiler will deal with it.
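A small sketch of the contrast, with hypothetical function names. The first version is the assert-heavy style described above. The second states the same expectations as annotations; note that Python itself does not enforce annotations at run time, so this assumes an external checker such as mypy is run over the code.

```python
from typing import Sequence


def mean_defensive(values, weights):
    """Hypothetical 'defensive' style: the callee re-checks its arguments by hand."""
    assert isinstance(values, (list, tuple)), "values must be a list or tuple"
    assert isinstance(weights, (list, tuple)), "weights must be a list or tuple"
    assert len(values) == len(weights), "length mismatch"
    assert all(isinstance(v, (int, float)) for v in values), "values must be numeric"
    assert all(isinstance(w, (int, float)) for w in weights), "weights must be numeric"
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mean_annotated(values: Sequence[float], weights: Sequence[float]) -> float:
    """Same computation; the annotations state the argument types once."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(mean_defensive([1, 2], [1, 1]))
print(mean_annotated([1, 2], [1, 1]))
```

The annotations cover the type checks but not the length invariant, which would still need a run-time check or a richer type in either kind of language.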

Related to this, and as a final point, we should also give more "severity" to dynamic typing exceptions that occur at run time due to type errors. For example, in the financial job I mentioned before, it would be commonplace for an analyst to submit a very large batch processing job to the internal job manager. Some of these jobs took > 48 hours to compute and the output would mutate databases and so on.

So when someone set it running on Friday evening and expected there to be results in a database on Monday, imagine how awful it was to see that a TypeError had occurred and that not only did your manually created assertions fail to capture it, but also, there was no way of proving it couldn't happen without just running your code -- so you burnt maybe 30 hours of computational effort just to be told that upon hitting a certain point in the code, here's a TypeError.

This kind of error, which is categorically eliminated from possibility in a well-written static language program, should count for way, way more than a simple and stupid "oh I tried to call the API function with a list instead of a tuple, whoops my bad, let me just arrow-up in IPython and do it again" TypeError (though it's not clear to me that any of the data referenced in the post would make this distinction or penalize these types of errors more).


> You can declare structs as static in C and provide public helper functions that internally create data types, apply other static functions to them, and then produce results from them.

You can do that, with training and careful effort. But it was a design flaw that you have to do it manually, and that it isn't mandatory and trivial for even beginners to do. At the time C was "designed," this wasn't necessarily known to be important. We have no excuse today. But languages which do this wrong by default are still popular.

> In Haskell, it's very common to avoid exporting value constructors for data types, and to instead provide helper functions that allow for the implementations to remain hidden from anyone using the module.

In general, if calls require knowledge of type information at the call site, and the type needs to change for any reason (which becomes more likely as type annotation reaches further into program semantics) then all the call sites will need to be updated, or there will be an error. In any published library, this means backward compatibility is completely broken and everyone else's code needs to change.

This is a misdesign in C and in a number of "statically typed" languages which crib from it.

> you'll see a huge block of assert statements right at the top of a function definition,

I almost never see this. The only time I see it is when a dogmatic true believer in the ideology of static typing writes Python. People can do stupid things in any language.

> this isn't a problem and even is a huge benefit because it doesn't require the huge, human-error-laden block of asserts to achieve it.

Humans are still required to provide type information, which means they can still make errors. Even better, correcting these errors often affects the interface at call sites, which means the fix has to break backward compatibility.

> so you burnt maybe 30 hours of computational effort just to be told that upon hitting a certain point in the code, here's a TypeError.

You were not reasoning correctly about your code. Proper testing should have been your safety net, but you weren't testing properly. If you are even vaguely trained and you are even vaguely trying, writing code which emits TypeError in production takes some doing.

The number of shops which never have problems in production is vanishingly small in ANY language.

It sounds to me like you got started in Python, and are identifying beginner's mistakes with the language itself.


> In general, if calls require knowledge of type information at the call site, and the type needs to change for any reason (which becomes more likely as type annotation reaches further into program semantics) then all the call sites will need to be updated, or there will be an error. In any published library, this means backward compatibility is completely broken and everyone else's code needs to change.

Notice I said you avoid exporting the value constructors. You're still free to export or not export the data type itself as you wish, allowing users to reference the type in type annotations while still not letting them ever construct their own value of the type except through helper functions.

This achieves even better modularity, because then in the implementation file, you can change what happens with the value constructors however you want, and you can service backward compatibility to your heart's content without ever requiring the users of the data type to even be aware that anything is changing.

Maybe you are referring to something else, but I am referring to data type and value constructors in Haskell. The data type itself is a distinct semantic construct in Haskell from the constructors of values of that data type, and they can have different privacy properties.

> I almost never see this.

Well, I've seen it over and over in production critical code in three different organizations ... so our anecdotes disagree.

> Humans are still required to provide type information, which means they can still make errors. Even better, correcting these errors often affects the interface at call sites, which means the fix has to break backward compatibility.

It depends on the language. In Haskell, for example, you could just make a sum type (a tagged union) with one constructor that accepts the old-style interface and one for the new, corrected version. It's very easy to do, still has the upsides of type checking, and doesn't break backward compatibility.
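Here is a loose Python analog of that union idea, since Python is the other language under discussion. This is not Haskell, and every name in it (`LegacyPoint`, `Point`, `AnyPoint`, `norm_squared`) is hypothetical. The static guarantee again assumes an external type checker; the `isinstance` dispatch stands in for Haskell pattern matching.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LegacyPoint:
    """Hypothetical old-style argument: coordinates packed in a tuple."""
    coords: tuple


@dataclass(frozen=True)
class Point:
    """Hypothetical new-style argument: named fields."""
    x: float
    y: float


# The union accepts either form, so existing callers keep working.
AnyPoint = Union[LegacyPoint, Point]


def norm_squared(p: AnyPoint) -> float:
    """Dispatch on the variant, much as a Haskell function would pattern match."""
    if isinstance(p, LegacyPoint):
        x, y = p.coords
        return x * x + y * y
    return p.x * p.x + p.y * p.y


print(norm_squared(LegacyPoint((3.0, 4.0))))
print(norm_squared(Point(3.0, 4.0)))
```

Old call sites that build a `LegacyPoint` are untouched, while new code can move to `Point` at its own pace.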

> You were not reasoning correctly about your code. Proper testing should have been your safety net,

Except you missed the relevant test case, whereas a tool like QuickCheck would have had a better shot at discovering a corner case that humans couldn't have anticipated.

> It sounds to me like you got started in Python, and are identifying beginner's mistakes with the language itself.

I'm not sure what you're referring to. The code I was working with was written by a mix of many Python developers. Some were core committers to the Python language itself; some were data analysts who didn't want to be programming.

I can say that I haven't had significant front-end experience in Python. But I've worked in most other major areas, particularly very low-level NumPy code, LLVM stuff with both Numba and llvmlite, pandas, Excel tools, and many different database technologies and ORMs.

I will say, though, that in the projects where we switched from pure Python over to statically typed Cython, it cleared up tons and tons of our issues, many of them almost overnight.

Rather than me finding beginner mistakes in Python, it seems to me like you worked on one single system that suffered a lot of issues with backward compatibility, and you're generalizing that backward compatibility experience to other areas where you're less familiar (like solving the same backward compatibility stuff in Haskell).



