Regarding the need for evidence in programming languages research: maybe it's just my bias for theory, but I think of PL research as a branch of mathematics more than an experimental discipline. In mathematics, you don't cite evidence, you write proofs. Most of the papers in conferences like POPL deal with topics like domain theory, type theory, category theory, formal logic, semantics, models of computation, etc. Results in these fields are achieved on paper or in a proof assistant, not by taking a survey.
For whatever reason, papers about programming languages that attempt to use data to make arguments about productivity, safety, etc. always seem unconvincing to me. Maybe it's because they usually aren't reproducible, or maybe it's just because the way most programmers write code doesn't matter to me, because most programmers aren't aware of what is possible on the frontier.
Given your user name, that's a fitting perspective. I have a very different view: programming language design is applied psychology, not mathematics. Programming languages are the ultimate human-computer interface. It may be tempting to ignore the human side of that interface: after all, that side is messy, incomprehensible, impossible to write neat proofs about, and almost as hard to publish tidy, convincing papers about. Without the human side of programming languages, however, PL design is a fairly pointless endeavor — without the humans, we wouldn't need programming languages in the first place.
I fear that, for these rather tempting reasons, academic programming language research has fallen into the trap of treating PL design as a form of mathematics, ignoring the human side of the field, and as a result has contributed far less to the practice of programming in the past several decades than one might naïvely expect and hope. Academic PL has, for example, almost completely disregarded the wild success and popularity of dynamic programming languages. Instead, prominent PL researchers glibly dismiss dynamic languages as "unityped", ignoring that there is, in fact, a rich life of types in every language if only you shift your notion of types to match what the users of those languages actually mean by the term. Shouldn't more academics be interested in how and why this approach is so successful, to see what can be learned from that success, rather than just dismissing it out of hand because it doesn't fit neatly into the theory?
We have seen a recent cross-pollination of ideas between dynamic and static programming languages resulting in the likes of TypeScript, Go and Julia—and these hybrids have been wildly successful. But none of this is coming from academic PL circles. It's coming from the people who view languages as tools and want tools that fit the hands of their human users, rather than designing beautiful but alien tools that don't fit human hands. Even heavily static languages like Rust that have become very successful have done so largely by going against the grain of static language research.
So yes, academic PL does seem to have largely become a subfield of mathematics, but to the extent it has, it has also become as detached from actual programming as modern literary theory is from the practice of writing books that people want to read.
> I have a very different view: programming language design is applied psychology, not mathematics.
It's both, and probably more.
As you note, programming languages are the ultimate human-computer interface. They have to serve two masters - the computer's unforgiving hardware, and the inevitably fallible humans attempting to harness the computer's power. Each master requires different types of proof/evidence of the language's suitability to the task.
> Maybe it's because they usually aren't reproducible, or maybe it's just because the way most programmers write code doesn't matter to me, because most programmers aren't aware of what is possible on the frontier.
I have to say that sounds remarkably arrogant.
I think a significant portion of programmers are very smart people who work hard at producing good, generalizable, understandable effective code. PL designers would do well to pay attention to them (not that I know whether or not other PL designers have the parent's attitude. I hope they don't).
> I think a significant portion of programmers are very smart people who work hard at producing good, generalizable, understandable effective code.
To be clear, I don't disagree with this, but it's not relevant. You wouldn't expect a typical mechanical engineer to have research-level mastery of theoretical physics, even if they can build impressive, reliable machines. It's not an insult to the engineer.
> You wouldn't expect a typical mechanical engineer to have research-level mastery of theoretical physics, even if they can build impressive, reliable machines.
I'm trying to figure out the world you're imagining here. I can see several possibilities.
A) The programming language research you are doing will eventually yield some programming practice so advanced that the programming happening now, before this change, will turn out to be irrelevant. Thus you are paying no attention to what's happening.
B) The programming language research you are doing will never intersect with the world of the ordinary programmer. You will prove interesting theorems about mathematical objects that happen to be programming languages, working on a track forever parallel to what ordinary programmers are doing.
C) Like a theoretical physicist, you're producing insights about physical reality at a much lower level than the average engineer. If your insights yield an advance in understanding, you won't be the one to turn it into a practical tool. That would be the work of the many layers of applied-science practitioners that sit between the physicist and the engineer.
Choice C seems at least logical. But I'd claim that programming language designers considering things this way is not plausible. The world of programming abstractions just doesn't have enough layers for you to get anything like a pure theoretical science without relation to ordinary human step-by-step problem solving. Moreover, we know those layers of applied scientists don't exist between the ordinary programmer and the language designer. If you want your stuff to be relevant, you'll need to sell it yourself, unlike the theoretical physicist.
I spent a few years alongside the most average developers you could find. They didn't care about their work outside their work hours, didn't care about anything other than Java and EJB and their hourly rate. I am okay with it; it's not like a less able programmer/gamer/cook/... would anger me. But these people do exist as a distinct group, and there are a lot of them.
> significant portion of programmers are very smart people who work hard at producing good, generalizable, understandable effective code
Yet this is very different from what GP is talking about: knowledge about the research frontier. Which is OK: programmer and researcher are two different jobs, and much as the scientist isn't expected to write good code (or write code at all), the programmer's job is not to be knowledgeable about the state of the art in PL theory.
>For whatever reason, papers ... that ... use data to make arguments about productivity, safety, etc. always seem unconvincing to me. ... maybe it's just because the way most programmers write code doesn't matter to me, because most programmers aren't aware of what is possible on the frontier.
As someone involved in PL research I partially agree, but I think this effect happens in both directions. There are plenty of great ideas in PL that haven't been applied to real problems whereas, simultaneously, there are plenty of huge problems for programmers where PL hasn't been applied. That is to say, we researchers often ignore the problems of the masses in favor of ones with more elegant (or more rigorous) solutions, because we enjoy writing fancy proofs and theory more than we enjoy pragmatism. I think both sides of the equation are very important here, but it would be great to see some more direct interaction between the two sides. As it stands, a conference like POPL contributes about as much to recreational math as it does to the state of the art in programming.
Taking a step back, mathematics just happens to be our most developed tool/perspective for understanding patterns. So, to the extent that programming languages are concerned with modeling certain patterns in programs or domains or computing devices, at one extreme they might end up looking like (applied) mathematics. I consider category theory to be an exercise in this spirit (and this approach is broadly quite valuable precisely because math is often our best modeling framework for phenomena!)
At the other extreme for detecting patterns is simplistic empirical study of the kind that you allude to. It might be difficult to reproduce or generalize from these datasets, but it's hard to do much better without a better framework to operate in.
In between these two extremes is one of the most interesting threads of PL research, where it is treated as a design problem. For this to be possible, the researcher uses a hunch based on their human intuition of a domain or a problem-solving technique, and then tries to reify it into a programming language to provide it as an affordance to users of that system. It is strongly influenced by the taste of the researcher and the reviewers, and often needs to be iterated on to explore an interesting space.
What the distribution between these three is in academic PL research, I have no idea. But I have seen many interesting demonstrations of the last kind, especially in conference talks at Strange Loop.
Unpopular opinion: the abstract mathematics side of PL research is progressing well enough. The more important thing is to improve tooling; what matters is that we should be able to design efficient and ergonomic programming languages with tooling in mind.
Maybe hardcore set theory people don't accept the progress in Zig or Rust as PL research, but it is very important as well.
That is an interesting take, and it seems common in PL research (even if unspoken).
I love the formalism and "hard" maths of PL design - but if the purpose of a PL is to bridge the chasm between how computers work (i.e. machine code) and how people work, then disregarding one side of this isn't right.
The belief I previously held was that this is OK because computers don't get smarter, but you can teach people about the difference between contravariant and covariant types, and then pretty soon they will be happy Scala programmers. Now I am not convinced: what people need is a simple language which fits in their heads, and a way of leaving comments. Yes, there will be bugs that don't get picked up at compile time (this happens in all languages), but the bugs that would be caught by a richer type system in another language tend to be the ones that are the easiest to fix.
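To make the covariant/contravariant distinction concrete, here's a minimal sketch in TypeScript rather than Scala (the Animal/Dog/Producer/Consumer names are invented for illustration, and strict function-type checking is assumed):

    // A toy illustration: Producer is covariant in T, Consumer is contravariant.
    class Animal { name = "animal"; }
    class Dog extends Animal { bark() { return "woof"; } }

    type Producer<T> = () => T;             // produces values of T
    type Consumer<T> = (value: T) => void;  // consumes values of T

    // Covariance: a producer of the specific type substitutes for a producer of the general one.
    const makeDog: Producer<Dog> = () => new Dog();
    const makeAnimal: Producer<Animal> = makeDog; // fine: every Dog is an Animal

    // Contravariance: a consumer of the general type substitutes for a consumer of the specific one.
    const feedAnimal: Consumer<Animal> = (a) => console.log(`feeding ${a.name}`);
    const feedDog: Consumer<Dog> = feedAnimal;    // fine: code that accepts any Animal accepts a Dog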
> papers about programming languages that attempt to use data to make arguments about productivity, safety, etc. always seem unconvincing to me
Productivity and safety are aspects of human interaction. No programming language is safe or productive in a vacuum. It is impossible to make claims about safety or productivity in a programming language without relating them to human use.
And here is the disconnect. While your claim is pedantically correct, the reality is that most practical programming activity is engineering, not mathematical science.
Like the architect, steel worker, joiner, bricklayer and so on, we use many materials and techniques that can be explained and proved with science, but we do so while being broadly ignorant of it all. And that's totally fine.
The problem is that if you disappear down the rabbit hole of academic self-indulgence, you may forget you were supposed to actually build something.
In answer to the post's lament that popular and well-regarded languages seem not to emerge from academic work, I'd like to point out the Julia programming language (released only in 2012, around the time of this post), whose core design formed Jeff Bezanson's PhD thesis [1]. Getting the dissertation done was also considered a high-priority issue [2] for the language development project :-)
2. My own area of research: Tree Notation (https://treenotation.org/). More generally: the idea of programming languages and programs as 2-D and 3-D structures.
I follow thousands of languages and those are the 2 domains that I think are most exciting (I'm obviously biased).
There's lots of incredibly valuable incremental work as well, but not much else I see that will cause a 10x+ jump.
Not yet (soon, just have other things to ship first).
Yes, small parts of it are available online powering a few different things. (Can’t provide links yet, sorry, but you may find some of the things serendipitously)
In my opinion, we need programming experience (PX) research. While programming languages don't have to be purely textual, they primarily are, and this is a big constraint on typical programming language design. I don't know if it's been quantified, but there is certainly a qualitative ceiling to what's possible with typical text-based syntax. That's why most languages are just permutations of what exists, and there are always limits.
We have such rich ways to express information and interactivity, and these are nearly completely ignored by the PL community. I see hybrid programming languages and environments, that are part visual and part textual, as the future in the same way that languages like Fortran and Lisp and then text editors and text-based IDEs were once the future.
A frustration I heard a lot from PL researchers is that popular languages today seem to have gotten that way /despite/ their intrinsic values rather than because of them.
My stab at explaining that is some kind of take on a 'programmer demographic' factor.
The PC revolution (and rise of the killer micros) of the 80's caused a big democratization wave.
Before, languages were typically created by academics and industry professionals.
The newer languages listed come from hobbyists in the PL field: people who make some language for fun or for an immediate need, perhaps without full knowledge of the field's histories and implementation techniques.
With the democratization wave and the explosion in the need for skilled programmers, the population of programmers during the 90's and beyond is significantly less academically educated than in the preceding decades. That means that, in large part, the views of the PL research community do not matter as much anymore. What matters more is a kind of darwinism of languages, where popularity within a niche of practitioners can be the springboard into broad popularity. It doesn't really matter that it gets lexical scoping wrong, or combines incompatible abstractions, or that the hacky implementation leaks into the language semantics. What matters more is that there are useful frameworks, editors and learning material online.
Yes, yes and yes. Last time I checked human cognition and collaboration seemed to attract very little interest. I suppose this somehow falls into a slot of "Human Productivity".
For instance, instinctively I find Clojure easier to work with than Scala. How is it different for different people, and why? How does the cognitive complexity scale as the program scales? Do some languages foster collaboration better? As the team scales, does the subjective complexity of a program scale better in some languages than in others?
Or something even more fundamental: why do some people write a lot of comments and others don't? Is there a difference in program comprehension with different indentation lengths, or different formatting for that matter?
We need more insights into the psychology of programming (languages, design, comprehension, etc.). If that's not possible we should at least do more rigorous philosophy of programming.
I think PL is progressing just fine. If we look at major advancements in the past decade: LLVM, WASM, async/await, Typescript, Golang, Rust
The more interesting question is "what's happened to IDEs?" -- they're basically unchanged over the same time period. I think they feel "unglamorous" relative to PL and thus attract less passion projects.
I want to see an IDE focused on hybrid code/data systems (say Postgres/Python). Keep your schema, your data, and your glue language in one spot, and edit it all together. Set up unit tests by specifying an initial "mocked" Postgres state by writing test data in columns, see it percolate through your code, etc.
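As a rough sketch of what that test style might look like, here is a toy version in TypeScript rather than Python; the table shapes and the `totalSpendByUser` glue function are invented for illustration, not taken from any existing tool:

    // "Schema + test data in columns": seed state written inline as plain rows.
    type UserRow = { id: number; name: string };
    type OrderRow = { userId: number; amountCents: number };

    const users: UserRow[] = [
      { id: 1, name: "ada" },
      { id: 2, name: "grace" },
    ];
    const orders: OrderRow[] = [
      { userId: 1, amountCents: 1200 },
      { userId: 1, amountCents: 800 },
    ];

    // The glue code under test: aggregate spend per user.
    function totalSpendByUser(us: UserRow[], os: OrderRow[]): Map<string, number> {
      const totals = new Map<string, number>();
      for (const u of us) {
        const spend = os
          .filter((o) => o.userId === u.id)
          .reduce((sum, o) => sum + o.amountCents, 0);
        totals.set(u.name, spend);
      }
      return totals;
    }

    // "See it percolate": assert on the derived state.
    console.assert(totalSpendByUser(users, orders).get("ada") === 2000);
    console.assert(totalSpendByUser(users, orders).get("grace") === 0);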
> major advancements in the past decade: LLVM, WASM, async/await, Typescript, Golang, Rust
Are these really major research achievements? What research contributions are there in WASM, async/await, Typescript, or Golang? I can see in LLVM and Rust though.
async/await came out of Microsoft Research; each year they build and test dozens of concepts. They struck gold with async/await, and within a couple of years that pattern got retrofitted into most major languages (JS, Python, Rust, C++, C#).
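As a minimal sketch of the pattern being described (in TypeScript; `fetchUser` is a made-up stand-in for any asynchronous call), async/await is essentially sugar over passing the rest of the computation as a callback:

    function fetchUser(id: number): Promise<{ id: number; name: string }> {
      return Promise.resolve({ id, name: "user-" + id });
    }

    // Without the sugar: the rest of the computation is passed to .then().
    function greetWithThen(id: number): Promise<string> {
      return fetchUser(id).then((user) => `hello, ${user.name}`);
    }

    // With async/await: the same continuation, written as straight-line code.
    async function greetWithAwait(id: number): Promise<string> {
      const user = await fetchUser(id);
      return `hello, ${user.name}`;
    }

    greetWithAwait(1).then(console.log); // prints "hello, user-1"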
I was wrong about Golang and Typescript, I did some reading and found most of their signature features in older languages.
WASM is some sort of political consortium-building triumph, but I think it has more potential than everything else on the list. The performance is shockingly good (see Google Blob Opera, Mozilla Pyodide, etc). Over the next decade everything will become a fast Electron app.
A genuine IDE for data science would be very useful, in my opinion. Something to unify code, data and models would be really interesting. An ideal tool would allow frictionless experimentation, but also maintain continual reproducibility.
I'd love the same. I often think that a good data science IDE will end up looking like RStudio with mlflow abilities. I barely use R at all anymore (or IDEs really), but RStudio is still the best attempt at a data science IDE that I've come across.
It gets a lot of things right (for me): knitr/Sweave integration for literate programming, dedicated panes for plots and printing `help` documentation, and not having to mix code and output like notebooks do.
Something that I wonder about is the intersection (if any) of programming languages and natural languages.
It could be that, as in natural languages, the final arbiter of what is used is the masses, and all the academic discussion in the world can't change it much. As I write this it occurs to me that an artist (Shakespeare) and the King James Bible had a significant effect on consolidating what became modern English.
If computer languages follow natural languages then a continuous evolution of existing languages will occur over time whether we like it or not. I think we can see some evidence for that in the oldest computer languages that we have. It would also follow that some languages disappear over time ...
In natural language development I have read that it takes about 1000 years for a language to become unintelligible if separated from its mother tongue. What is the equivalent in computer languages? You can't compile/interpret the program? A human can't understand the code?
There might be some room for cross discipline research here.
English faculty and Comp. Sci on the same papers?
(I'd love to be fly on the wall at those meetings) :)
I recently gained an appreciation for language researchers after reading "Confessions of a Used Programming Language Salesman: Getting the Masses Hooked on Haskell" by Erik Meijer. His mention of the continuation monad, and how he decided to bring functional ideas to popular languages, made me understand monads a bit and appreciate that he put async/await in C#/Dart.
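A very rough sketch of the continuation idea, in TypeScript with names of my own choosing rather than Meijer's formulation: a computation is represented as "something that, given the rest of the program, runs it", and chaining such computations is the shape that async/await later turned into straight-line code.

    // A computation that eventually hands an A to its continuation k.
    type Cont<A> = (k: (value: A) => void) => void;

    // return/unit: wrap a plain value so it just hands itself to the continuation.
    const unit = <A>(value: A): Cont<A> => (k) => k(value);

    // bind: run the first computation, feed its result to f, run the result.
    const bind = <A, B>(m: Cont<A>, f: (a: A) => Cont<B>): Cont<B> =>
      (k) => m((a) => f(a)(k));

    // An "async" step expressed as a continuation.
    const delayed = (ms: number, value: number): Cont<number> =>
      (k) => { setTimeout(() => k(value), ms); };

    // Chained with bind, then run by supplying the final continuation.
    bind(delayed(10, 20), (n) => unit(n + 1))((result) => console.log(result)); // 21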
For sure there is a need for programming language research. Language designers benefit greatly from the work done by academia, for example in the area of type systems. But, OTOH, the scientific community is not very tuned in to the everyday problems programmers have. That is why some of the popular languages are basically hobby projects spawned from some lone programmer's personal itch.
I think we should have the best of both worlds; people enjoying the mathematical rigor should do PL research. And hobbyist language designers should pursue whatever ideas they might have. But both camps should acknowledge that the other side is equally important.
I think that there are other directions to pursue. (Some overlap, sorry)
1. Bidirectional Compilers -- code --> ast --> code, at a bare minimum. All the way to executable and back, if possible. Except for Macros... it should be fairly easy to ingest any valid program in one language, and output the exact equivalent in another (however clunky). This allows refactoring while staying in a given language with far more flexibility than any IDE trying to guess.
2. Time Travel Debugging - if you have something go awry, you should be able to scroll back and forth across its execution until you find where it deviates from expectations.
3. Using traditional flow-based execution alongside declarative code (I hope I got that term right). You should be able to do traditional a := b; (assignment) statements... alongside a different type of form: the equation c :== d; in which ANY change to d is immediately and ALWAYS propagated to c, regardless of what's going on in the flow of control. Verilog and other languages for programming FPGAs already work this way, describing hardware in which every part of the circuit executes all the time. (See the sketch after this list.)
4. Imperative, Declarative, Functional, Object Oriented, Concatenative programming all in one box... with support for infix, prefix, and postfix functions all at the same time. (After all, they are just transformations of each other)
5. Hard and soft/no type systems that let the user slide between the ends of the scale. If you're just prototyping something, who cares what the types are; as it gets used, start to lock it down, as the user permits.
6. Rich text source... like ColorForth, where you can just tell the computer that X is a string literal, variable, function, or whatever attributes you care to assign to it, and the user's preferences then indicate to them what it is. Last resort would be to spit it all out in text.
7. Notebooks - The idea of using notebooks like Jupyter instead of IDE or a REPL allows prototyping and eases the discovery of solutions by lowering the cost of trying an idea out, and making the stack of ideas required to implement an idea smaller.
8. Parallel processing, en masse: as we move to 50+ core processors and programmable hardware, the idea of a single control flow running the show becomes absurd. Even low-end embedded devices are starting to have FPGA peripherals.
9. Least privilege / Multilevel Secure Systems - Eventually, the admins of the world will realize the folly of using Unix-like systems where everything is based on the user's level of access, and migrate to systems which allow no access by default. In those environments, code needs to handle the IO it is given, instead of just driving everything. This is the next step in the evolution from interactive command line apps --> windows apps that deal with events --> apps that only get given IO.
10. We can make things better, but we have to stop making them worse first... peak usability happened with HyperCard, VB6, Delphi, and it's gone downhill since.
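As a toy sketch of the ":==" always-propagated binding from point 3 (in TypeScript; the `Cell`/`derive` names and API are invented for illustration, not a real library):

    // A reactive cell re-derives its value whenever a source changes,
    // unlike a one-shot := assignment.
    class Cell<T> {
      private listeners: Array<() => void> = [];
      constructor(private value: T) {}
      get(): T { return this.value; }
      set(v: T): void { this.value = v; this.listeners.forEach((fn) => fn()); }
      onChange(fn: () => void): void { this.listeners.push(fn); }
    }

    // c :== f(d): any change to the source is immediately reflected in the output.
    function derive<S, T>(source: Cell<S>, f: (s: S) => T): Cell<T> {
      const out = new Cell(f(source.get()));
      source.onChange(() => out.set(f(source.get())));
      return out;
    }

    const d = new Cell(2);
    const c = derive(d, (x) => x * 10); // roughly: c :== d * 10
    console.log(c.get()); // 20
    d.set(5);
    console.log(c.get()); // 50, propagated automatically with no explicit re-assignment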
Programming languages are about engineering, not mathematics. Engineering is the discipline of how to help people produce things that help them produce things. (Study of) Mathematics has no real purpose except to support engineering.
We have two approaches for a programming language:
1. We design something "perfect" and "pure", that will never need to change.
2. We design something simple that works for something and let other people improve the language.
PHP or JS were terrible designs, but they are not so bad today. Thousands of people have improved the design over time.
BTW, languages like Lisp actually evolved over time too. S-expressions were not intended to be definitive, and you were not supposed to interpret code in code itself. McCarthy was just interested in the mathematical proof of creating a Lisp machine different from a Turing one. The practical side of that was of no interest to him, and other people added this functionality.
In fact, one of the problems with Lisp is that it is too flexible. In C-like languages you have 3 or 4 different loop constructs: for, while, do. In Common Lisp you have more than 20 official looping constructs, and you can create your own infinite set with macros!!
And that is with loops alone. It applies to everything in the language. You have so many ways of creating objects or parsing.
That is amazing and incredible for academic research, but not so good for production.
For production you need standards that are simplified: if you design a machine, you choose 3 different screws instead of 30. Your prices go down, you need fewer tools, you don't need "change of tool" interruptions, and everybody can supply you.
Most of the time, if you design your own screw you are shooting yourself in the foot. You will have to wait much longer for it to be manufactured, at 100x the price.
With software something similar happens. Geeks love learning new paradigms, but while you are in "production mode" you should be producing, not learning a custom syntax some programmer has created for things that could be done with a standard one.
Everybody has an ego. The Dunning–Kruger effect tells us that the less we know about computer languages, the more confident we are that we can change the world with a brilliant design (that is not so brilliant in the first place).
In fact, languages like C or PHP or JS are actually brilliant in lots of ways. If they were not, nobody would have used them. You should learn what this brilliance is before you create something new.
> BTW, languages like Lisp actually evolved over time too. S-expressions were not intended to be definitive, and you were not supposed to interpret code in code itself. McCarthy was just interested in the mathematical proof of creating a Lisp machine different from a Turing one. The practical side of that was of no interest to him, and other people added this functionality.
Actually, McCarthy was working on a new research domain called 'Artificial Intelligence'. For that, he and his team needed a programming language. They did early experiments with list processing added to Fortran. Eventually they developed their own programming language. It was designed and implemented over a short time frame of a few years, for an IBM computer they had.
McCarthy and his team were very much interested in the practical side of how to implement Lisp, and they learned a lot while doing so. During the first design and implementation phase 'it became clear that this combination of ideas made an elegant mathematical system as well as a practical programming language' (McCarthy, History of Lisp, 1979).
The new programming language was thought to be used to implement software like the proposed 'Advice Taker' (McCarthy, 1959).
> For production you need standards that are simplified
An early quasi-standard for Lisp was the Lisp 1.5 language (its description was published as a book). Then actual work started on a new language standard, supposed to be Lisp 2. That work eventually failed. In the 80s/90s Common Lisp was defined as a standard. Further standards of Lisp languages were IEEE Scheme and ISLisp (an ISO standard).
> For production you need standards that are simplified
See the extensive standards of Ada, C++ and Java EE, ...