I love J, APL and the array languages in general. I am always looking for exercises to practice my J skills. I found this article by Tyler Limkemann called "Modeling the COVID-19 Outbreak with J"[1].
I also follow tangentstorm, and took some of his j-talks GUI code[2] and mashed it up with Tyler's as a J learning exercise. I just put this up on my GitHub.[3]
I highly recommend tangentstorm's YouTube videos too![4]
The concise J code, and the way it makes me approach a problem, brings me joy! Since all of my J coding projects fit on one page, I can revisit them later and get right back in without needing a pile of comments or documents to understand my own code. I like refining the code over time, for fun and learning, even after it works, or after somebody much smarter than me shows me a different way to look at it.
It seems like array languages are really picking up steam on HN lately; it's a positive development. Every developer should learn at least one array language. They sound absolutely crazy before you start, but then you write some code and realize there's a lot of wisdom in the coding style. It's not something you can appreciate from afar, though.
The same logic for why array languages are good also applies to why linear algebra is worth learning, especially when you're in a calculation-heavy field. A lot of work can be abstracted away in a robust way that makes repetition unnecessary, while you can still reason about your equations/code.
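As a tiny illustration (a throwaway example of my own, standard J), whole-array operations make the looping disappear:

       2 3 4 + 10 20 30    NB. elementwise addition; no loop, no index bookkeeping
    12 23 34
       10 * 1 2 3          NB. scalars extend across whole arrays automatically
    10 20 30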
I've done a little J in the past and have been playing with APL a lot lately.
It is definitely a different way to think. My programs are taking a long time to write as I train my brain to think in this domain, but the code is very concise and generally not hard to follow afterwards. Everything is so interactive it's a joy.
In particular, I enjoy how the poster broke the code down, as I can only understand some of the C code. It's cool to think I could create my own little J interpreter this quickly.
Since this flurry of APL/J/K/A+/Q articles started, I've been wondering about the relationship between APLs and literate programming.
They say that APL was developed to mimic mathematical notation, and I see the resemblance. But real works of mathematics consist of one or two lines of expressions separated by sentences and paragraphs explaining the goal of those expressions.
So far in K "advertising" I only see the expressions and not the explanations. The explanation is there, but it's not a comment in the codebase; it's in a blog post that explains it. The code has many lines of expressions on top of each other, and nobody writes math that way.
I feel like I'm missing something, basically. It's clear how to write K, but what are the best practices for making a large codebase readable? I shouldn't have to invent that myself, right?
> APL was developed to mimic mathematical notation
APL was developed to replace mathematical notation. Making it a computer language was almost an afterthought. Indeed, there have been some trivial mathematical proofs in j[1][2][3]. Perhaps those give you a better idea of the connection?
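Not one of the linked proofs, but for a quick taste of math as executable notation, here's a throwaway check of my own that the sum of the first n odd numbers is n squared:

       n =. 4
       (+/ 1 + 2 * i. n) = *: n    NB. 1+3+5+7 equals 4^2
    1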
Assuming you write your code for a certain domain, I think most of your colleagues could follow along after learning a little k.
I wrote a toy simulator for the domain I work in using something like 6 lines of APL code and sent it to someone good with APL. They replied pretty quickly with some tips on how to improve it, and could generally get the gist of what I was trying to do while knowing nothing of the problem domain. It was a really interesting experience. I think that with zero doc, a colleague of mine would immediately understand what I was doing after going through a short APL tutorial. It isn't magic, but it's not nearly as cryptic as people make it out to be.
Edit: people do comment their APL code. I think a million-LOC project would be pretty difficult in APL, but the beauty is that you should be able to do some pretty powerful work in a few pages, I would imagine.
I think you're right. Aaron Hsu's work in APL is inspiring and awesome. The source code to his data parallel compiler hosted on the GPU is 17 lines of APL! Read the abstract to his paper here:
> The source code to his data parallel compiler hosted on the GPU is 17 lines of APL!
The co-dfns source I've seen[1] is much longer. Do you know if there's a write-up somewhere of the differences between that and Appendix B from the dissertation?
The 17 lines don't include the parser or the code generator, which most people would count as "part of a compiler" in a practical sense. They are usually the most mechanical parts of a compiler though, so there's relatively little to be excited about in them.
I think it's important to distinguish between something that is a core piece and all the other things that make the system usable. For example, once you start adding error handling and good error reporting, the complexity goes up by an order of magnitude. And in many cases the approach for the core does not necessarily scale out to those other contexts.
The right tool for the job. If you are building a huge website with input forms, videos, data collection, and ML algorithms, then no, you wouldn't do the whole thing in APL or J even if you could. Python is big in ML because packages were developed for working with data in array-language ways. Pandas by Wes McKinney is one example; he studied J or q, and even tweeted: "IMHO J is the best APL for data analysis because of its broad suite of hash table-based functions."
I like APL and J as a scratchpad where arrays are the basic unit and not scalars. J is functional and it turned me on to that world before I touched Haskell or F#.
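A couple of throwaway lines from my scratchpad, just to show the arrays-first flavor:

       *: i. 5          NB. square each of 0 1 2 3 4
    0 1 4 9 16
       +/\ 1 2 3 4      NB. running sum over the whole array
    1 3 6 10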
Aaron Hsu has a lot of great videos that speak to a lot of the usability and scaling out you mention:
I am able to grasp concepts, or own them, after coding them in APL or J, even if the code isn't as fast; for example, how well APL applies to Convolutional Neural Networks [1,2]. I really understood the mechanics of CNNs after working through this paper, much better than from the books I had read on ANNs in general since the late 80s/early 90s. By contrast, I have coded ANNs in C and Python, and I get lost in the PL, not the concept, if that makes sense. Anyway, I am a polyglot, and I find that people criticize J/APL/k etc. after a brief look without really trying to learn the language. I learned assembler and BASIC between 1978 and 1982, and I felt the same way when I first looked at opcodes.
Bahaha. It's a small world fellow HN user. As soon as ACM opened their digital library, I started looking for interesting APL papers and found that one and thought it was beautifully done. My takeaway is that you can make purpose-built AI in APL with very little code versus calling out to a large library like Tensorflow and having no idea what's going on.
I think someone has translated this to J, but I am trying on my own to practice my J-fu by implementing it in my own way. Then I usually open it up to the J experts on the mailing list, and my learning takes off. There are some awesomely smart people there who are generous with their time.
Yes, the takeaway is that with APL or J you can see the mechanics in a paragraph of code, and that is not a trivial example. If the libraries or verbs are created to deal with some of the speed or efficiency issues, it is promising as a way of understanding the concept better.
The dataframes of R and Python (Pandas) were always a thing in APL/J/k/q; arrays are their lingua franca, the basic unit of computation upon which the languages were built - not a library.
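For example (a toy of my own), J's key adverb /. gives you a groupby-style aggregation with no library at all:

       1 1 2 +//. 10 20 30     NB. sum the values, grouped by the keys 1 1 2
    30 30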
More importantly, almost along the lines of "the emperor has no clothes," there is a tack to get away from the black-box, minimal-domain-knowledge ML or DL that cannot be explained easily - see the newly proposed "Algorithmic Accountability Act" in the US legislature. Differentiable programming and AD (automatic differentiation), applied with domain knowledge, can create a more easily explainable model and help avoid biases that may creep into a model and affect health care and criminal-justice systems in a negative way [1][2].
And then there are those who use DL/ANNs for everything, even things that are easily solved using standard optimization techniques. A forest-for-the-trees kind of phenomenon. I have been guilty of getting swept up in them too. I started programming ANNs in the late 80s to teach myself about this new, cool-sounding thing called "neural networks" back then ;)
Yes, J does attempt to be a concise notation, like mathematical notation, for working with concepts in a manageable way. Like mathematical notation, it takes effort to learn the meanings of the symbols; however, once they are learned, you require less boilerplate to explain the meaning of those arranged symbols, unless you are teaching somebody how to code in J.
Because the code is concise and includes all the important information, you can view most useful programs in one page, so it doesn't take much to work through the logic again if you have not touched it in a while. I put comments inline for attribution to a source, or a quick mnemonic to unravel some tacit J code.
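Something like this (my own style of mnemonic, not a convention):

       mean =: +/ % #     NB. fork: (sum) divided-by (count)
       mean 1 2 3 4
    2.5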
I loved J the first time I saw it back in 2011/2012. I had played with APL in the 80s. I have played with many languages (asm, Basic, C, Haskell, Joy, Forth, k, Lisp, Pascal, Ada, SPARK 2014, Python, Julia, F#, Erlang, R, etc.), and each paradigm shift has taught me to approach problems from many different angles. I use the language that suits my current need at hand. Frink is on my desktop at work for all of my engineering, unit conversion, small input program stuff. R/RStudio is there for my statistics work. Julia is replacing MATLAB for me. I wrote Blender 3D scripts in Python in the early 2000s to make 3D wood carvings from 2D photos.
J is always open on my desktop, and is more than a desktop calculator. It is my scratchpad for mathematical and whimsical ideas or exploration. See Cliff Reiter's "Fractals, Visualization & J" for fun [1], or Norman J. Thomson's "J - The Natural Language for Analytic Computing" [2]. I just bought Thomson's book three months ago for about $35. There's now a crazy $925 posting on Amazon! Somebody's creating sales from HN!
"Mr. Babbage's Secret: The Tale of a Cypher and Apl" was also a fun book. It's not really an APL or programming book!
While I really like array programming languages in general, I think the Nile language, and its application to rasterization, strikes a really elegant balance of readability and conciseness. Highly concise, yet very readable.
Not a large K program, but a company I used to work for maintains TorQ, a large, open-source q framework (https://github.com/AquaQAnalytics/TorQ) used in many large investment banks and hedge funds.
Funnily enough, the framework is actually a more expansive version of a tick system developed by Kx (the company that makes kdb) https://github.com/KxSystems/kdb-tick
The Kx one is incredibly concise. When I first started working with it, it took me a while to figure out what was going on.
The larger K/q programs get, the more they tend to look like "normal" code, but you still see a lot of these clever one-liners hidden away in there.
> The larger K/q programs get, the more they tend to look like "normal" code, but you still see a lot of these clever one-liners hidden away in there.
Yep, that's what I suspected, and TorQ confirms it: giving everything a single-letter name will no longer do :-)
I'm a bit surprised at the number of comments that explain what the next line does. I'm not sure what to think of that, but it reminds me of beginner tutorials explaining code for people who can't yet read it confidently; for obvious reasons, not a popular style among more conventional languages.
Am I weird because I have no interest in terseness that you have to tease apart? It feels counter-productive. Programming-golf (code golf) contests and obfuscated C contests just seem like a waste of time to me. The whole point of programming languages is to allow humans to produce machine code using a notation closer to the human language that we are all already fluent in.
J is terse in the way regex syntax is terse. Writing out "Kleene star" doesn't make the idea clearer, so we use the "*" notation. J is a notation for specifying programs. Notations allow high conceptual density with very high specificity. That's why mathematicians use notation for communicating ideas among themselves: natural language would be longer and no more precise. J is a tool. It's likely not a good replacement for Cucumber, and it's probably not a good idea to cowboy J into an operational Python codebase at a typical workplace. Rivet guns for riveting, hammers for nailing.
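To illustrate the density (a standard J idiom, nothing exotic):

       /:~ 3 1 2        NB. sort ascending: ~ turns dyadic /: into y /: y
    1 2 3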
I think the idea is that you learn it, and then you reap the benefits. Not much different from, e.g., maths. You could figure out how to rotate a point about the origin with some haphazard application of multiplication, addition, sine, and cosine. Or you can learn you some linear algebra and then, tersely, multiply your vector with a matrix. When you learn this, you also learn the vocabulary to talk about it... and then you have human language that you are fluent in. Maybe someone else isn't, but that's OK; you don't need to hold back, and we, for sure, shouldn't hold back the entire industry just because some programmers are unwilling to graduate from goto and for loops. (I think any field would be seriously held back if we couldn't make up new concepts and extend our vocabulary.)
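Here's that rotation in J, as a quick sketch of my own (mp and rot are names I made up; 1 o. is sine, 2 o. is cosine):

       mp =: +/ . *    NB. matrix product via the dot-product conjunction
       rot =: 3 : '2 2 $ (2 o. y), (- 1 o. y), (1 o. y), 2 o. y'    NB. 2x2 rotation matrix for angle y
       (rot 1r2p1) mp 1 0    NB. rotate the point (1,0) by pi/2
    6.12323e_17 1

That last result is (0,1) up to floating-point noise, and "rot mp" reads aloud pretty much the way you'd say it.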
And like the symbols in math, symbols in K (and presumably other APLs) have names and a programmer who learned the language can just read a line of code out loud; it sounds quite like natural language, but probably comes closer to describing the entire solution than a corresponding read-out of the 50-line chunk of C that performs all these little steps and manipulations of temporary variables and individual array members to arrive at the same result.
For similar reasons, modern programmers tend to prefer stronger abstractions like folds, maps, or function composition to achieve some terseness (and eliminate unneeded variables and manual iteration). These generally bring the solution closer to what its description in a natural language would be.
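For instance, in J a fold and a composition make a sum of squares read almost like its name (sumsq is just a name I picked):

       sumsq =: +/ @: *:    NB. 'sum after square': the fold +/ composed with the square verb *:
       sumsq 1 2 3
    14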
After that, it gets a bit more controversial. How much whitespace do you need? How long are your identifiers going to be? How many assignments will you nest in a single expression?
My experience is that it gets easier to work with terse code if you get into it, and once you're into it, it actually does save you time (in reading and writing) while more verbose style becomes irritating to work with. It's like reading an article that gets sidetracked and says a lot but doesn't ever seem to get to the point, and once it's finally over, you realize you didn't get the point because it was buried in fluff.
(Fwiw, my personal style isn't quite Whitney level, but it's way more terse than what we have at work, and honestly the verbosity of work-code feels counter-productive to me.)
And so, aside from my opinion about how it is to work with terse code, there's another point: making the trees smaller makes it easier to see the forest for the trees. And if you really need to study the trees (because one of them is wrong and you have a bug?), well, you can still do it (because you learned how to work with terse code).
In other words, the point is actually to improve readability! It's the exact opposite of deliberate obfuscation, even if the end result might seem similar to the untrained eye.
Down this thread, ben509 writes:
> what you see with most languages is extensive commentary to help future authors understand what's going on, and that's partly because terse languages tend to be cryptic.
I'm not sure I agree. What I see with modern language developments is that they're trying to empower the programmer to make the forest smaller by eliminating the trees or making them smaller where it's feasible. We're making things higher level and more terse (but the APL family is way ahead of any mainstream language). It's the low level languages that make code seem cryptic, because you get lost in the low level details.
I still work with C day to day and the actual comments in the code bases I'm involved in tend to reflect this: C forces you to deal with lots of details (trees), and it is painfully easy to see them but not see what's actually going on at a higher level (the forest). So people write comments to explain what's going on.
That's not the point of programming languages. The point of programming languages is to be more productive, it has nothing to do with natural language. In fact, natural languages are a poor fit for programming languages (see Cobol).
Counterpoint: SQL is incredibly successful and its syntax uses natural language structures quite heavily.
The problem Cobol has is that it uses some natural structures, but then becomes very verbose because it only uses a few, and they often can't be combined, so it is very repetitive.
It's not natural to say,
I will buy eggs at the grocery store.
And I will buy ham at the grocery store.
And I will buy cheese at the grocery store.
> The point of programming languages is to be more productive...
But you should be productive not only in initially writing the code, but also in reading it later when you do maintenance.
And what you see with most languages is extensive commentary to help future authors understand what's going on, and that's partly because terse languages tend to be cryptic.
In my experience, SQL is successful despite its syntax, not because of it. Wasn't it originally intended to be an end-user interface? It's failed at that. Now even programmers choose to use an ORM to insulate themselves from it. Even the programmers I know who don't use OO still use an abstraction library to avoid the need to touch SQL directly.
SQL isn't even one language. It's a family of incompatible dialects. I've never written more than the most trivial SQL that would even be valid (much less the same result) on any other SQL dialect. Even though it has ISO and ANSI standards, every database in the world requires proprietary rules and conventions and extensions. How do you quote identifiers? Are strings case sensitive? Is '' the same as NULL? How do you create an index? What data types are there? We can't even agree on the most basic aspects of syntax.
It's "incredibly successful" in the same way that early HTML and JS was. People wanted access to the underlying platform so badly they'll put up with a nutty design and gratuitous incompatibilities. They don't really have a choice, and many are going out of their way to build their own alternatives because the vendors won't.
I get anonymously downvoted to hell whenever I post this, but SQL is an absolutely awful language. Fundamental aspects like consistent representation of the same semantics in different places are missing in the very core of the language. The ubiquity of SQL generators is a sign of how little most people want to work with it, and a major reason we haven't jumped ship.
I wasn't advocating programming in natural language. Much too ambiguous. I'm saying "closer" to natural language so our cognitive load is reduced reading and writing in the language so we can focus on what matters - to be more productive as you point out.
There was an attempt to do so, so that professional programmers would be unnecessary. The result was COBOL.
I don't want "close to natural language" because it's too verbose. It's like reading an article in The New Yorker - it takes so long to get anything said that you forget what the point was.
Of course, "too terse" isn't the answer either. There's a sweet spot. I suspect that the sweet spot varies, depending on the person and the kind of code.
I'm not sure why people are misunderstanding what I am saying. I'm not advocating COBOL-like languages. I'm saying assembler is better than writing in machine code, C is better than assembler, and, it would appear, for most programmers Python and JavaScript are the best choice. But nobody who wants to keep their job writes code in the style of minified JavaScript or obfuscated C.
To some of us, APL and J stand in the same relation to conventional languages as C does to assembler. They are not obfuscated; they are concise: brief but comprehensive, and easy to understand if you take the time to learn them like you would other PLs. I think Ken Iverson's "Notation as a Tool of Thought" covers this pretty well [1 PDF].
Thank you, that's a fascinating article and I'm going to finish reading it, but I'm not arguing against languages that are designed to be terse and expressive of a certain mode of thought, rather against languages like JavaScript that are not designed to be terse and are deliberately made so at the cost of readability. I just don't find it interesting. It does not follow that the resulting machine code is faster or smaller, so I don't see the point beyond playing a game with the language.
Long ago I was doing some analysis on a Sigma 5/CP-V machine. One of the few high-level languages available was Xerox APL. For those without an APL keyboard, they provided ASCII replacements (I don't remember what they were, but they looked like $NDX, $ASG). After learning APL with the awkward mnemonics, I was never able to read it with the special characters! J does standard-character APL much better.
Why don't these languages support LaTeX-style symbols the way Julia does? Even in the REPL: type \alpha<TAB> and you get α. Even Σᵢ is possible. This makes symbols accessible without a special keyboard, and the code is quite close to mathematical notation.
LaTeX style is way too verbose; I would not want to type like that. And I think the input method should be a problem for your editor/OS to handle, not the language.
In any case, symbols were abandoned back in the day when everyone was still using 8-bit encodings. [1][2]
Nowadays you have, e.g., Dyalog APL, which does use APL symbols, and you have short ASCII-based sequences (easier to type than LaTeX) for inputting them:
> For the first few months, the special APL characters and the ASCII spelling co-existed in the system. It was Ken who first suggested that I should kill off the special APL characters. I myself resisted for a few weeks longer, until the situation became too confusing, for reasons described in J for the APL Programmer.
> J uses the 7-bit ASCII alphabet. It also makes non-essential use of the box-drawing characters in the 8-bit ASCII alphabet for display. Using ASCII avoids the many problems associated with using APL symbols. It allows J to be used on a variety of machines without special hardware or software, and permits easy communication between J and other systems.
Absolutely yes for J and for k, but I wouldn't say it's necessary to use one or the other "a lot" just to get an intuitive grasp, readability-wise.
J is probably the best language in the world to craft native GUIs (something k could never do very well, and now doesn't do at all).
It also has a wonderful number of books on it, most of them Creative Commons-licensed now (as all or most of Iverson's books are, I believe). It's significantly easier to learn for someone who is new to array languages yet doesn't have the sort of hands-on training you can get with APL & k; you could go into the woods with nothing but the J interpreter tarball for a week and come out having pretty much internalized the language, and it's even easier with some of the other books available.
I was kinda thinking the exact same thing with regards to J.
Btw Scott, I've been meaning to ask you for a while about what tools you're using for your work, but I don't see an email address anywhere on your blog.
It seems like you've tried Lush, Clojure, Lua (Torch 7), J, R, and a bunch of other technologies. I've had the chance to try some of these for hobby purposes, but nothing for work yet, so I would trust your evaluation more than mine. Have you given up on array languages altogether?
Lush is worth people's attention for its design, but I didn't feel like getting involved with language maintenance: my strengths are elsewhere. J also solves all the problems Lush did and has a larger user community.
Clojure was cool, but JVM is basically worthless to me.
Same concepts for the most part. I think K is a bit easier to understand and if you're lucky enough to work with a kdb+ system, I'd focus on k & q to be honest.
I know I can escape into the k interpreter with a \ (backslash).
I don't know how to find the documentation for the k version that is shipped with kdb+ (free non-commercial download). I only find broken links (it seems they removed the k documentation from the kx.com website?).
The creator of kdb+, the k language, and the q-sql DSL sold his share of Kx Systems (the company that makes kdb+) to start a new venture called "Shakti," with a new version of his "k" language. Kdb+ is still around and developing, so I'd say they're forked at this point.
It's probably like k's find (`?`), which gets the indices of the found elements, or else the length of the list (or null, depending on the dialect) if not found.
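J's dyadic i. (index-of) behaves the same way; a quick session of my own for comparison:

       3 5 7 i. 5     NB. index of the first match
    1
       3 5 7 i. 9     NB. not found: returns the length of the list
    3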
I think I would also hate J, APL, Prolog, Haskell, Forth, or Lisp in University.
The problem is that those languages require a complete shift from the common imperative/OO style to something more declarative or, in Lisp's case, just a bit different.
In a typical university course, there isn't enough time to focus on learning those things, and you stay so shallow in the material that the language seems useless. It seems like half of the Prolog subreddit is about homework assignments on things like list reversal. If you can already do that in Python/Java with a single method call, Prolog just seems like a really bizarre and inefficient way to do it. But once you take another step and see how it can figure out how to solve Sudoku without explicit instructions... that is cool.
It's really the same thing in electrical engineering when the professor tries to teach microprocessors and assembly at the same time, and it all seems like a waste when C is much easier. The assembly approach probably teaches better, but I need more than just a few weeks to synthesize that information, especially when I'm getting slammed by other hard classes like differential equations at the same time. So perhaps you do hate J; then again, maybe you just need to learn it on your own schedule, and not to answer arbitrary test questions about something you can already do in Java.
I don't know how it is in J, but having fielded Lisp questions from students learning Lisp (usually 1/3 or 1/2 of a semester class on "Programming Languages" or "Functional Programming"), the assignments tend towards terrible questions, to the point where I don't think I could come up with questions that were better suited to making someone hate the language.
People come out of a class that used Common Lisp thinking that Common Lisp lacks any looping construct (it has several). It looks like the professor took a SICP-based class (which uses Scheme, not Common Lisp) as an undergrad, never looked at any language in the Lisp family again, and then 20 years later made up a syllabus from the half-remembered information, using Common Lisp because they remembered SICP had something to do with Lisp, and taught it without actually trying to do any of the exercises.
About halfway through reading that, I was thinking it would be fun to try to write this in Rust, and then I got to the bottom and saw he had already done it. Much easier to read in Rust.
The C version is 41 lines, and 1749 bytes. The Rust version is 295 lines, and 8289 bytes.
The point of the Incunabulum is not really what it does (it's a REPL with a handful of half-implemented J verbs and no error checking), but the style it demonstrates. I think that rewriting a similar program in another language without even attempting to reproduce the style is rather missing the point.
If you're interested in Rust, why not figure out what a semantically-compressed style looks like for it?
And the style (or the missing comments) is really what makes it hard to read.
I am fine with higher-order functions (I do that in Haskell all the time) and letters for variables, but please, please give your functions meaningful names. Or add a comment that explains what they do.
Code is not only for the person writing it (who knows perfectly well what the letters mean, at least for a short time after writing it). It's also for others who have to read it some day.
How much of that is due to the C preprocessor though? In theory, you could run the same preprocessor for Rust code (if you figure out some sensible macro shortcuts).
It's a completely apples-to-oranges comparison, with the Rust version having some error handling, support for longer identifiers, tests, relatively idiomatic style with line breaks and indentation where you expect them, and variable and type names longer than one or two characters, etcetera.
You could compress it massively before even reaching for the preprocessor. In fact, a few of the preprocessor macros just make the C version longer (when measured in lines) than it would be without them, if lines were allowed to be as long as in the Rust version (<70 chars vs 100 chars). The printf macro (used only three times) actually makes the C code longer both in bytes and in lines (or equal in lines if you retain the 70-column limit).
There's really only one macro (DO) that expands enough to save lines, and just barely. The shorthand for return saves quite a few bytes (but not so many lines), given that it's used everywhere.
This topic in particular tends to get flooded with a few shallow clichés as soon as it arises. This is to be avoided, because they have the effect of turning a thread into the same generic argument as the last N times it came up, which is tedious.
Better to respond to the specifics of a topic. Alternatively, if you want to learn, ask curious and neutral questions. But please don't pick one of the low-hanging bombs and toss it; the results are predictably boring. Not picking on you personally—we all have this reflex, and it's actually fine in in-person conversation, but bad for internet forums.
J[1], ngn/k[2], oK[3], kona[4], and GNU APL[5] are the big OSS interpreters. The main closed-source ones are Dyalog APL, Kx k, and Shakti k (which are still free for non-commercial use).
I would argue that, being stylistically inconsistent with the way most APL is implemented, that's actually not such a great resource. I found 'Whitney C' surprisingly approachable with just a little patience; despite no prior experience with it, I was able to make an APL-ish interpreter in 7kb and a few hours. If you're curious about implementation details, that might be a more worthwhile exercise. Be sure to check out the parsing section[1] of the J dictionary.
[1] https://datakinds.github.io/2020/03/15/modeling-the-coronavi...
[2] https://github.com/tangentstorm/j-talks
[3] https://github.com/rpherman/JLang/tree/master/COVID-19
[4] https://www.youtube.com/user/tangentstorm