Wow, I didn't realize Chapel was still under development; I hadn't heard about it since 2011. Its contemporary rival was Fortress from Sun Microsystems (https://en.wikipedia.org/wiki/Fortress_(programming_language...), which does appear to be defunct. Both were languages intended for highly parallel computation on supercomputers.
Working in heterogeneous computing environments (and having first-class facilities for that), while still being easy enough to immediately start adapting numerical code from any reference implementation.
Roughly, the sets of computational problems that people used (use?) MPI for. Things like numerical solvers for sparse matrices that are so big that you need to split them across your entire cluster. These still require a lot of node-to-node communication, and on top of it, the pattern is dependent on each problem (so easy solutions like map-reduce are effectively out). See eg https://www.open-mpi.org/, and https://courses.csail.mit.edu/18.337/2005/book/Lecture_08-Do... for the prototypical use case.
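To make that communication pattern concrete, here's a toy sketch in plain Python (no MPI; the function name and chunking scheme are my own illustration) of one Jacobi sweep on a 1-D Poisson problem split into per-node chunks. The halo exchange at the chunk edges is exactly the node-to-node traffic that MPI handles in real domain-decomposition solvers:

```python
def jacobi_step(chunks):
    """One Jacobi sweep on a 1-D Poisson problem whose unknowns are
    split into contiguous per-node chunks (zero boundary values).

    Each chunk only needs the edge value of its left and right
    neighbor (the 'halo'); in a real solver those two lists below
    would be MPI sends/receives between neighboring ranks."""
    # Halo exchange: each chunk learns its neighbors' edge values.
    left_halo = [0.0] + [c[-1] for c in chunks[:-1]]
    right_halo = [c[0] for c in chunks[1:]] + [0.0]
    out = []
    for c, lh, rh in zip(chunks, left_halo, right_halo):
        ext = [lh] + list(c) + [rh]          # chunk extended with halos
        out.append([0.5 * (ext[i - 1] + ext[i + 1])
                    for i in range(1, len(c) + 1)])
    return out
```

Because the stencil only touches immediate neighbors, each node communicates with at most two others per sweep; irregular sparse matrices generalize this to problem-dependent communication patterns, which is the part map-reduce can't express.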
My answer would be that Chapel supports a partitioned global namespace such that a variable within the lexical scope of a given statement can be referenced whether it is local to that CPU's memory, stored on a remote compute node, or stored within a GPU's memory (say). The compiler and runtime implement the communication on the programmer's behalf and take steps to optimize away unnecessary communication. Other key features include first-class support for creating parallel tasks in high-level ways, including parallel loops.
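One minimal way to picture that partitioned global namespace, sketched in Python rather than Chapel (the class and method names here are invented for illustration, not any real API): a single logical array block-distributed over locales, where indexing works identically whether the element is local or remote, and the runtime tallies the remote accesses it would have to turn into communication:

```python
class ToyPGASArray:
    """Toy model of a partitioned global address space (NOT Chapel):
    one logical array whose storage is split across 'locales'.
    Indexing is uniform either way; a remote access is counted as the
    communication a PGAS compiler/runtime would issue (and try to
    optimize away)."""

    def __init__(self, n, num_locales, my_locale=0):
        self.n, self.p, self.me = n, num_locales, my_locale
        self.store = [0.0] * n   # stands in for all locales' memory
        self.comm_ops = 0        # remote gets/puts the runtime issued

    def owner(self, i):
        # Block distribution: contiguous chunk per locale.
        return i * self.p // self.n

    def __getitem__(self, i):
        if self.owner(i) != self.me:
            self.comm_ops += 1   # would be a network GET
        return self.store[i]

    def __setitem__(self, i, v):
        if self.owner(i) != self.me:
            self.comm_ops += 1   # would be a network PUT
        self.store[i] = v
```

With `A = ToyPGASArray(8, 2)`, writing `A[1]` touches locale 0's own memory while reading `A[6]` would cross the network; the program text for the two accesses is identical, which is the point.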
This is interesting. I've had good experiences, but admittedly haven't done problems with huge amounts of communication. Is there some fundamental issue? Lack of supporting libraries?
I wonder if the JIT model prevents predictable benchmarking/optimization, making long-term robustness difficult? It might also make using MPI difficult. But this is mostly speculation on my part.
Legion is great! I like that its developers really get HPC. But it is C++, with all of its baggage. (Update: there is also the Regent language on top of the C++ API, which looks interesting.)
I wonder if it would be possible to easily set it up on our university's HPC cluster (which is not a supercomputer, but still has SLURM and everything there).
Chapel's killer feature, in my opinion, is being able to take something that looks like a simple loop and turn it into distributed code (via domain maps). To the extent that you can write your program in terms of that feature, you can get very clean programs. But once you step outside of that feature, you're basically back to full-on parallel programming (i.e., explicit PGAS programming with nearly every parallel programming construct under the sun). Chapel offers a nice syntax, but it's semantically not so different from writing SHMEM or MPI or one of these other explicitly parallel programming models.
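As a rough picture of what a Block domain map does to a simple loop, here's a Python sketch (names are mine; real Chapel spells this with `dmapped` and `forall`): the global index space is split into contiguous per-locale blocks and the loop body runs chunk-parallel, with no distribution logic in the loop itself:

```python
from concurrent.futures import ThreadPoolExecutor

def block_chunks(n, num_locales):
    """Split global indices 0..n-1 into contiguous blocks, one per
    locale, the way a Block distribution would."""
    base, extra = divmod(n, num_locales)
    chunks, start = [], 0
    for loc in range(num_locales):
        size = base + (1 if loc < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def forall(n, body, num_locales=4):
    """Toy 'forall': run body(i) for every global index, one worker
    per locale's block (threads stand in for compute nodes)."""
    with ThreadPoolExecutor(max_workers=num_locales) as pool:
        list(pool.map(lambda c: [body(i) for i in c],
                      block_chunks(n, num_locales)))
```

The program text stays a plain loop over global indices; swapping the distribution (block, cyclic, ...) changes only the chunking, not the loop. The limitation described above is that once the loop body needs, say, irregular cross-chunk communication, this abstraction no longer carries you.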
Regent does also support loop auto-parallelization, though it's not the focus of the language and not generally how people write idiomatic Regent programs. Regent fundamentally is a task-based programming model. "Task" is a fancy word for a function that can run in parallel. The key is that (a) tasks execute with sequential semantics, and (b) inside of a task, you can do basically whatever you want. The compiler doesn't need to analyze the code aside from verifying that you're passing data around correctly. This means the set of programs you can write in the "nice" subset of the language is much, much larger. The vast majority of Regent programmers never encounter any explicitly parallel programming constructs, there is no way to make code that deadlocks or races, etc. On the other hand, organizing programs in terms of tasks does still take effort and a degree of cognitive shift. You still have to divide the program into parts that can be parallelized, even if you're not responsible for parallelizing them.
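The "sequential semantics" point can be sketched in a few lines of Python (a toy model, not Regent's actual runtime; all names here are invented): tasks declare which regions they touch, launches return immediately, but each task waits for every earlier launch that touched one of the same regions, so the observable result always matches running the tasks one after another in program order:

```python
from concurrent.futures import ThreadPoolExecutor

class ToyTaskRuntime:
    """Toy sketch of sequential-semantics tasking (NOT real Regent).

    launch() is asynchronous, but a task first waits on every earlier
    task that touched any of the same named regions, so conflicting
    tasks run in program order while independent ones run in
    parallel."""

    def __init__(self):
        self.pool = ThreadPoolExecutor(max_workers=4)
        self.last = {}  # region name -> future of last task using it

    def launch(self, fn, regions, *args):
        deps = [self.last[r] for r in regions if r in self.last]
        def run():
            for d in deps:           # wait on conflicting predecessors
                d.result()
            return fn(*args)
        fut = self.pool.submit(run)
        for r in regions:
            self.last[r] = fut
        return fut
```

For example, launching "increment region a" then "multiply region a by 10" always yields the same answer as the sequential program, even though the launches themselves are non-blocking; a real runtime additionally distinguishes read from write privileges so that read-only tasks on the same region can overlap.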
Thanks, this is helpful. It seems (based on your reply) that there are people successfully using Regent for scientific computing; do you think the language is a viable choice for industry, or are there particular milestones you're looking to reach?
Yes, we focus mostly on scientific computing. One of our major apps is S3D, which we ported to Regent from Legion C++ [1]. This has been the first time the domain scientists have been able to modify the code themselves; previously the C++ was too much for them. There are other examples of Regent apps elsewhere in this thread.
If by "industry" you mean in areas related to HPC, then Regent is likely to be applicable to what you want. The further you get away from HPC, the less likely that would be. You probably wouldn't use Regent to write a web server, though it's not impossible....
Right now my biggest item is making sure we support all the DOE supercomputers, which means adding support for Intel GPUs. (Regent currently supports NVIDIA and AMD.)
We definitely don't have an excess of supply in the "HLLs for vendor-neutral GPU programming" dept, especially if you have a healthy reaction to C++. Chapel and Futhark, what else?
Taichi solves a different problem. It's (mainly) a tensor compiler that targets kernel generation on different architectures, while Chapel is for orchestrating and communicating between computations over distributed memory spaces (and heterogeneous devices).