Erlang is a concurrency-oriented language, though its concurrency architecture (multicore/node/cluster/etc.) is different from the one modeled by GPUs (vectorized/SIMD/SIMT/etc.). Since share-nothing processes (the so-called Actor model) are at the heart of the Erlang Run-Time System (ERTS)/BEAM, it is easy to imagine a "group of Erlang processes" being mapped directly to a "group of threads in a warp on a GPU". Of course, since the Erlang scheduler is different (it is reduction-based rather than time-sliced), one would need to rethink some fundamental design decisions, but that should not be too far out of the way since the system as a whole is built for concurrency support. The other problem would be memory transfers between CPU and GPU (while still preserving immutability), but that is a more general one.
You can call out to CUDA/OpenCL/etc. from Erlang through its C interface (Kevin Smith did a presentation years ago), but I have seen no new research since then. However, there have been some new things in Elixir land, notably "Nx" (Numerical Elixir) and "GPotion" (a DSL for GPU programming in Elixir).
But note that none of the above is aimed at modifying the Erlang language/runtime concurrency model itself to map onto GPU models, which is what I would very much like to see.
The biggest issue, I think, is utilizing the GPU's massive memory bandwidth: you really need SIMD, or your GPU is just going to generate a lot of heat to do only a bit of work.
But SIMD has nothing to do with the language per se. In Erlang everything lives in a module, so I can imagine annotating a module with a SIMD/SIMT attribute, which would then be the hint for the ERTS to map all the processes in that module to a warp on a GPU, using SIMD vectorization as needed. Of course, my Erlang processes must be written to take advantage of this and thus cannot be general-purpose (i.e. MIMD) processes.
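To make the "cannot be MIMD" point concrete, here is a toy Python model of a warp running in lockstep. It is only a sketch under loud assumptions: it is not ERTS or GPU code, `warp_steps` and the opcode strings are made up for illustration, and real hardware handles divergence with masks and reconvergence rather than simple counting. It just shows why a group of processes that all follow the same code path is cheap, while processes that each do their own thing get serialized.

```python
WARP_SIZE = 32  # warp width on NVIDIA GPUs


def warp_steps(lane_programs):
    """Count instruction issues for one warp in a toy SIMT model.

    lane_programs holds one instruction sequence (list of opcode strings)
    per lane. At each program position the warp issues once per distinct
    opcode, so lanes that disagree are serialized (hypothetical model).
    """
    steps = 0
    longest = max(len(program) for program in lane_programs)
    for pc in range(longest):
        # Lanes whose program has already ended are simply idle at this pc.
        opcodes = {program[pc] for program in lane_programs if pc < len(program)}
        steps += len(opcodes)
    return steps


# Every "process" runs the same code: one issue per instruction.
uniform = [["load", "add", "store"] for _ in range(WARP_SIZE)]

# Processes take one of four branches in the middle step.
divergent = [["load", f"branch_{i % 4}", "store"] for i in range(WARP_SIZE)]

# Fully general-purpose (MIMD-style) processes: nothing in common.
mimd = [[f"op_{i}_{pc}" for pc in range(3)] for i in range(WARP_SIZE)]

print(warp_steps(uniform))    # 3
print(warp_steps(divergent))  # 6
print(warp_steps(mimd))       # 96
```

In this model the fully divergent case costs as much as running the 32 processes one after another, which is the "lot of heat for a bit of work" scenario.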