
We've built up layers and layers of inefficiencies in the entire OS and software stack since the gigahertz wars took us from 66 MHz to multiple GHz in the 90s.

The software industry is awful at conserving code and approaches; every five years or so there's a total redo of programming languages and frameworks. Even more often for JavaScript.

That churn also means optimization from the hardware all the way up to program execution doesn't happen. Instead we plow through layer upon layer of conceptual abstraction and actual software execution barriers.

Also, why the hell aren't standard libraries more ... standardized? I get that lots of languages differ in mechanics and syntax, but a standard library set could be optimized behind the interface again and again, tuned at the hardware/software boundary, etc.

Why have Ruby, Python, JavaScript, C#, Java, Rust, C++, etc. etc. not evolved toward an efficient common underpinning and design? Linux, Windows, Android, and iOS need to converge on this too. It would mean less wasted space in memory, less wasted complexity in the OS, and less wasted complexity and size in apps. I guess ARM/Intel/AMD would also need to get in the game to optimize down to the chip level.

Maybe that's what he means by "DSLs", but to me DSLs are an order of magnitude more complex in infrastructure and coordination if we're talking about dedicated hardware for dedicated processing tasks while still keeping general-purpose capability. DSLs just seem to constrain too much freedom.



Correct me if I'm wrong, but isn't this exactly the problem LLVM was designed to tackle?


If you're targeting non-CPU designs--such as GPUs, FPGAs, systolic arrays, TPUs, etc.--it is very much the case that you have to write your original source code differently to be able to get good speedups on those accelerators. It has long been known in the HPC community that "performance portability" is an unattainable goal, not that it stops marketing departments from trying to claim that they've achieved it.

LLVM/Clang makes it much easier to bootstrap support for a new architecture, and to add the necessary architecture-specific intrinsics for your new architecture, but it doesn't really make it possible to make architecture-agnostic code work well on weirder architectures.
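To make that concrete, here's a rough sketch (my own toy example, assuming an x86 target with AVX2; not from the article): a portable scalar reduction next to a hand-written intrinsic version. The intrinsic version is the kind of architecture-specific rewrite in question - it won't even compile for ARM or a GPU, and the restructuring you'd do for those targets looks different again.

    #include <cstddef>
    #include <immintrin.h>   // x86 AVX2 intrinsics

    // Portable version: identical source on every target, but whether it
    // vectorizes well is up to the compiler.
    float sum_scalar(const float* a, std::size_t n) {
        float s = 0.0f;
        for (std::size_t i = 0; i < n; ++i) s += a[i];
        return s;
    }

    // Architecture-specific version: 8 floats per step in AVX2 registers.
    // Tied to x86; a GPU or FPGA target needs yet another rewrite.
    float sum_avx2(const float* a, std::size_t n) {
        __m256 acc = _mm256_setzero_ps();
        std::size_t i = 0;
        for (; i + 8 <= n; i += 8)
            acc = _mm256_add_ps(acc, _mm256_loadu_ps(a + i));
        float lanes[8];
        _mm256_storeu_ps(lanes, acc);
        float s = 0.0f;
        for (int k = 0; k < 8; ++k) s += lanes[k];
        for (; i < n; ++i) s += a[i];   // leftover elements
        return s;
    }

LLVM/Clang makes exposing that kind of intrinsic cheap once a backend exists - that's the bootstrap help mentioned above - but the decision to restructure the code for the accelerator is still on the programmer.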


True. If you want performance, you have to rewrite the code for the new architecture; otherwise it's pointless to develop the new core.

The problem with developing a good processor architecture is that you always have to maintain legacy compatibility without sacrificing performance - because, you know, software.

So each passing generation of the processor accumulates layers of extra hardware lying around just to support some legacy code.


So is the reference to DSLs in the article an attempt at performance portability - providing a language that can be optimized better for the underlying hardware?



