Hm. Doesn't the existence of Vulkan subgroup operations and CUDA's warp shuffle/ballot intrinsics poke huge holes in their 'SIMT' model? From where I sit, that looks a lot like SIMD. The only difference seems to be that SIMT claims to hide divergence (or to handle it with hardware support). Apart from that, cross-lane reductions and shuffles are basically SIMD operations.
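
To make that concrete, here is a rough sketch of what I mean, in plain C++ rather than real device code. It models a 32-lane warp as an array and mimics the semantics of `__shfl_down_sync` and `__ballot_sync` with a full mask, as I understand them. The names `Warp`, `shfl_down`, `warp_reduce_sum` and `ballot_nonzero` are my own for illustration, not any real API, and divergence and partial masks are ignored entirely.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: a "warp" is an array with one slot per lane.
constexpr std::size_t kWarpSize = 32;
using Warp = std::array<int, kWarpSize>;

// Mimics __shfl_down_sync with a full mask: lane i reads the value held
// by lane i + delta. Lanes whose source would fall past the last lane
// keep their own value.
Warp shfl_down(const Warp& v, std::size_t delta) {
    assert(delta < kWarpSize);
    Warp out{};
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        const std::size_t src = lane + delta;
        out[lane] = (src < kWarpSize) ? v[src] : v[lane];
    }
    return out;
}

// Tree reduction in log2(32) = 5 steps. Each step is a whole-vector
// permute followed by a whole-vector add, which is the same shape as a
// SIMD horizontal sum. Only lane 0 is guaranteed to hold the full sum.
int warp_reduce_sum(Warp v) {
    for (std::size_t delta = kWarpSize / 2; delta > 0; delta /= 2) {
        const Warp shifted = shfl_down(v, delta);
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            v[lane] += shifted[lane];
        }
    }
    return v[0];
}

// Mimics __ballot_sync with a full mask: bit i of the result is set when
// lane i's predicate (here: value != 0) is true. This is comparable to a
// SIMD "movemask" style instruction.
std::uint32_t ballot_nonzero(const Warp& v) {
    std::uint32_t mask = 0;
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        if (v[lane] != 0) {
            mask |= (std::uint32_t{1} << lane);
        }
    }
    return mask;
}
```

Nothing in there is per-thread in any meaningful sense: both operations only make sense if you think of the warp as one vector register with 32 lanes.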