Great printf debugging experience with C, and playing compiler with source code on the fly, meanwhile on CUDA side, graphical debugging with shader code and support for standard C++17, shipping bytecode.
It took a while to figure out how the comment was related to the discussion. But I never said that CUDA would actually be difficult, just that I've never met a manager who would not turn down a programmer's suggestion to do calculation on GPU.