> Secondly, GPUs don't really have very much hardware specialized for graphics anymore. If we called it a TPU I'm not sure you'd be making the same point :P
I think this is semantics. Modern GPUs do have a lot of hardware that isn't specialized for machine learning. My limited knowledge says that very recent NVIDIA GPUs have some units that vaguely resemble TPU cores ("Tensor cores"), but they also spend a lot of silicon on classic CUDA cores, which I was calling "graphics" hardware but might better be described as "silicon that isn't optimized for the memory bandwidth DL needs". So it's still used non-optimally.
To be clear, you can still use the CUDA cores for DL. We did that just fine for a long time; they're just decidedly less efficient than Tensor cores.
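To make the distinction concrete, here's a minimal sketch of the two paths for the same matrix multiply: a plain per-thread FMA kernel that runs on the CUDA cores, and a WMMA kernel that dispatches to the Tensor cores. This is an illustrative sketch, not a tuned implementation; it assumes an sm_70+ GPU, half-precision inputs for the Tensor-core path, and matrix dimensions divisible by 16.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// "CUDA core" path: each thread accumulates one element of C with scalar FP32 FMAs.
__global__ void matmul_cuda_cores(const float* A, const float* B, float* C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < N; ++k)
            acc += A[row * N + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

// "Tensor core" path: one warp computes a 16x16 tile of C, 16x16x16 matmul per mma_sync.
// Assumes N is a multiple of 16; A and B are half precision, accumulation in FP32.
__global__ void matmul_tensor_cores(const half* A, const half* B, float* C, int N) {
    int warpM = (blockIdx.x * blockDim.x + threadIdx.x) / warpSize;  // tile row index
    int warpN = blockIdx.y * blockDim.y + threadIdx.y;               // tile col index

    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;
    wmma::fill_fragment(c_frag, 0.0f);

    for (int k = 0; k < N; k += 16) {
        wmma::load_matrix_sync(a_frag, A + warpM * 16 * N + k, N);
        wmma::load_matrix_sync(b_frag, B + k * N + warpN * 16, N);
        wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // whole-tile matmul on Tensor cores
    }
    wmma::store_matrix_sync(C + warpM * 16 * N + warpN * 16, c_frag, N, wmma::mem_row_major);
}
```

Both kernels compute the same thing; the first is what "using the CUDA cores for DL" amounts to, the second is the same work routed through the Tensor-core units.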