Interesting that the hardware is NVIDIA Blackwell, not Google TPUs. That means Google will likely keep an energy-efficiency and cost advantage, and keep their proprietary hardware out of other people's reach.
Getting a whole business set up to sell TPU hardware to third parties (design, manufacturing, sales, support, etc.) is probably not worth it when demand for TPUs in their own cloud already exceeds supply.
Businesses running their own hardware probably prefer CUDA as well, since it is more generally useful.
Part of the reason is likely customers' preference to have CUDA available, which TPUs do not support. TPUs are superior for many use cases, but customers like the portability of targeting CUDA.
My limited understanding is that CUDA wins on smaller batches and jobs while TPUs win on larger ones: CUDA is easier to use and better at typical small workloads, but at some point, for bigger training and inference loads, TPUs start making sense.
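One concrete reason the small-vs-large distinction matters less than it used to on the framework level: JAX-style code is backend-agnostic, so the same program compiles via XLA for CPU, GPU, or TPU, and larger batches simply amortize the compile/dispatch overhead better on big accelerators. A minimal sketch (a toy layer, not any real workload; the function name and shapes are made up for illustration):

```python
# Sketch: identical JAX code runs on whatever backend is present (CPU/GPU/TPU);
# XLA handles the target. Larger batches amortize fixed dispatch overhead,
# which is part of why big accelerators favor big jobs.
import jax
import jax.numpy as jnp

@jax.jit
def forward(w, x):
    # Toy "layer": batched matmul followed by a ReLU nonlinearity.
    return jax.nn.relu(x @ w)

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (128, 64))   # hypothetical weight shape

for batch in (1, 1024):                 # small vs large batch, same code
    x = jnp.ones((batch, 128))
    y = forward(w, x)
    print(batch, y.shape)
```

Nothing here is TPU-specific; that portability (and its absence when code is written directly against CUDA) is the lock-in being described above.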
Not really. Reverse engineering a modern chip is no small feat, and any company capable of it is also capable of designing its own from scratch. Either way, getting something taped out (and debugged) on a modern process is massively expensive.