I beg to differ. Some ideas from training methods for deep neural networks, e.g. binarized neural networks and asynchronous SGD, are used in datacenter infrastructure; see https://research.fb.com/wp-content/uploads/2017/12/hpca-2018.... Some of the companies that I built in the past in Montreal use SOTA research for sound event detection, and a dropout method that friends and I co-authored was used in production at an NLP company (which I cannot disclose).