(Hi, Tom!) Reread the article and look for “CPU”. The whole article is about doing deep learning on CPUs, not GPUs. Moonshine, the open-source project and startup he talks about, shows speech recognition and real-time translation running on the device rather than on a server. My understanding is that doing The Math in parallel is itself a performance hack, but Doing Less Math is also a performance hack.