One of my favorite mini-games at my job is rewriting classic algorithms to run in batched mode on GPU/TPU. The speedups often cut model training time by days, and it's always a lovely intellectual challenge. (The basic challenge is to rewrite each algorithm in terms of matrix operations that work on many instances of the problem at once, while eliminating all branching.)
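
To make that concrete, here is a rough sketch of the idea applied to binary search. It is not code from my job, just an illustration I'm assuming is representative: the function name `batched_lower_bound` and the choice of NumPy are mine. Every row in the batch runs the same fixed number of steps, and each `if` becomes a mask fed to `np.where`.

```python
import numpy as np

def batched_lower_bound(sorted_rows, queries):
    """For each row b, return the first index i with sorted_rows[b, i] >= queries[b].

    Hypothetical helper for illustration: sorted_rows has shape (batch, n) with
    each row sorted ascending, and queries has shape (batch,).
    """
    batch, n = sorted_rows.shape
    lo = np.zeros(batch, dtype=np.int64)
    hi = np.full(batch, n, dtype=np.int64)
    rows = np.arange(batch)
    # Fixed trip count: n.bit_length() halvings shrink [0, n] to one index,
    # so no row needs a data-dependent early exit.
    for _ in range(int(n).bit_length()):
        mid = (lo + hi) // 2
        # Clamp so the gather stays in bounds for rows that already converged.
        vals = sorted_rows[rows, np.minimum(mid, n - 1)]
        active = lo < hi
        go_right = (vals < queries) & active
        # Masks replace the usual if/else on each element.
        new_lo = np.where(go_right, mid + 1, lo)
        new_hi = np.where(go_right | ~active, hi, mid)
        lo, hi = new_lo, new_hi
    return lo

rows = np.array([[1, 3, 5, 7],
                 [2, 4, 6, 8],
                 [0, 0, 1, 1]])
print(batched_lower_bound(rows, np.array([5, 9, 0])))  # [2 4 0]
```

Rows that converge early just keep recomputing the same answer, which is wasted work on a CPU but is usually the right trade on an accelerator, where uniform control flow matters more than skipping a few steps. The same pattern should carry over to JAX by swapping `np` for `jnp`, though I haven't benchmarked this particular sketch.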