I used to stare at these two PDFs all the time while deriving machine learning algorithms back in the day. (At least my hand-rolled C++ code still beats PyTorch's reverse-mode autodiff performance.)