PR is quite similar to a neural network with a single extremely wide layer. Unle...

zozbot123 · on Feb 14, 2019

I think what you literally need is a network that is "narrow" but also has many layers. This is because what PR features over simple linear regression is interaction terms, and multiple layers are an alternate way to get those. This also hints at one problem with true PR - in a multi-variate context, the amount of coefficients you'll need to fit is going to scale very badly given the number of variables and the degree of your polynomial. NN layers have natural dimenionality-reduction properties that make them more flexible.