If you decide to try to make this faster, check out ceres[0] a non-linear least squares optimisation framework that does automatic differentiation using a clever C++ template hack.
I've used it a few times to solve these kind of problems and found it to be very good!
Yep, I'm still waiting to use ceres for something - I didn't end up using it on my image approximation project https://mzucker.github.io/2016/08/01/gabor-2.html because it doesn't work well with inequality constraints.
If you decide to try to make this faster, check out ceres[0] a non-linear least squares optimisation framework that does automatic differentiation using a clever C++ template hack.
I've used it a few times to solve these kind of problems and found it to be very good!
[0] http://ceres-solver.org/