Off-the-shelf optimization functions in Matlab

Estimating a statistical model via maximum likelihood or MAP involves minimizing an error function: the negative log-likelihood or negative log-posterior. Generic functions built into Matlab, like fminunc and fmincon, will often do the trick. There are also many other free solvers available, which are often faster or more powerful. Solvers by Mark Schmidt: there's a huge collection of functions from Mark …
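As a minimal sketch of the idea (my own toy example, not code from the post): fit the mean and log-variance of a Gaussian by handing the negative log-likelihood to fminunc.

```matlab
% Toy maximum-likelihood fit with fminunc (hypothetical example).
% Parameterize as theta = [mu; log(sigma^2)] so the problem is unconstrained.
x = randn(100, 1) * 2 + 5;                       % synthetic data
nll = @(theta) 0.5 * numel(x) * (log(2*pi) + theta(2)) + ...
      0.5 * sum((x - theta(1)).^2) / exp(theta(2));
theta0 = [0; 0];                                 % initial guess
thetaHat = fminunc(nll, theta0);                 % thetaHat(1) near the sample mean,
                                                 % exp(thetaHat(2)) near the sample variance
```

Working with the log-variance rather than the variance keeps the problem unconstrained, so fminunc applies directly; with a positivity constraint on sigma^2 you would reach for fmincon instead.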

Adagrad – eliminating learning rates in stochastic gradient descent

Earlier, I discussed how I had no luck using second-order optimization methods on a convolutional neural net fitting problem, and some of the reasons why stochastic gradient descent works well on this class of problems. Stochastic gradient descent is not a plug-and-play optimization algorithm; it requires messing around with the step-size hyperparameter, forcing you …
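A minimal sketch of Adagrad on a toy least-squares problem (my own illustration, not the post's code): each coordinate's step is scaled by the inverse square root of its accumulated squared gradients, so a single global eta replaces a hand-tuned learning-rate schedule.

```matlab
% Adagrad sketch on stochastic least squares (hypothetical example).
A = randn(50, 5);  b = A * ones(5, 1);     % toy problem: recover all-ones weights
theta = zeros(5, 1);
G = zeros(5, 1);                           % per-coordinate sum of squared gradients
eta = 1.0;  eps0 = 1e-8;                   % one global scale plus a small stabilizer
for t = 1:500
    i = randi(50);                         % one random row = a stochastic gradient
    g = A(i, :)' * (A(i, :) * theta - b(i));
    G = G + g.^2;                          % accumulate squared gradients
    theta = theta - eta * g ./ (sqrt(G) + eps0);
end
```

Note how coordinates that consistently see large gradients get their effective step size shrunk automatically, while rarely updated coordinates keep taking larger steps.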