Why second-order methods can be futile in non-convex problems

I’ve been working on fitting a convolutional model of neurons in primary and intermediate visual cortex. A non-convex optimization problem must be solved to estimate the parameters of the model. It has a form similar to: There are some shared weights to further complicate things, but the most salient feature is that it’s a 3-layer …

The secret ingredient in stochastic gradient descent

I had dinner with Geoffrey Hinton and Yoshua Bengio a few weeks back, and I left full of ideas – and wine, also. Now I’m fitting a massive model for early and intermediate visual areas which involves major spiffiness and about 100 hours of data (!). Stochastic gradient descent is one optimization algorithm which can …