Using an L1 penalty with an arbitrary error function

The L1 penalty, which corresponds to a Laplacian prior, encourages model parameters to be sparse. There are plenty of solvers for the L1-penalized least-squares problem, \arg \min_\mathbf{w} ||\mathbf{y}-\mathbf{Xw}||_2^2 + \lambda ||\mathbf{w}||_1. It's harder to find methods for error functions other than the sum of squares. L1General by Mark Schmidt solves exactly this more general problem: minimizing an arbitrary smooth error function plus an L1 penalty. More than a dozen different algorithms are implemented in Matlab, so you should be able to find something that works well for your problem. I'm having good luck with L1General2_TMP in a sparse convolutional neural net embedded in a GLM scaffold.
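
To make the calling convention concrete, here's a minimal sketch of fitting a sparse Poisson GLM with L1General. The data are made up, and I'm assuming L1General's usual minFunc-style interface, where the smooth objective returns the function value and gradient and the solver takes the objective, a starting point, and a vector of per-coefficient penalty weights; check the package documentation for the exact signature.

    % poissonNll.m -- smooth part of the objective: Poisson negative
    % log-likelihood and its gradient, in the [f, g] form that
    % minFunc-style solvers expect.
    function [f, g] = poissonNll(w, X, y)
    eta = X*w;
    mu  = exp(eta);
    f   = sum(mu - y.*eta);   % NLL, up to an additive constant
    g   = X'*(mu - y);
    end

    % Driver script: toy sparse Poisson regression.
    n = 200; p = 50;
    X = randn(n, p);
    wTrue = [3; -2; zeros(p-2, 1)];      % sparse ground truth
    y = poissrnd(exp(X*wTrue));

    funObj = @(w) poissonNll(w, X, y);   % arbitrary smooth error function
    lambda = 5*ones(p, 1);               % per-coefficient L1 weights
    w0 = zeros(p, 1);                    % starting point
    wHat = L1General2_TMP(funObj, w0, lambda);

The point is that funObj can be anything smooth you can compute a gradient for, not just a least-squares loss.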

If you're interested in the more specialized problem of fitting a vanilla GLM with a sparseness penalty, you're in luck: there are plenty of specialized packages for that. If you have a recent version of Matlab, you can use the Statistics Toolbox function lassoglm. Then there's glmnet, which is (I believe) based on coordinate descent. Finally, there's the package I published as part of Mineault et al. (2009), which you might also like.
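
For reference, a lassoglm fit looks something like this; the data are invented, and you should check doc lassoglm in your release for the available distributions and options:

    % Sparse Poisson GLM with lassoglm (recent Statistics Toolbox releases).
    n = 200; p = 50;
    X = randn(n, p);
    wTrue = [3; -2; zeros(p-2, 1)];      % sparse ground truth
    y = poissrnd(exp(X*wTrue));

    % Fit a regularization path with 5-fold cross-validation.
    [B, FitInfo] = lassoglm(X, y, 'poisson', 'CV', 5);
    wHat = B(:, FitInfo.IndexMinDeviance);   % coefficients at the best lambda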
