# Using the SVD to estimate receptive fields

Spatio-temporal receptive fields can be hard to visualize. They can also be quite noisy. Thus, it’s desirable to find a low-dimensional approximation to the RF that is both easier to visualize and less noisy. The SVD is frequently used in neurophysiology for this purpose. Reading the Wikipedia page on the SVD, you might have trouble understanding how the SVD is relevant to RF estimation. Here’s a quick explanation of why you would want to use the SVD to estimate RFs.

The singular value decomposition (SVD) of a real rectangular matrix M factorizes the matrix into components $\mathbf{USV'}$, where the S matrix is diagonal, and composed of real, non-negative numbers. The numbers on the diagonal of S are called the singular values of M. Here I will use the convention that the singular values are sorted in descending order.

Both U and V are orthonormal matrices, meaning that $\mathbf{U'U = I}$ and $\mathbf{V'V = I}$. Each column of U and V is a (left or right, respectively) singular vector of M. Writing out the result of the factorization explicitly, you get:

$M_{i,j} = (\mathbf{USV'})_{i,j} = S_{1,1}U_{i,1}V_{j,1} + S_{2,2}U_{i,2}V_{j,2} + \ldots$

Thus, the SVD factorizes an arbitrary matrix into a sum of outer products of its left and right singular vectors, each weighted by the corresponding singular value. Because the singular values are sorted in descending order, each successive term in this series has decreasing influence. Thus, by truncating the SVD after a given number of singular values, you obtain a low-rank approximation of the original matrix.

So how is this relevant to RF estimation? Well, suppose that your receptive field is arranged as a matrix R, where the first dimension represents time, and the second dimension represents a single dimension of space. Now take the SVD of R. Then $S_{1,1}U_{i,1}V_{j,1}$ is an approximation of $R_{i,j}$ as an outer product of two vectors. So you’re saying that the receptive field can be approximated as the product of a time filter $U_{i,1}$ and a space filter $V_{j,1}$; in other words, you’re saying the RF is separable in space-time. Furthermore, this is the best such approximation possible in the least-squares sense: among all rank-1 matrices, it minimizes the sum of squared errors, i.e. the Frobenius norm of the residual.
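Here is a minimal NumPy sketch of the idea, not a definitive implementation. The RF is synthetic: the temporal and spatial profiles, the noise level and all variable names are made up for illustration. It checks that the sum of outer products reproduces R, then keeps only the first term as the separable approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical RF: 20 time lags x 15 spatial positions (made-up profiles).
t = np.arange(20)
x = np.linspace(-3, 3, 15)
true_time = np.exp(-t / 4.0) * np.sin(t / 2.0)
true_space = np.exp(-x**2)
R = np.outer(true_time, true_space) + 0.05 * rng.standard_normal((20, 15))

# NumPy returns V' (called Vt here) and the singular values as a
# 1-D array already sorted in descending order.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Summing every weighted outer product reproduces R.
R_full = sum(s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(len(s)))

# Keeping only the first term gives the separable approximation.
time_filter = U[:, 0]    # first left singular vector
space_filter = Vt[0, :]  # first right singular vector
R_sep = s[0] * np.outer(time_filter, space_filter)

# Frobenius norm of what the rank-1 approximation leaves out.
residual = np.linalg.norm(R - R_sep)
```

Note that singular vectors are only defined up to a joint sign flip, so `time_filter` and `space_filter` may both come out inverted relative to the profiles you expect; their product is unaffected.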

Separable receptive fields are easier to visualize than non-separable RFs, they’re lower-dimensional, and they typically have higher SNRs. Is it reasonable to assume that an RF is separable, though? You can gauge this by looking at the sequence of singular values. If a receptive field is truly separable, then only its first singular value should be non-zero.

In reality, noise will mean that a measured RF will never be truly separable. The ratio of the square of the first singular value to the sum of the squares of all singular values can be used as an index of separability (here for example). Furthermore, if you plot the singular values, you should see that they drop off rapidly after the first few (as with an eigenvalue decomposition), and this can be used as a means of finding a good point to truncate the decomposition. More formal criteria can be derived through cross-validation or bootstrapping.
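The separability index described above can be sketched in a few lines. The function name `separability_index` and the test matrices are assumptions for illustration, not taken from any particular paper or toolbox.

```python
import numpy as np

def separability_index(rf):
    """Squared first singular value over the sum of all squared singular values."""
    s = np.linalg.svd(rf, compute_uv=False)  # singular values only
    return s[0] ** 2 / np.sum(s ** 2)

rng = np.random.default_rng(1)

# A perfectly separable (rank-1) matrix, and the same matrix with noise added.
separable = np.outer(np.hanning(20), np.hanning(15))
noisy = separable + 0.05 * rng.standard_normal(separable.shape)

idx_clean = separability_index(separable)  # 1 up to rounding error
idx_noisy = separability_index(noisy)      # pulled below 1 by the noise
```

The index lies between 0 and 1, and the sum of squared singular values equals the total sum of squares of the matrix, so it can be read as the fraction of the RF's energy captured by the separable approximation.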

Now, it’s possible to approximate a non-separable matrix using the SVD as well; you simply truncate after n singular values rather than after the first. If you did this on the spatiotemporal RF of an LGN cell, you might find the first pair of singular vectors corresponding to the early center response, with the second pair corresponding to the late surround response (see above, from Wolfe and Palmer 1998).
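Truncating after n singular values might look like the sketch below. The center-surround RF is a toy construction with made-up time constants and widths, not a model of real LGN data, and `truncated_svd` is a hypothetical helper name.

```python
import numpy as np

def truncated_svd(rf, n):
    """Rank-n approximation built from the first n singular triplets."""
    U, s, Vt = np.linalg.svd(rf, full_matrices=False)
    # Scale the first n columns of U by their singular values, then project back.
    return (U[:, :n] * s[:n]) @ Vt[:n, :]

rng = np.random.default_rng(2)
t = np.arange(30)
x = np.linspace(-4, 4, 25)

# Toy RF: a fast, narrow center minus a delayed, wider surround, plus noise.
center = np.outer(np.exp(-t / 3.0), np.exp(-x**2))
surround = np.outer(np.exp(-t / 8.0) - np.exp(-t / 3.0), 0.5 * np.exp(-x**2 / 8.0))
rf = center - surround + 0.01 * rng.standard_normal((30, 25))

rank1 = truncated_svd(rf, 1)
rank2 = truncated_svd(rf, 2)

# Frobenius-norm error of each approximation; adding a term can only reduce it.
err1 = np.linalg.norm(rf - rank1)
err2 = np.linalg.norm(rf - rank2)
```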

It’s important to keep in mind, however, that the first and second left singular vectors will be orthogonal, while the traditional decomposition of an LGN RF might involve non-orthogonal vectors. It’s possible to recover a more “natural” non-orthogonal decomposition lying in the subspace of the truncated SVD based on other criteria like sparseness (see Construction of direction selectivity poster here).

If you have 3 dimensions instead of 2, you can simply bunch two dimensions together (in Matlab, using reshape) before performing the SVD.
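The same bunching step can be sketched in NumPy rather than Matlab. The array sizes, profiles and names below are assumptions for illustration. The two spatial dimensions are flattened into one before the SVD, and the first right singular vector is reshaped back into a 2-D spatial map afterwards.

```python
import numpy as np

rng = np.random.default_rng(3)
n_t, n_y, n_x = 12, 8, 8

# Hypothetical 3-D RF (time x y x x): decaying time course times a 2-D Gaussian.
t = np.arange(n_t)
g = np.exp(-np.linspace(-2, 2, n_y) ** 2)
true_space = np.outer(g, g)
rf3d = np.exp(-t / 3.0)[:, None, None] * true_space[None, :, :]
rf3d = rf3d + 0.05 * rng.standard_normal(rf3d.shape)

# Bunch the two spatial dimensions together: (time, y, x) -> (time, y * x).
R = rf3d.reshape(n_t, n_y * n_x)
U, s, Vt = np.linalg.svd(R, full_matrices=False)

time_filter = U[:, 0]
space_filter = Vt[0, :].reshape(n_y, n_x)  # undo the flattening
```

NumPy's `reshape` is row-major by default while Matlab's is column-major, so the flattened ordering differs between the two; that does not matter as long as you reshape back with the same convention you used to flatten.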

The SVD is useful in other contexts in neurophysiology as well. It can usually be applied directly to a design matrix X where you would otherwise use the eigenvalue decomposition of X’X; whitening images is one application of this. It’s an important tool to add to your belt.
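To make the link between the SVD of X and the eigenvalue decomposition of X’X concrete, here is a small sketch with a made-up design matrix. The mixing matrix is an arbitrary choice to give correlated columns. The whitening step shown is the bare version that makes X’X the identity; it assumes nothing about centering or scaling by the number of samples.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical design matrix: 200 samples of 3 correlated regressors.
mixing = np.array([[2.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [0.0, 0.0, 0.5]])
X = rng.standard_normal((200, 3)) @ mixing

# Route 1: SVD applied directly to X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Route 2: eigenvalue decomposition of X'X.
# eigh returns eigenvalues in ascending order, so flip to descending.
evals, evecs = np.linalg.eigh(X.T @ X)
evals, evecs = evals[::-1], evecs[:, ::-1]

# The squared singular values of X are the eigenvalues of X'X, and the
# right singular vectors are its eigenvectors (up to sign).

# Whitening: rotate onto the right singular vectors and rescale each
# direction by its singular value, so the whitened columns are orthonormal.
X_white = X @ Vt.T / s
```

Working on X directly avoids forming X’X, which squares the condition number of the problem.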

1. […] first sight, the natural tool to estimate such a low-rank decomposition is the SVD; however, SVD doesn’t find the right kind of spatio-temporal decomposition. As shown below, […]

2. Kostas says:

Hi,

From an ignorant’s point of view I was wondering whether SVD can be used for the extreme condition of an RF that is constant in time.
SVD would deem the RF separable in that case, wouldn’t it?

Thanks,

Kostas

3. Kevin says:

I would emphasize that it’s the best _linear_ approximation according to the Frobenius norm.

@Dimitri: They can be formulated in terms of each other, so sure, they’re mathematically equivalent. OTOH, Zorn’s Lemma, the Well Ordering Theorem and the Axiom of Choice are also equivalent (in ZF) but one is often easier to use than the other depending on the theorem you need to prove.

4. xcorr says:

Sure, the two are closely related. This is what I meant when I said you can use the SVD on X when you would use an eigenvalue decomposition on X’X. The eigenvalue decomposition on the covariance matrix X’X is the core operation in principal component analysis (PCA).

I think it’s clearer for this particular application, however, to think of the dimensionality reduction in terms of the SVD and outer products. If you think about it in terms of PCA, then what you’re doing is finding the directions of highest variance in the distribution of (temporal or spatial) slices of the RF. If you keep only one direction then necessarily all your slices are scalar multiples of each other, so you get back a separable RF. It’s not very intuitive, I find, though.

5. Dimitri says:

Many researchers know the SVD as PCA. The two are mathematically equivalent.