CSHL computational vision: day 2

Eero Simoncelli

Eero Simoncelli delivered a talk focusing on linear systems, convolution and Fourier analysis. From an information-theoretic perspective, a linear or nonlinear transformation of the type performed in cortex can only conserve information or lose it; it can never add any. Thus, there is something interesting about the visual information which is discarded. I could not help thinking of Tom Waits’ song: “Son, there’s a lot of things in this world that you gotta have no use for. And when you get blue, and you’ve lost all your dreams, there’s nothing like a campfire and a can of beans”.

In noiseless discrete linear systems characterized by a transformation matrix M, the discarded information can be read off from the SVD of M. That is, if M = USV’, then the columns of V corresponding to zero values on the diagonal of the scaling matrix S (or, when M has more columns than rows, to no diagonal entry at all) correspond to directions in the input space that are discarded. Together, these columns span the null space of the transformation.

From a more theoretical perspective, you might remember from your linear algebra class that the null space of a matrix M is defined as the set of all vectors x such that Mx = 0. Write M = USV’. If x is a sum of columns of V associated with zero singular values, then V’x is a vector whose only non-zero elements are those associated with zero singular values; this is because V’V = I. S then scales each of those elements by zero, so SV’x is the zero vector and Mx = USV’x = 0. Conversely, any component of x along a column of V with a non-zero singular value survives the scaling, so Mx is non-zero. So the null space of a linear transformation M is the subspace spanned by the columns of V associated with zero singular values.
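The argument above can be checked numerically. Here is a minimal NumPy sketch, assuming a random 3×30 matrix as a stand-in for M; the helper name `null_space_basis` and the tolerance are my own choices, not anything from the talk.

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis (as columns) for the null space of M, via the SVD."""
    # full_matrices=True returns every column of V, including those that
    # have no matching singular value when M is wider than it is tall.
    U, s, Vt = np.linalg.svd(M, full_matrices=True)
    rank = int(np.sum(s > tol))   # number of non-zero singular values
    return Vt[rank:].T            # remaining rows of V' span the null space

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 30))  # ASSUMPTION: arbitrary 3 x 30 example matrix

N = null_space_basis(M)           # 30 x 27 for a rank-3 matrix

# Any combination of null-space columns is invisible to M.
x = N @ rng.standard_normal(N.shape[1])
print(np.allclose(M @ x, 0.0))
```

Adding `x` to any input leaves the output of M unchanged, which is exactly the sense in which these directions are discarded.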

Eero illustrated these ideas with the fascinating history of trichromatic color vision. Discretizing the spectrum of a light over the visible range, 400 to 700 nm, yields a vector whose length is equal to the number of bins used (say, 30). Since there are only three types of cones, only three measurements are taken at a given spot in the retina. Call the spectrum of some sample of light x, and encode the sensitivity of the three cone types as a 3×30 matrix M. Then the only thing that determines whether two lights x and y are perceived as the same color is the set of inner products of x and y with the 3 sensitivity profiles (the 3 rows of M). Thus, x and y could have very different spectra, but as long as Mx = My, that is, as long as they differ only by a vector in the null space of M, they will appear the same to the observer.

Thus, any light source can be matched by a weighted sum of 3 primary light colors (not necessarily red, green and blue; any three independent colors would suffice). Note that to simulate negative weights (important!), one simply needs to use a sufficiently large background light and define negative light values as decreases from this background illumination.
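To make the matching argument concrete, here is a rough NumPy sketch. The cone sensitivities are crude Gaussian stand-ins and the primaries are single-bin narrowband lights; both are assumptions made up for illustration, not measured data. The weights w solve the 3×3 system (MP)w = Mx, where the columns of P are the primaries.

```python
import numpy as np

# Wavelength grid: 30 bins across the visible range, as in the text.
wavelengths = np.linspace(400, 700, 30)

def gaussian(peak_nm, width_nm):
    return np.exp(-0.5 * ((wavelengths - peak_nm) / width_nm) ** 2)

# ASSUMPTION: Gaussian bumps standing in for the S, M and L cone sensitivities.
M = np.stack([gaussian(440, 30), gaussian(535, 30), gaussian(565, 30)])  # 3 x 30

def primary(nm):
    """Narrowband light concentrated in the bin closest to nm."""
    p = np.zeros_like(wavelengths)
    p[np.argmin(np.abs(wavelengths - nm))] = 1.0
    return p

# ASSUMPTION: three illustrative primaries, one per column.
P = np.stack([primary(450), primary(540), primary(610)], axis=1)  # 30 x 3

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=30)  # arbitrary broadband test spectrum

# Weights on the primaries that reproduce the cone responses to x.
w = np.linalg.solve(M @ P, M @ x)
match = P @ w                       # a very different spectrum from x

print(np.allclose(M @ match, M @ x))  # identical cone responses
print(w)                              # weights; nothing forces them to be positive
```

The only requirement on the primaries is that MP be invertible, which is the "three independent colors" condition in matrix form.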

Eero then went on to illustrate the idea of a nonlinear null space with Jeremy Freeman’s work on metamers in V2 (Freeman and Simoncelli, 2011). These are distinct from the texture metamers discussed by Tony yesterday. The idea here is that you can take an image and find an alternate image through randomization followed by gradient descent which is similar from the perspective of putative V2 neurons. These V2 neurons are assumed to be second order complex cells acting on V1 complex cells.

These metamers appear psychophysically indistinguishable, despite being quite clearly different (example below; you need to fixate for this to work). You can do this metamer business over different assumed scales; as it turns out, the matching is most convincing when the metamers are compared at scales similar to those of V2 neurons.

All of this reminds me of DiCarlo and Cox (2007). I’m wondering how the nonlinear null space relates to the isosurfaces (manifolds) of DiCarlo and Cox. These are slightly different concepts; the DiCarlo isosurface is a manifold in IT firing rate space corresponding to the different views of a single object. Meanwhile, the nonlinear null space of a population of IT neurons is a manifold in STIMULUS space corresponding to similar responses in the population of IT neurons. Thus, as you traverse the D-manifold, identity is conserved but not necessarily IT responses, while as you traverse the S-manifold, IT responses are conserved but not necessarily identity. It does seem that the two concepts are different ways of capturing invariances at the population level, and they do so in terms of the geometry of different high-dimensional spaces. Very interesting.

Greg Horwitz

Greg Horwitz talked about spike-triggered averaging (STA), both in terms of regression and geometrically. He also talked about spike-triggered covariance (STC). An interesting application of STC is the study of color-selective neurons in V1. He showed a yellow-selective cell with an STA that would indicate selectivity for yellow blobs. STC analysis revealed a first eigenvector corresponding to a blue/yellow edge. Various combinations of these basis functions revealed that the cell is in fact selective for an edge defined by dark yellow on one side and light yellow on the other, in a phase-invariant way.
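For reference, here is a toy sketch of how STA and STC pick up different parts of a cell's selectivity. The model cell is entirely hypothetical (one rectified linear filter plus one squared filter, driven by Gaussian white noise); it is not the V1 color cell from the talk, and the filters are single stimulus dimensions to keep things readable.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, dim = 100_000, 20

# Gaussian white-noise stimulus, one row per frame.
stim = rng.standard_normal((n_frames, dim))

# HYPOTHETICAL model cell: a rectified linear filter plus a squared filter.
k_lin = np.zeros(dim)
k_lin[3] = 1.0
k_quad = np.zeros(dim)
k_quad[10] = 1.0
rate = 0.5 * np.maximum(stim @ k_lin, 0.0) + 0.5 * (stim @ k_quad) ** 2
spikes = rng.poisson(rate)  # spike count per frame

# STA: spike-weighted mean stimulus.
sta = spikes @ stim / spikes.sum()

# STC: spike-weighted covariance of the stimulus around the STA.
centered = stim - sta
stc = (centered * spikes[:, None]).T @ centered / (spikes.sum() - 1)

# eigh returns eigenvalues in ascending order; the last one is the largest.
evals, evecs = np.linalg.eigh(stc)
top = evecs[:, -1]

print(np.argmax(np.abs(sta)))  # dimension picked up by the STA
print(np.argmax(np.abs(top)))  # dimension picked up by the top STC eigenvector
```

The symmetric (squared) filter contributes nothing to the mean, so it is invisible to the STA and only shows up as excess variance in the STC, which is the same flavor of result as the edge hiding behind the blob-like STA.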

It was briefly mentioned that it’s possible to create an isoluminant stimulus that varies only along the blue/yellow axis and is very difficult to see, because S cones are so much less numerous than the other cone types in the retina. A demonstration requires a calibrated monitor; I might try something along those lines tonight.

Miscellaneous: stem is a cool MATLAB function.

