Fixing broken .m4v movies made in Matlab

I made a few ten-minute movies in Matlab composed of 1-second clips from BBC Earth, encoded as H.264, to use as visual stimuli. They played fine on Windows but would stop at the 170-second mark in QuickTime on the Mac.

We’re using Processing on the Mac for visual display, which uses QuickTime in the background, so they would lock up in Processing as well.

The solution is to remux the file with a passthrough encoder; the original container is apparently missing some metadata, and remuxing restores it for QuickTime. ffmpeg does the trick:

ffmpeg -i bad_movie.m4v -c:v copy -c:a copy good_movie.m4v

Note that this doesn’t actually re-encode the data; it just adds some metadata – about 3 kB worth, in fact.
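If you have a whole folder of broken movies, the remux is easy to script. Here’s a minimal Python sketch – the stimuli/ folder and the _fixed suffix are made up for illustration, and it assumes ffmpeg is on your PATH:

import subprocess
from pathlib import Path

for bad in Path("stimuli").glob("*.m4v"):
    fixed = bad.with_name(bad.stem + "_fixed.m4v")
    # -c:v copy -c:a copy passes the streams through untouched;
    # only the container (and its metadata) gets rewritten.
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(bad), "-c:v", "copy", "-c:a", "copy", str(fixed)],
        check=True,
    )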


SFN 2014 poster – converging encoding strategies in dorsal and ventral visual streams

I have a poster session on Sunday afternoon at SFN 2014 in DC. It’s on a spiffy new method I’ve been working on for estimating the nonlinear transformation performed by an ensemble of sensory neurons, and its application to understanding visual representation in the dorsal and ventral visual streams.

Some background: there’s a growing consensus that the point of having hierarchical, as opposed to flat, sensory systems is to permit the creation of “good representations” of sensory stimuli. The work of Nicole Rust and Jim DiCarlo, in particular, has pointed towards the idea that hierarchical computations can and do untangle high-dimensional manifolds corresponding to image identity, pose, etc. in such a way that in high-level visual cortex, decoding of behaviourally relevant variables is trivial.

The question that I tackle here is how these representations emerge from single-neuron computations, i.e. the receptive fields of neurons. I introduce systems identification methods based on deep neural networks that can capture the complexity of transformations in high-level visual cortex.

A key idea here is that rather than fitting a separate multi-layer neural network for every neuron – which would require immense amounts of data – I fit a single multi-layer neural network common to a set of neurons, which captures the computations leading up to those neurons. This is a powerful idea: we directly infer the representation underlying the responses of multiple neurons together with an explicit computational model of that representation.

In other words, whereas traditional systems identification methods reveal the computation, and decoding methods the representation, the proposed method reveals both the representation and its computational substrate.
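To make the architecture concrete, here’s a minimal numpy sketch of the idea – toy data and plain gradient descent, not the actual fitting code from the poster: one hidden layer shared across all recorded neurons, plus one linear readout per neuron.

import numpy as np

rng = np.random.default_rng(0)

# Toy problem: T stimulus frames of dimension D, responses of N neurons,
# H shared hidden units. Real stimuli and responses would go here.
T, D, N, H = 2000, 64, 30, 16
X = rng.standard_normal((T, D))        # stimuli
R = rng.standard_normal((T, N))        # neural responses (placeholder)

W = 0.1 * rng.standard_normal((D, H))  # shared hidden layer
V = 0.1 * rng.standard_normal((H, N))  # per-neuron readout weights

lr = 1e-3
for step in range(500):
    Z = X @ W
    A = np.maximum(Z, 0.0)             # shared nonlinear representation (ReLU)
    P = A @ V                          # predicted responses, one column per neuron
    E = P - R
    # Each readout sees only its own neuron's error, but the shared layer
    # accumulates gradients from ALL neurons – that's what lets many
    # neurons jointly constrain a single representation.
    dV = A.T @ E / T
    dW = X.T @ ((E @ V.T) * (Z > 0)) / T
    V -= lr * dV
    W -= lr * dW

The hidden activations A are the inferred representation; the columns of V describe how each neuron reads it out.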

Here I show that it is possible to capture the computations performed in two different visual areas – MT and V2. Interestingly, we find that MT and V2 use similar computational strategies to re-encode stimuli – a combination of pooling and tuned suppression. The learned representation not only accounts for responses in MT and V2, but is also necessary to account for responses in V4 and MST.

Let me repeat that: we used V2 data from Jack Gallant’s lab, collected about 7 years ago, to form an in silico model of how V2 represents visual stimuli. Then we used the learned representation to account for our own V4 data from a completely unrelated task. It worked on the first try. That is pretty non-trivial. Then we did the same in MT and MST. And it works. So I think that we’ve nailed an aspect of computation and representation in visual cortex that is real, non-trivial and robust.

Interestingly, the learned representation in V2 is indeed such that image identity can be decoded in an invariant fashion. I speculate that a good representation must strike a balance between invariance at the single neuron level and high dimensionality at the population level – hence the need for tuned suppression to create novel features. Simulations show encouraging results in this direction, but the full story will have to wait for the paper (in progress).

I’m pretty enthusiastic about this work overall – come and see.

Oh, and don’t forget, Tuesday night is Neurolabware party and Josh isn’t kidding about “drinks are on him”. Bring your friends.


Sorting calcium imaging signals

Calcium imaging can record from several dozen neurons at once. Analyzing the raw data is expensive, so one typically wants to define regions of interest (ROIs) corresponding to cell bodies and work with the average calcium signal within each.

Dario has a post on defining polygonal ROIs using the mean fluorescence image. Doing this manually is fairly time-consuming, and it’s easy to miss perfectly good cells. Automated sorting methods still require some oversight, which can quickly become as time-consuming as defining the ROIs manually.

I’ve worked on an enhanced method that makes defining an ROI as simple as two clicks. The first enhancement is to use other reference images in addition to the mean fluorescence image: the correlation image, the standard deviation over the mean, and the kurtosis. The correlation image, discussed earlier on Labrigger, shows how correlated each pixel is with its neighbours. When adjacent pixels are strongly correlated, that’s a good sign that the pixel belongs to a potential ROI. Similarly, pixels with a high standard deviation/mean ratio and high kurtosis tend to correspond to potential ROIs.
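For concreteness, here’s how those reference images can be computed from a motion-corrected movie of shape (frames, height, width) – a numpy sketch of my own, not the Scanbox code:

import numpy as np
from scipy.stats import kurtosis

def reference_images(mov):
    """mov: float array of shape (T, H, W)."""
    mean_img = mov.mean(axis=0)
    std_over_mean = mov.std(axis=0) / (mean_img + 1e-9)
    kurt_img = kurtosis(mov, axis=0)
    # Correlation image: average correlation of each pixel's time course
    # with its four immediate neighbours (np.roll's wrap-around at the
    # edges is ignored in this sketch).
    z = mov - mean_img
    z = z / (np.sqrt((z ** 2).sum(axis=0)) + 1e-9)  # unit-norm time courses
    corr_img = np.zeros(mov.shape[1:])
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        corr_img += (z * np.roll(z, (dy, dx), axis=(1, 2))).sum(axis=0)
    corr_img /= 4
    return mean_img, std_over_mean, kurt_img, corr_img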

With GCaMP6f, however, you often have so many labelled cells that potential ROIs blend into each other in these alternative reference images. To solve this problem, I introduced x-ray. After I click the flood fill button, the region surrounding the cursor shows the pairwise correlations between the pixel underneath it and all the pixels surrounding it. You can easily tell two cells apart with x-ray by moving your cursor over each of them individually*. Heck, you can even follow processes. We’ve seen some cells where you can follow processes for 200 microns. Ah!
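The x-ray view boils down to a seed-pixel correlation map: correlate the time course of the pixel under the cursor with every pixel in a small window around it. A sketch of that computation (an illustration of the idea, not the Scanbox implementation; the window is assumed to lie fully inside the image):

import numpy as np

def xray_patch(mov, y, x, half=15):
    """Correlation between pixel (y, x) and every pixel in the
    (2*half+1)-square window around it. mov: (T, H, W) float array."""
    win = mov[:, y - half:y + half + 1, x - half:x + half + 1]
    z = win - win.mean(axis=0)
    z = z / (np.sqrt((z ** 2).sum(axis=0)) + 1e-9)  # unit-norm time courses
    seed = z[:, half, half]                         # the cursor pixel
    return (z * seed[:, None, None]).sum(axis=0)    # correlation map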

In fact, these x-ray images are so clean that neurons can be identified just by flood filling from a user-defined point. This is what I do above: click, then move the mouse wheel to adjust the number of pixels in the ROI, then click again to save the ROI. Since most cells are about the same size, you rarely have to adjust the number of pixels, so two clicks per cell is really all that’s necessary. That’s still 250 clicks for this particular dataset (see the end), but hey, quality problems!
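The flood fill itself can be sketched as greedy region growing on that seed-pixel correlation map: start at the seed and keep absorbing the border pixel best correlated with it until the requested pixel count is reached. Again, my own illustration of the behaviour described above, not the actual Scanbox code:

import numpy as np

def flood_fill_roi(corr_map, seed, n_pixels=60):
    """Grow an ROI from `seed` (y, x), always adding the most-correlated
    pixel on the region's border. corr_map: an xray_patch-style map."""
    H, W = corr_map.shape
    roi = {seed}
    while len(roi) < n_pixels:
        border = set()
        for y, x in roi:
            for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                p = (y + dy, x + dx)
                if 0 <= p[0] < H and 0 <= p[1] < W and p not in roi:
                    border.add(p)
        if not border:
            break
        roi.add(max(border, key=lambda p: corr_map[p]))
    return roi

Turning the mouse wheel then just amounts to re-running this with a different n_pixels.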

The latest interface should be made available soon to all Scanbox users.

*x-ray is not the same as the correlation image. The correlation image is a single image of size h × v. x-ray is a stack of images of size h × v × window_h × window_v. It’s huge, and it contains a ton of information. You can derive the correlation image from the x-ray but not vice versa.
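That one-way relationship is easy to see in code: averaging the four immediate-neighbour slices of the x-ray stack recovers the correlation image, while nothing in a single h × v image can reproduce the full stack. A sketch, with a hypothetical full stack of shape (H, W, 2*half+1, 2*half+1):

import numpy as np

def corr_image_from_xray(xray, half):
    """xray: (H, W, 2*half+1, 2*half+1) stack of seed-pixel correlation
    maps, one per pixel; `half` indexes the window centre (the seed)."""
    c = half
    neighbours = [(c - 1, c), (c + 1, c), (c, c - 1), (c, c + 1)]
    return np.mean([xray[:, :, i, j] for i, j in neighbours], axis=0)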
