Spike identification through Gibbs sampling #1

Multi-unit activity (MUA) is usually derived by high-pass filtering a raw wideband signal, thresholding with a low threshold or rectifying, and subsequently using a low-pass filter. The intuition I think is correct, in that spikes from far away neurons will cause transient blips in the wideband signal which can be amplified by thresholding.

The usual methods, however, seem very hacky; to the best of my knowledge, there are no methods for estimating the MUA grounded in sound statistical theory. This seems like a waste, since the MUA might be able to capture properties of large scale populations of neurons; unlike the LFP, the source of the MUA is well understood.

I’ve been experimenting with different methods of estimating the MUA through Bayesian methods. MUA differs from single unit activity in that instead of focusing on clustering spikes, the problem is in detecting the spikes. In fact detection is so difficult that giving binary spike/no-spike judgements is probably a lost cause; it appears more appropriate to use continuous judgements.

In other words, if s corresponds to a spike train, spike sorting single unit activity focuses on solving the problem:

$\mbox{arg} \max_s p(s|D)$

Where D is the data. For MUA processing, however, it seems a more appropriate objective is to determine the expected probability of spikes:

$\mathbb{E}(p(s|D))$

A simple model for spike identification

Let’s start by writing out a simple model for a whitened wideband signal y containing stereotyped spikes w:

$y \sim N(\sum_i s_i*w_i,\sigma^2)$

This simply states that the signal is the sum of stereotyped spikes plus iid Gaussian noise. The spike trains themselves can be given iid Binomial priors:

$p(s_i) \sim \prod Binomial(\pi_i)$

It’s possible to give a beta prior to $\pi$ and a gamma prior to $1/\sigma^2$ etc. In later articles I’ll add ever more sophisticated assumptions to increase the accuracy of the identification; the two assumptions above, however, are sufficient to have a working model that can identify spikes.

Given these assumptions, how can we estimate $\mathbb{E}(p(s|D))$? As far as I know, estimating this quantity is analytically intractable. The model is related, however, to factorial hidden Markov models (fHMMs). One way to perform inference in fHMMs is Gibbs sampling. Gibbs sampling involves computing, for each model parameter, the probability of this parameter given observations of every other parameter.

For the spike trains, Gibbs sampling thus reduces to comparing the probability of the data with the spike added to the currently estimated spike train versus no spike added. Let’s say that we are looking at the current time point t, and the current spike train is s_i. Then we have to do the following:

1. Set $s_i(t) = 0$
2. Compute $r = y - \sum_i w_i s_i$
3. Compute $d = p(r|s_i(t) = 1) / p(r|s_i(t) = 0)$. This involves taking the difference between the residual sum-of-squares with the spike added versus without the spike added.
4. Set $s_i(t) = 1$ with probability $\frac{\pi_i d}{\pi_i d + (1-\pi_i)}$.

The algorithm proceeds by following these steps for every time point for a given number of iterations. I simulated some data containing a few spikes and a lot of noise and it looks like this:

I then ran the Gibbs sampling algo. The following shows the simulated spike train as a function of Gibbs sampling iteration:

Notice how each vertical line relates to what appears like a positive blip in the wideband signal a few samples after.

Unfortunately the Gibbs sampler does not mix well. For example at the top you can see that $s(t=7) = 1$ for the first 150 or so time samples, but after that $s(t=8) = 1$. What happens is that once a spike is assigned to a time step this “explains away” the subsequent blip in the wideband data. So it stays stuck in the same position for a long time. There’s a small probability that it will be set to 0. In that case the blip is no longer explained, and thus other nearby time steps will “pick up” the blip and explain it. So the Gibbs sampler will show unacceptably long correlation times.

Yet despite this, the algorithm does seem to identify spikes reasonably efficiently. The correlation between the spike train and the estimated probability of a spike is > .9. In the next article, I’ll modify the Gibbs sampler so that it samples many time points in a single shot, and most of the difficulty will go away.