# Spike identification through Gibbs sampling #2

Last time, I demonstrated how to use Gibbs sampling to obtain an estimate of the probability of spikes at different time points given a wideband signal. Unfortunately the method I proposed suffered from long correlation times. In this post I expand upon the previous method to obtain a practical method for identifying spikes.

### Making it better by making it worse

Previously I assumed that the probability of finding a spike in a bin was independent of other time samples. In reality, we know this isn’t correct, as at very short time scales the absolute refractory period of a neuron is in effect. In addition, while it’s desirable to resolve spike overlaps when the peaks are only partially overlapping, strong overlaps are both rare and probably better left alone. For example, Herbst et a. 2008 found that with three spike sources on the same electrode, their algorithm which fully solves the overlap problem often introduced fake spikes because of an unidentifiability in the sorting model (one spike looked like the time derivative of another).

For these theoretical and practical reasons, we can add a constraint that there can’t be more than 1 spike per X number of adjacent time bins, where X > 1 and smaller than the number of time samples in a waveform. So, for example, X might correspond to .5 ms.

It would appear as though this would make our Gibbs mixing time problems worse. Before, we had the problem that one spike would “capture” and explain away a blip in the wideband signal, refusing to yield to (about equally probable) spikes in nearby bins. The underlying issue is that we only sample one bin at a time, yet the probability of spikes in adjacent time bins are heavily anti-correlated.

In the original model, it was difficult to sample from multiple adjacent time bins at a time, because we had to consider every combination of spike/no-spike within the relevant time frame. With the additional constraint, however, we know that there can’t be more than one spike per X number of bins. So we can sample from multiple bins at once; the relevant conditional distribution is now a multinomial.

Here’s an example of samples from around the time of a spike in a simulated dataset: As you can see, mixing is much faster than in the previous version. Here’s a snippet of the simulated data (in red), the hidden spike times (in green), and the estimated probability of a spike through Gibbs sampling (blue): Pretty good right?