scispace - formally typeset
Open AccessPosted ContentDOI

Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels

Reads0
Chats0
TLDR
Kilosort models the recorded voltage as a sum of template waveforms triggered on the spike times, allowing overlapping spikes to be identified and resolved and is an important step towards fully automated spike sorting of multichannel electrode recordings.
Abstract
Advances in silicon probe technology mean that in vivo electrophysiological recordings from hundreds of channels will soon become commonplace. To interpret these recordings we need fast, scalable and accurate methods for spike sorting, whose output requires minimal time for manual curation. Here we introduce Kilosort, a spike sorting framework that meets these criteria, and show that it allows rapid and accurate sorting of large-scale in vivo data. Kilosort models the recorded voltage as a sum of template waveforms triggered on the spike times, allowing overlapping spikes to be identified and resolved. Rapid processing is achieved thanks to a novel low-dimensional approximation for the spatiotemporal distribution of each template, and to batch-based optimization on GPUs. A novel post-clustering merging step based on the continuity of the templates substantially reduces the requirement for subsequent manual curation operations. We compare Kilosort to an established algorithm on data obtained from 384-channel electrodes, and show superior performance, at much reduced processing times. Data from 384-channel electrode arrays can be processed in approximately realtime. Kilosort is an important step towards fully automated spike sorting of multichannel electrode recordings, and is freely available github.com/cortex-lab/Kilosort.

read more

Content maybe subject to copyright    Report

Kilosort: realtime spike-sorting for extracellular electrophysiology
with hundreds of channels
Marius Pachitariu
1,2
, Nicholas Steinmetz
1,2
, Shabnam Kadir
1,2
, Matteo Carandini
3
and
Kenneth D. Harris
1,2
1 UCL Institute of Neurology, London WC1E 6DE, United Kingdom.
2 UCL Department of Neuroscience, Physiology, and Pharmacology, London WC1E 6DE,
United Kingdom.
3 UCL Institute of Ophthalmology, London EC1V 9EL, United Kingdom.
* Correspondence to marius.pachitariu.10@ucl.ac.uk.
Abstract
Advances in silicon probe technology mean that in vivo electrophysiological recordings from hun-
dreds of channels will soon become commonplace. To interpret these recordings we need fast,
scalable and accurate methods for spike sorting, whose output requires minimal time for manual
curation. Here we introduce Kilosort, a spike sorting framework that meets these criteria, and show
that it allows rapid and accurate sorting of large-scale in vivo data. Kilosort models the recorded
voltage as a sum of template waveforms triggered on the spike times, allowing overlapping spikes
to be identified and resolved. Rapid processing is achieved thanks to a novel low-dimensional ap-
proximation for the spatiotemporal distribution of each template, and to batch-based optimization
on GPUs. A novel post-clustering merging step based on the continuity of the templates substan-
tially reduces the requirement for subsequent manual curation operations. We compare Kilosort to
an established algorithm on data obtained from 384-channel electrodes, and show superior perfor-
mance, at much reduced processing times. Data from 384-channel electrode arrays can be pro-
cessed in approximately realtime. Kilosort is an important step towards fully automated spike sort-
ing of multichannel electrode recordings, and is freely available (github.com/cortex-lab/Kilosort).
1 Introduction
The oldest and most reliable method for recording neural activity involves lowering an electrode
into the brain and recording the local electrical activity around the electrode tip. Action potentials
of single neurons generate a stereotypical temporal deflection of the voltage, known as a spike
waveform. When multiple neurons close to the electrode fire action potentials, their spikes must
be identified and assigned to the correct neuron based on the features of the recorded waveforms,
a process known as spike sorting
1–15
.
Measuring voltage at multiple closely-space sites in the extracellular medium can substantially
improve spike sorting accuracy. In this case, the recorded waveforms also have characteristic
spatial shapes, determined by each neuron’s location and physiological characteristics. Together,
the spatial and temporal shape of the waveform provide all the information that can be used to
assign a given spike to a neuron
16
.
Current methods for spike sorting, however, will struggle to meet the requirements raised by a
new generation of high-count, high-density electrodes that are soon to become commonplace.
These electrodes have several hundred closely-spaced recording sites
17
, and initial tests suggest
1
.CC-BY-NC-ND 4.0 International licensea
certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted June 30, 2016. ; https://doi.org/10.1101/061481doi: bioRxiv preprint

500 1000 1500 2000 2500 3000 3500 4000 4500 5000
samples (25kHz)
20
40
60
80
100
120
channels
a
b
c
correlation of channel noise
20 40 60 80 100 120
channels
20
40
60
80
100
120
channels
Figure 1. Data from high-channel count recordings. a, High-pass filtered and channel-whitened data. Negative
peaks are action potentials. b, Example mean waveforms, centered on their peaks. c, Example cross-correlation
matrix across channels (before whitening, no spikes included).
that they can reveal the activity of 100 to 1,000 neurons firing tens of millions of spikes. When
applied to such data, algorithms designed for tens of recording sites
18,19
suffer from substantial
limitations. The automatic sorting software can take days to weeks to run, and require hours
to days of manual curation. Furthermore, with more channels and higher density, resolution of
spatiotemporally overlapping spikes becomes both more tractable and more important.
Here we overcome these limitations and present Kilosort, a new algorithm which takes advan-
tage of a novel mathematical approach that greatly reduces the amount of calculation required,
together with the computing capabilities of low-cost commercially available graphics processing
units (GPUs). To illustrate its abilities we show that it accurately spike sorts the output of 384-
channel dense probes in approximately real time.
1.1 High-density electrophysiology and structured sources of noise
With high-density neural probes (i.e. site spacing in the range 20 µm), the waveforms of each
neuron can be typically detected on 5 to 50 channels simultaneously (Fig. 1a,b; example data
available at http://data.cortexlab.net/dualPhase3). This provides a substantial amount of informa-
tion per spike, but because other neurons also fire on the same channels, a clustering algorithm
is required to unmix the signals and assign spikes to the correct neuron. Furthermore, structured
sources of noise can make this assignment more difficult. For example, neurons that are too dis-
tant from the electrode to be sortable provide myriad superimposed spike waveforms, a continuous
random background against which the features of sortable spikes must be distinguished
16
.
2
.CC-BY-NC-ND 4.0 International licensea
certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted June 30, 2016. ; https://doi.org/10.1101/061481doi: bioRxiv preprint

1.2 Previous work
A traditional approach to spike sorting divides the problem into several stages
2
. First, spike times
are detected, for example as times when the negative voltage crosses a pre-defined threshold.
Second, these spike waveforms are extracted and projected into a common low-dimensional
space, typically obtained by principal component analysis (PCA
20
. Third, the spikes are clustered
in this low-dimensional space using a variety of approaches, such as mixtures of Gaussians
18
or peak-density detection
21
. Some algorithms also include a fourth stage of template matching
that scans the raw data for overlapping spikes, which may have been missed in the first detection
phase
11,12,14
. Finally, a manual curation stage is required, in which a human operator corrects the
imperfect automated results using a graphical user interface (GUI). This last step is particularly
necessary for recordings subject to electrode drift, where the waveforms of a given neuron vary
over time and may be assigned to multiple clusters.
Here we describe a system that omits spike detection and PCA and instead combines the identi-
fication of template waveforms and associated spike times in a single unified model. This model
seeks to reconstruct the entire raw voltage dataset with the templates of candidate neurons. We
define a cost function for this reconstruction, and derive approximate inference and learning algo-
rithms that can be successfully applied to very large channel count data. This approach is related
to a previous method
6
, but that method requires a generic convex optimization that is slow for
recordings with large numbers of channels.
As we demonstrate with constructed ground-truth datasets, our system is more accurate than
a current widely-used method
18
. Furthermore, we demonstrate that on real datasets with 384
channels, this implementation is fast enough to run in nearly real time.
2 Model formulation
We start with a generative model of the raw electrical voltage. Unlike the traditional pipeline, this
algorithm does not start with a spike detection step, nor project the spike waveforms to a lower-
dimensional PCA space. As we show below, both of these steps would discard potentially useful
information.
2.1 Pre-processing: common average referencing, temporal filtering and spatial whitening
To remove low-frequency fluctuations, such as the local field potential, we high-pass filter each
channel of the raw data at 300 Hz. To diminish the effect of artifacts shared across all channels,
we subtract at each timepoint the median of the signal across all recording sites, an operation
known as common average referencing
22
. This step is best performed after high-pass filtering,
because the LFP magnitude is variable across channels and can be comparable in size to the
artifacts.
Next, we whiten the data across channels to remove correlated noise. In the frequency range
typical of spikes, spatially correlated noise arises primarily from neurons far from the probe, whose
spikes are too small to sort directly
16,23
and have a large spatial spread over the surface of the
probe. Since there are many such neurons, their noise averages out to have a stereotypical cross-
correlation pattern across channels (Fig. 1c). To estimate this noise covariance, we first remove
3
.CC-BY-NC-ND 4.0 International licensea
certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted June 30, 2016. ; https://doi.org/10.1101/061481doi: bioRxiv preprint

raw waveform temporal PC (common) spatiotemporal PC (private)
10
1
10
2
10
3
temporal PC (common)
10
1
10
2
10
3
spatiotemporal PC (private)
residual waveform variance
a b
Figure 2. Spike reconstruction from three private PCs. a, Four example average waveforms (black) with their
respective reconstruction from three common temporal PCs/channel (blue), and with reconstruction from three spa-
tiotemporal PCs private to each spike (red). The red traces mostly overlap the black traces. b, Summary of residual
waveform variance for all neurons in one dataset.
the times of putative spikes (detected with a threshold criterion). We then estimate the covariance
matrix Σ, and use its singular vectors and singular values E, D to obtain a symmetrical whitening
matrix that maintains the spatial structure of the data, known as zero-phase component analysis
(ZCA): W
ZCA
= Σ
1/2
= ED
1/2
E
T
. To regularize D, we add a small value to its diagonal. For
very large channel counts, estimation of the full covariance matrix Σ is noisy, and we therefore
compute the columns of the whitening matrix W
ZCA
independently for each channel, based on
its nearest 32 neighbors. We then multiply the raw data matrix containing all channels with this
whitening matrix.
2.2 Modelling mean spike waveforms with SVD
When single spike waveforms are recorded across a large number of channels, most channels
will have no signal and only noise. To prevent the large total energy on these many noise chan-
nels from swamping the signal present on a the smaller number of signal channels, previous
approaches have estimated a “mask” to exclude channels with insufficient SNR to identify any
given spike
18,19
; to further reduce noise and lower dimensionality, the spikes are usually projected
into a small number of temporal principal components per channel
20
, typically three.
Here we introduce a different method for simultaneous spatial denoising/masking and for lower-
ing the dimensionality of spikes. This method is based on the observation that any mean spike
waveforms can be well explained by a singular value decomposition (SVD) decomposition of its
spatiotemporal waveform, with as few as three components, but that the spatial and temporal
components required can vary substantially between neurons (Fig. 2a). This approach of tailoring
“private PCs” to each spike allows us to fit the spikes with 5 times less residual variance than
the standard approach of applying a single PCA approximation per channel, to all neurons on that
channel (Fig. 2b). This decomposition results in an automated masking strategy, which allows the
waveforms to be denoised and irrelevant channels ignored, and also speeds up the algorithm, by
allowing the use of standard low-rank filtering techniques (see below).
4
.CC-BY-NC-ND 4.0 International licensea
certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted June 30, 2016. ; https://doi.org/10.1101/061481doi: bioRxiv preprint

2.3 Integrated template matching framework
To define a generative model of the electrical recorded voltage, we take advantage of the approxi-
mately linear summation of electrical potentials from different sources in the extracellular medium.
We combine the spike times of all neurons into a N
spikes
-dimensional vector s, such that the wave-
forms start at time samples s + 1. We define the cluster identity of spike k as σ(k), taking values
into the set {1, 2, 3, ..., N}, where N is the total number of neurons. We represent the normal-
ized waveform of neuron n as the matrix K
n
of size number of channels by number of sample
timepoints t
s
(typically 61). The matrix K
n
is approximated by a three-dimensional singular value
decomposition, K
n
= U
n
W
n
, whereby K
n
is deconstructed into three pairs of spatial and temporal
basis functions, U
n
and W
n
, such that the norm of U
n
W
n
is 1. The value of the electrical voltage
at time t on channel i is modeled by
V (i, t) = V
0
(i, t) + N (0, )
where the noise is modelled as independent Gaussian of variance .V
0
(I, t) is defined as
V
0
(i, t) =
s(k)tt
s
X
k,s(k)<t
x
k
K
σ(k)
(i, t s(k)) (1)
where the index k picks out those spikes that overlap with the timepoint t, because they happen
at nearby times s(k), and x
k
> 0 is the amplitude of spike k, further constrained by
x
k
N
µ
σ(k)
, λµ
2
σ(k)
.
This last equation models variations in spike amplitudes for spikes from the same neuron, due
to factors like burst adaptation and drift. We modelled this variability with a Gaussian distribution
whose variance scales with the square of its mean, to capture the fact that the spikes of neurons
closer to the probe vary in relative, not absolute amplitude. λ and are hyperparameters that
control the relative scaling with respect to each other of the reconstruction error and the prior on
the amplitude.
This model formulation leads to the following cost function, which we minimize with respect to
spike times s, spike amplitudes x, templates K, and cluster assignments σ:
L(s, x, K, σ) = kV V
0
k
2
+
λ
X
k
x
k
µ
σk
1
2
(2)
The second term in this expression has the purpose of limiting the number of spikes that are
assigned amplitudes that deviate strongly from the mean of the relevant cluster. It is scaled by the
ratio
λ
, which we usually set to a constant between 1 and 10.
3 Learning and inference in the model
To optimize the cost function, we first initialize the templates and then alternate between two steps:
finding the best spike times s, cluster assignments σ, and amplitudes x (template matching); and
optimizing the template waveforms K for a given s, σ, x (template optimization). After the final spike
times and amplitudes have been extracted, we run a final post-optimization merging algorithm
which finds pairs of clusters whose spikes form a single continuous density. These steps are
described in detail below.
5
.CC-BY-NC-ND 4.0 International licensea
certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted June 30, 2016. ; https://doi.org/10.1101/061481doi: bioRxiv preprint

Citations
More filters
Journal ArticleDOI

Single-trial neural dynamics are dominated by richly varied movements

TL;DR: This paper found that neurons with similar trial-averaged activity often reflected different combinations of cognitive and movement variables, accounting for trial-by-trial fluctuations that are often considered 'noise'.
Journal ArticleDOI

Open Ephys: an open-source, plugin-based platform for multichannel electrophysiology.

TL;DR: The Open Ephys GUI is the first widely used application for multichannel electrophysiology that leverages a plugin-based workflow, and it is hoped that it will lower the barrier to entry for Electrophysiologists who wish to incorporate real-time feedback into their research.
Journal ArticleDOI

Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings

TL;DR: A suite of electrophysiological tools comprising a miniaturized high-density probe, recoverable chronic implant fixtures, and algorithms for automatic post hoc motion correction are demonstrated, enabling an order-of-magnitude increase in the number of sites that can be recorded in small animals, such as mice, and the ability to record from them stably over long time scales.
Journal ArticleDOI

Genetic Dissection of Neural Circuits: A Decade of Progress.

TL;DR: Evaluated methods allow systematic classification of cell types and provide genetic access to diverse neuronal types for studies of connectivity and neural coding during behavior in invertebrate and vertebrate organisms.
Journal ArticleDOI

A Fully Automated Approach to Spike Sorting

TL;DR: An automated clustering approach and associated software package that has the potential to enable reproducible and automated spike sorting of larger scale recordings than is currently possible and has accuracy comparable to or exceeding that achieved using manual or semi-manual techniques.
References
More filters
Journal ArticleDOI

Matching pursuits with time-frequency dictionaries

TL;DR: The authors introduce an algorithm, called matching pursuit, that decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions, chosen in order to best match the signal structures.
Journal ArticleDOI

Clustering by fast search and find of density peaks

TL;DR: A method in which the cluster centers are recognized as local density maxima that are far away from any points of higher density, and the algorithm depends only on the relative densities rather than their absolute values.
Proceedings Article

An analysis of single-layer networks in unsupervised feature learning

TL;DR: In this paper, the authors show that the number of hidden nodes in the model may be more important to achieving high performance than the learning algorithm or the depth of the model, and they apply several othe-shelf feature learning algorithms (sparse auto-encoders, sparse RBMs, K-means clustering, and Gaussian mixtures) to CIFAR, NORB, and STL datasets using only single-layer networks.
Journal ArticleDOI

Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering

TL;DR: A new method for detecting and sorting spikes from multiunit recordings that combines the wave let transform with super paramagnetic clustering, which allows automatic classification of the data without assumptions such as low variance or gaussian distributions is introduced.
Journal ArticleDOI

Large-scale recording of neuronal ensembles

TL;DR: Large-scale recordings from neuronal ensembles now offer the opportunity to test competing theoretical frameworks and require further development of the neuron–electrode interface, automated and efficient spike-sorting algorithms for effective isolation and identification of single neurons, and new mathematical insights for the analysis of network properties.
Related Papers (5)
Frequently Asked Questions (11)
Q1. What are the contributions mentioned in the paper "Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels" ?

Here the authors introduce Kilosort, a spike sorting framework that meets these criteria, and show that it allows rapid and accurate sorting of large-scale in vivo data. The authors compare Kilosort to an established algorithm on data obtained from 384-channel electrodes, and show superior performance, at much reduced processing times. 

The time taken to run Kilosort scales linearly with the number of recorded neurons, rather than the number of channels, due to the low-dimensional parametrization of template waveforms. 

The main loop alternating template matching and inference is run until the cost function approaches convergence (typically less than six full passes through the data). 

The oldest and most reliable method for recording neural activity involves lowering an electrode into the brain and recording the local electrical activity around the electrode tip. 

The authors also anneal from small to large the ratio /λ, which controls the relative impact of the reconstruction term and amplitude bias term in equation 2; therefore, at the beginning of the optimization, spikes assigned to the same cluster are allowed to have more variable amplitudes. 

To estimate the best achievable score after operator merges, the authors took advantage ofthe ground truth data, and automatically merged together candidate clusters so as to greedily maximize their score. 

Since firing rates vary over two orders of magnitude in typical recordings (from < 0.5 to 50 spikes/s), this adaptive running average procedure allows clusters with low firing rates to nonetheless average enough of their spikes to generate a smooth running-average template. 

After processing every hundred batches (or more, depending on their time length), the templates are obtained from the running average waveform 

To define a generative model of the electrical recorded voltage, the authors take advantage of the approximately linear summation of electrical potentials from different sources in the extracellular medium. 

To avoid increasing the spike density at any location on the probe, the authors also subtracted off the denoised waveform from its original location. 

This method is based on the observation that any mean spike waveforms can be well explained by a singular value decomposition (SVD) decomposition of its spatiotemporal waveform, with as few as three components, but that the spatial and temporal components required can vary substantially between neurons (Fig. 2a).