Curation Tutorial

After spike sorting and computing quality metrics, you can automatically curate the spike sorting output using the quality metrics that you have calculated.

Import the modules and/or functions necessary from spikeinterface

import spikeinterface.core as si

Let’s generate a simulated dataset, and imagine that the ground-truth sorting is in fact the output of a sorter.

recording, sorting = si.generate_ground_truth_recording()
print(recording)
print(sorting)
GroundTruthRecording (InjectTemplatesRecording): 4 channels - 25.0kHz - 1 segments
                      250,000 samples - 10.00s - float32 dtype - 3.81 MiB
GroundTruthSorting (NumpySorting): 10 units - 1 segments - 25.0kHz

Create SortingAnalyzer

For this example, we will need a SortingAnalyzer and some extensions to be computed first

analyzer = si.create_sorting_analyzer(sorting=sorting, recording=recording, format="memory")
analyzer.compute(["random_spikes", "waveforms", "templates", "noise_levels"])

analyzer.compute("principal_components", n_components=3, mode="by_channel_local")
print(analyzer)
estimate_sparsity (no parallelization):   0%|          | 0/10 [00:00<?, ?it/s]
estimate_sparsity (no parallelization): 100%|██████████| 10/10 [00:00<00:00, 413.80it/s]

compute_waveforms (no parallelization):   0%|          | 0/10 [00:00<?, ?it/s]
compute_waveforms (no parallelization): 100%|██████████| 10/10 [00:00<00:00, 297.93it/s]

noise_level (no parallelization):   0%|          | 0/20 [00:00<?, ?it/s]
noise_level (no parallelization): 100%|██████████| 20/20 [00:00<00:00, 247.98it/s]

Fitting PCA:   0%|          | 0/10 [00:00<?, ?it/s]
Fitting PCA: 100%|██████████| 10/10 [00:00<00:00, 128.43it/s]

Projecting waveforms:   0%|          | 0/10 [00:00<?, ?it/s]
Projecting waveforms: 100%|██████████| 10/10 [00:00<00:00, 1238.87it/s]
SortingAnalyzer: 4 channels - 10 units - 1 segments - memory - sparse - has recording
Loaded 5 extensions: random_spikes, waveforms, templates, noise_levels, principal_components

Then we compute some quality metrics:

metrics_ext = analyzer.compute("quality_metrics", metric_names=["snr", "isi_violation", "nearest_neighbor"])
metrics = metrics_ext.get_data()
print(metrics)
         snr  isi_violations_ratio  ...  nn_hit_rate  nn_miss_rate
0  40.423687                   0.0  ...     0.943262      0.004283
1  31.630341                   0.0  ...     0.930952      0.007132
2  34.230616                   0.0  ...     0.886897      0.012026
3   8.423454                   0.0  ...     0.677372      0.032598
4   1.692318                   0.0  ...     0.803390      0.017729
5  10.022086                   0.0  ...     0.767470      0.031831
6  34.510001                   0.0  ...     0.943624      0.007466
7  13.812330                   0.0  ...     0.816667      0.020601
8  24.762407                   0.0  ...     0.865432      0.015362
9   9.400082                   0.0  ...     0.739869      0.031102

[10 rows x 5 columns]

We can now threshold each quality metric and select units based on some rules.

The easiest and most intuitive way is to use boolean masking with a dataframe.

Then create a list of unit ids that we want to keep

keep_mask = (metrics["snr"] > 7.5) & (metrics["isi_violations_ratio"] < 0.2) & (metrics["nn_hit_rate"] > 0.80)
print(keep_mask)

keep_unit_ids = keep_mask[keep_mask].index.values
keep_unit_ids = [unit_id for unit_id in keep_unit_ids]
print(keep_unit_ids)
0     True
1     True
2     True
3    False
4    False
5    False
6     True
7     True
8     True
9    False
dtype: bool
['0', '1', '2', '6', '7', '8']

And now let’s create a sorting that contains only curated units and save it.

curated_sorting = sorting.select_units(keep_unit_ids)
print(curated_sorting)


curated_sorting.save(folder="curated_sorting", overwrite=True)
GroundTruthSorting (UnitsSelectionSorting): 6 units - 1 segments - 25.0kHz
NumpyFolder (NumpyFolderSorting): 6 units - 1 segments - 25.0kHz
Unit IDs
    ['0' '1' '2' '6' '7' '8']
Annotations
  • name : GroundTruthSorting
Properties
    gt_unit_locations[[ 2.8220482e+00 -7.8601375e+00 5.5757146e+00] [ 2.5794661e+01 1.0354284e+01 6.2242894e+00] [-2.0695219e-02 2.2856422e+01 5.6213870e+00] [ 1.8962755e+01 -6.6455288e+00 2.0228270e+01] [ 2.7098755e+01 1.4829044e+01 2.6593056e+01] [-6.5016389e+00 1.6009506e+01 2.6310440e+01]]


We can also save the analyzer with only theses units

clean_analyzer = analyzer.select_units(unit_ids=keep_unit_ids, format="zarr", folder="clean_analyzer")

print(clean_analyzer)
SortingAnalyzer: 4 channels - 6 units - 1 segments - zarr - sparse - has recording
Loaded 6 extensions: random_spikes, waveforms, templates, noise_levels, principal_components, quality_metrics

Total running time of the script: (0 minutes 0.647 seconds)

Gallery generated by Sphinx-Gallery