Curation Tutorial¶

After spike sorting and computing quality metrics, you can automatically curate the spike sorting output using the quality metrics that you have calculated.

Import the modules and/or functions necessary from spikeinterface

import spikeinterface.core as si
import spikeinterface.extractors as se

from spikeinterface.postprocessing import compute_principal_components
from spikeinterface.qualitymetrics import compute_quality_metrics

Let’s download a simulated dataset from the repo ‘https://gin.g-node.org/NeuralEnsemble/ephy_testing_data’

Let’s imagine that the ground-truth sorting is in fact the output of a sorter.

local_path = si.download_dataset(remote_path="mearec/mearec_test_10s.h5")
recording, sorting = se.read_mearec(file_path=local_path)
print(recording)
print(sorting)

MEArecRecordingExtractor: 32 channels - 32.0kHz - 1 segments - 320,000 samples - 10.00s
                          float32 dtype - 39.06 MiB
  file_path: /home/docs/spikeinterface_datasets/ephy_testing_data/mearec/mearec_test_10s.h5
MEArecSortingExtractor: 10 units - 1 segments - 32.0kHz
  file_path: /home/docs/spikeinterface_datasets/ephy_testing_data/mearec/mearec_test_10s.h5

Create SortingAnalyzer¶

For this example, we will need a SortingAnalyzer and some extensions to be computed first

analyzer = si.create_sorting_analyzer(sorting=sorting, recording=recording, format="memory")
analyzer.compute(["random_spikes", "waveforms", "templates", "noise_levels"])

analyzer.compute("principal_components", n_components=3, mode="by_channel_local")
print(analyzer)

/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/job_tools.py:103: UserWarning: `n_jobs` is not set so parallel processing is disabled! To speed up computations, it is recommended to set n_jobs either globally (with the `spikeinterface.set_global_job_kwargs()` function) or locally (with the `n_jobs` argument). Use `spikeinterface.set_global_job_kwargs?` for more information about job_kwargs.
  warnings.warn(

estimate_sparsity:   0%|          | 0/10 [00:00<?, ?it/s]
estimate_sparsity: 100%|##########| 10/10 [00:00<00:00, 1005.15it/s]

compute_waveforms:   0%|          | 0/10 [00:00<?, ?it/s]
compute_waveforms: 100%|##########| 10/10 [00:00<00:00, 328.38it/s]

Fitting PCA:   0%|          | 0/10 [00:00<?, ?it/s]
Fitting PCA:  30%|███       | 3/10 [00:00<00:00, 29.22it/s]
Fitting PCA:  60%|██████    | 6/10 [00:00<00:00, 28.98it/s]
Fitting PCA:  90%|█████████ | 9/10 [00:00<00:00, 24.78it/s]
Fitting PCA: 100%|██████████| 10/10 [00:00<00:00, 23.09it/s]

Projecting waveforms:   0%|          | 0/10 [00:00<?, ?it/s]
Projecting waveforms: 100%|██████████| 10/10 [00:00<00:00, 204.09it/s]
SortingAnalyzer: 32 channels - 10 units - 1 segments - memory - sparse - has recording
Loaded 5 extensions: random_spikes, waveforms, templates, noise_levels, principal_components

Then we compute some quality metrics:

metrics = compute_quality_metrics(analyzer, metric_names=["snr", "isi_violation", "nearest_neighbor"])
print(metrics)

/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/job_tools.py:103: UserWarning: `n_jobs` is not set so parallel processing is disabled! To speed up computations, it is recommended to set n_jobs either globally (with the `spikeinterface.set_global_job_kwargs()` function) or locally (with the `n_jobs` argument). Use `spikeinterface.set_global_job_kwargs?` for more information about job_kwargs.
  warnings.warn(

calculate pc_metrics:   0%|          | 0/10 [00:00<?, ?it/s]
calculate pc_metrics:  50%|█████     | 5/10 [00:00<00:00, 49.74it/s]
calculate pc_metrics: 100%|██████████| 10/10 [00:00<00:00, 67.01it/s]
          snr  isi_violations_ratio  ...  nn_hit_rate  nn_miss_rate
#0   24.06462                     0  ...          1.0      0.001289
#1  25.655051                     0  ...         0.99      0.000744
#2    13.9052                     0  ...     0.976744      0.005831
#3  21.825417                     0  ...          1.0           0.0
#4   7.465719                     0  ...     0.989583       0.00101
#5   7.408548                     0  ...     0.993243      0.002653
#6  20.917828                     0  ...     0.995098           0.0
#7   7.396322                     0  ...     0.986486      0.010753
#8    8.01203                     0  ...     0.989744      0.001506
#9   9.053889                     0  ...     0.996124      0.003348

[10 rows x 5 columns]

We can now threshold each quality metric and select units based on some rules.

The easiest and most intuitive way is to use boolean masking with a dataframe.

Then create a list of unit ids that we want to keep

keep_mask = (metrics["snr"] > 7.5) & (metrics["isi_violations_ratio"] < 0.2) & (metrics["nn_hit_rate"] > 0.90)
print(keep_mask)

keep_unit_ids = keep_mask[keep_mask].index.values
keep_unit_ids = [unit_id for unit_id in keep_unit_ids]
print(keep_unit_ids)

#0     True
#1     True
#2     True
#3     True
#4    False
#5    False
#6     True
#7    False
#8     True
#9     True
dtype: boolean
['#0', '#1', '#2', '#3', '#6', '#8', '#9']

And now let’s create a sorting that contains only curated units and save it.

curated_sorting = sorting.select_units(keep_unit_ids)
print(curated_sorting)


curated_sorting.save(folder="curated_sorting")

UnitsSelectionSorting: 7 units - 1 segments - 32.0kHz

NumpyFolder: 7 units - 1 segments - 32.0kHz

Unit IDs

['#0' '#1' '#2' '#3' '#6' '#8' '#9']

Annotations

Unit Properties

We can also save the analyzer with only theses units

clean_analyzer = analyzer.select_units(unit_ids=keep_unit_ids, format="zarr", folder="clean_analyzer")

print(clean_analyzer)

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/examples/tutorials/qualitymetrics/plot_4_curation.py", line 78, in <module>
    clean_analyzer = analyzer.select_units(unit_ids=keep_unit_ids, format="zarr", folder="clean_analyzer")
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/sortinganalyzer.py", line 894, in select_units
    return self._save_or_select_or_merge(format=format, folder=folder, unit_ids=unit_ids)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/sortinganalyzer.py", line 827, in _save_or_select_or_merge
    new_sorting_analyzer.extensions[extension_name] = extension.copy(
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/sortinganalyzer.py", line 2015, in copy
    new_extension.save()
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/sortinganalyzer.py", line 2060, in save
    self._save_data(**kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/checkouts/0.101.1/src/spikeinterface/core/sortinganalyzer.py", line 2127, in _save_data
    df_group.create_dataset(name="index", data=ext_data.index.to_numpy())
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/hierarchy.py", line 1111, in create_dataset
    return self._write_op(self._create_dataset_nosync, name, **kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/hierarchy.py", line 952, in _write_op
    return f(*args, **kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/hierarchy.py", line 1126, in _create_dataset_nosync
    a = array(data, store=self._store, path=path, chunk_store=self._chunk_store, **kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/creation.py", line 441, in array
    z = create(**kwargs)
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/creation.py", line 209, in create
    init_array(
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/storage.py", line 454, in init_array
    _init_array_metadata(
  File "/home/docs/checkouts/readthedocs.org/user_builds/spikeinterface/conda/0.101.1/lib/python3.10/site-packages/zarr/storage.py", line 583, in _init_array_metadata
    raise ValueError("missing object_codec for object array")
ValueError: missing object_codec for object array

Total running time of the script: (0 minutes 1.528 seconds)

Gallery generated by Sphinx-Gallery