
zscore fails when using whitenoise as an input. #1972

Open
h-mayorquin opened this issue Sep 6, 2023 · 4 comments
Labels
bug (Something isn't working), preprocessing (Related to preprocessing module)

Comments

h-mayorquin (Collaborator) commented Sep 6, 2023

Related to this #1971:

Bug:

import numpy as np

from spikeinterface.preprocessing import normalize_by_quantile, scale, center, zscore
from spikeinterface.core import generate_recording

seed = 1
rec = generate_recording(seed=seed, mode="lazy")
rec_int = scale(rec, dtype="int16", gain=100)

zscore_recording = zscore(rec_int, dtype="int16", int_scale=256, mode="mean+std", seed=seed)
traces = zscore_recording.get_traces(segment_index=0)
trace_mean = np.mean(traces, axis=0)
trace_std = np.std(traces, axis=0)
assert np.all(np.abs(trace_mean) < 1)
assert np.all(np.abs(trace_std - 256) < 1)

For most seeds this fails when using mode="lazy".

@samuelgarcia you implemented the int_scale option for z-scores; any idea why this might be?

h-mayorquin added the bug and preprocessing labels on Sep 6, 2023
h-mayorquin (Collaborator, Author) commented:

This functionality was implemented here:
#1437

alejoe91 (Member) commented Sep 7, 2023

Interesting! I'll also take a look! Thanks for tracking this down @h-mayorquin

h-mayorquin (Collaborator, Author) commented Sep 7, 2023

I thought more about this this morning: it is just a statistical problem with the test, not a bug.
The thing is that when you scale your data you also scale its standard deviation, which in turn increases the standard error of the mean:

$ SE = \frac{\sigma}{\sqrt{n}} $

You then need a larger n. If you compute the standard error of the mean for the number of samples in the available chunks, it is not small enough for the test above to be sound.
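As a rough check (assuming an estimate based on roughly 50_000 samples, as in the simulation below — the actual chunk sizes may differ), the standard error at int_scale = 256 already exceeds the test tolerance of 1:

```python
import math

# Standard error of the mean for data scaled to a standard deviation of 256
# (the int_scale used in the failing test). n = 50_000 is an assumed sample
# count matching the simulation further down, not the actual chunk size.
sigma = 256
n = 50_000
se = sigma / math.sqrt(n)
print(se)  # ≈ 1.14, already above the test's tolerance of 1
```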

Plus, casting to int is not a linear function (it behaves like floor for positives and like ceiling for negatives) which also increases the standard deviation of the samples and will make the standard error larger in practice.
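For example, NumPy's cast to an integer dtype truncates toward zero:

```python
import numpy as np

# astype to an integer dtype truncates toward zero: it acts like floor
# for positive values and like ceiling for negative values.
x = np.array([1.9, 0.6, -0.6, -1.9])
print(x.astype("int16"))  # [ 1  0  0 -1]
```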

Here I estimate the mean 1_000 times with 50_000 samples each. You can see that many of the estimates are farther than 1 from the true mean (which is zero) and therefore fail the test above.

import numpy as np
import matplotlib.pyplot as plt

# Scale first, then cast to int16 (as the zscore int_scale path does),
# and take the mean of each column of 50_000 samples.
estimates = (256 * np.random.randn(int(5e4), 1000)).astype("int16").mean(axis=0)
plt.hist(estimates, bins=100)
plt.xlabel("Estimated mean from the process")
plt.ylabel("Frequency")
plt.show()

[Histogram of the 1_000 estimated means]

h-mayorquin (Collaborator, Author) commented Sep 7, 2023

I would like us to shut down this test and slowly deprecate the int-range functionality in ZScoreRecording so we can confine this complexity to kilosort only. Sam probably tested this thoroughly when he was implementing the kilosort support, and since this is an ad-hoc processing step for them we don't need to test it in general.

Within the kilosort pipeline, or anywhere else the need for an operation like this arises, it can be performed by composing the casting and scaling operations already available in the preprocessing module. No need for flags.
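A sketch of that composition in plain NumPy (illustrative only, not the actual spikeinterface calls; the array and variable names here are assumptions):

```python
import numpy as np

# Illustrative data; in practice these would be the recording's traces.
rng = np.random.default_rng(0)
traces = rng.normal(loc=50.0, scale=10.0, size=(50_000, 4))

# Step 1: z-score per channel (what zscore(rec) computes).
z = (traces - traces.mean(axis=0)) / traces.std(axis=0)

# Step 2: explicit gain + integer cast, expressed as its own step
# (roughly what scale(rec, gain=256, dtype="int16") would do).
scaled = (z * 256).astype("int16")
```

Keeping the two steps separate makes the integer quantization an explicit, composable choice rather than a flag hidden inside the z-score step.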

Similar to previous discussions:
#1815 (comment)
#914
