-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
in whiten.py, eps is too large for some datasets #2064
Comments
Are the traces in volte? Another option is to scale them up with a |
Not sure this is good to have epsilon too low. |
Thanks @alejoe91 Alessio. I have added a note to the docs here: that says: # Note that if the recording traces are of float type, you may need to scale
# it to a reasonable voltage range in order for whitening to work properly
# recording = spre.scale(recording, gain=...) But it's unfortunate that this is necessary, because I need to include that blurb for every example, and it's not always easy for users to know whether they need a scale factor. So it would be great if you could consider changing this to 1e-16 unless you forsee any trouble with that. |
The eps could also be automatically calculated from the data. For example, |
I think that you could also do it in MS5 directly, right? When it comes to preprocessing, in SpikeInterface we generally assume that you're either dealing with non-scaled int16 or uV traces, both of which will work with 1e-6. My worry is that 1e-16 could be too low as an epsilon. Could it? |
It wouldn't quite work, because whitening will have already failed in more ways than just incorrect scaling of the channels.
Yeah, that makes sense.
Possibly too low, yeah. I think the best solution could be to choose epsilon based on the data. Something like # proposal
median_sqr = np.median(data ** 2)
eps = min(1-e6, max(1e-16, median_sqr * 1e-6)) For typical voltage range examples, this would end up being the same 1-e6 |
Instead of median we should probably use the mad or std no? Median could very well be 0 and say nothing about the range. Or even a ptp or percentiles |
But yeah I think your solution would work well! |
It's the median of the square, so it would be strictly greater than zero. I chose the square because |
Ah sorry I missed the square! Makes sense then. Can you make a PR? |
Made PR: #2070 |
When the magnitude of the recording traces is very low (e.g. ~ 1e-6 or lower), the eps used in the whiten preprocessing is too large. This results in a whitened signal that is not normalized properly, and detection for mountainsort (and perhaps other algorithms) then fails.
spikeinterface/src/spikeinterface/preprocessing/whiten.py
Lines 70 to 72 in ca48f6b
Here is a dataset for which the raw traces are float and low magnitude
https://dandiarchive.org/dandiset/000463/draft/files?location=sub-BH395
The simplest fix is to change this to 1e-16. Not sure if this will introduce any other issues.
The text was updated successfully, but these errors were encountered: