Template and selfdetection are different #573
The correlation sum will frequently not be exactly equal to the number of channels for a self-detection due to small rounding errors; you should expect it to be close to the number of channels. From your description, however, it sounds like you think there is an issue in how the data are processed. Without access to your bank and catalog it is impossible to verify this. Can you please provide an example that I can test and debug, either using an open online catalog or by providing the necessary data and a complete working example? The difference may stem from how the start and end times are handled, given that your end time is less than 86400 seconds after your start time (and hence not a full day).
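As a rough illustration of that first point, a check like the following (a hypothetical helper, not part of EQcorrscan; the 1% tolerance is an arbitrary choice) compares each detection value to the channel count with a tolerance rather than requiring exact equality:

```python
import numpy as np
from eqcorrscan.core.match_filter import Party


def check_self_detections(party: Party, tol: float = 0.01) -> None:
    """Hypothetical helper: flag detections whose mean per-channel
    correlation is not ~1. `party` is assumed to be the Party returned
    by Tribe.detect; the tolerance is an arbitrary, illustrative choice."""
    for family in party:
        for detection in family:
            mean_cc = abs(detection.detect_val) / detection.no_chans
            # A self-detection should give ~1.0 per channel, but rarely
            # exactly 1.0 because of floating-point rounding.
            if not np.isclose(mean_cc, 1.0, atol=tol):
                print(f"{detection.template_name} at {detection.detect_time}: "
                      f"mean correlation {mean_cc:.3f}")
```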
Sorry for the long wait. I needed to check the specific conditions of my data that were producing the errors in order to reproduce them and explain myself better. Here you can find a data set with one event, a few stations, and waveforms to use to reproduce the problems I had: This includes a script that reproduces the problems, a catalog with one event, the picks of that event, and 2 days of waveforms from one station. I found 3 problems; the first two are related to the issue I described before:

1) Trying to process chunks of exactly 86400 seconds of data with daylong=True
In this instance, the error happens because the data are exactly 86400 seconds long (from 2020-01-08 00:00:00 to 2020-01-09 00:00:00).
This is after trimming out the already processed chunks from the stream. If the last chunk happens to finish exactly where the stream finishes, the last trim removes the last portion of data and the stream becomes an empty stream. The empty stream loses all its attributes, so trying to read anything from it fails. There are different ways I went around this error:
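One possible guard (a minimal sketch of the kind of workaround meant here, not necessarily the fix used in the linked script) is to check whether any data remain before touching the stream's attributes:

```python
from obspy import read

# Minimal sketch using obspy's bundled example data; in the real case `st`
# is the day-long stream and `chunk_end` the end of the last processed chunk.
st = read()
chunk_end = st[0].stats.endtime      # pretend the last chunk used everything
st.trim(starttime=chunk_end + st[0].stats.delta)
if len(st) == 0 or all(tr.stats.npts == 0 for tr in st):
    # Nothing left to process: stop here instead of reading attributes
    # of an empty stream.
    print("Nothing left after trimming - skip further processing")
else:
    print("Remaining data start at", st[0].stats.starttime)
```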
2) Fixing lengths shifts the templates by one sample
This length sanitation drops the first sample but doesn't modify the starttime of the trace, moving the second sample to the starttime and effectively shifting the trace one sample back in time. At the moment I accept the one-sample shift, so my self-detections usually happen 0.02 seconds away from the template. I accept it because, under these conditions, the template and the self-detection traces are exactly the same except for the one-sample shift.

3) Lag-calc with a specific shift_len leads to dropped channels

In the given example with one station, the channels for this station are discarded, leading to a catalog with no picks whatsoever.
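For reference, the one-sample shift described in point 2 can be reproduced with a plain obspy Trace (a minimal sketch; the 50 Hz rate is only chosen to match the 0.02 s interval mentioned above):

```python
import numpy as np
from obspy import Trace, UTCDateTime

# Minimal sketch of the shift described above: 50 Hz gives a 0.02 s
# sample interval.
tr = Trace(data=np.arange(100, dtype=np.float32))
tr.stats.sampling_rate = 50.0
tr.stats.starttime = UTCDateTime(2020, 1, 8)

shifted = tr.copy()
shifted.data = shifted.data[1:]   # drop the first sample ...
# ... but leave starttime untouched: every remaining sample now appears
# 0.02 s earlier in time than it really is.
print(tr.stats.starttime == shifted.stats.starttime)   # True
print(tr.stats.npts, shifted.stats.npts)               # 100 99
```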
Thanks for all your work documenting those issues and sorry once again that you ran into them!
For context, the
Thanks again for your help with this!
It looks like point 3 stems from the way obspy trims/slices data when the selected times do not align with a sample. There isn't an obvious way to handle this the way I want in obspy, so I am working on a patch to EQcorrscan to select the nearest sample before the selected start-time as the start of the cut, then trim to length from there. I'm working on this here: #582 - feel free to pull that branch and test it with your data.
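For reference, the behaviour being described can be seen with a plain obspy Trace (a minimal sketch; the times and sampling rate are arbitrary): two requested start-times only 2 ms apart end up starting on different samples because of nearest-sample snapping.

```python
import numpy as np
from obspy import Trace, UTCDateTime

# Minimal sketch: arbitrary 50 Hz trace (0.02 s sample spacing).
tr = Trace(data=np.arange(500, dtype=np.float32))
tr.stats.sampling_rate = 50.0
tr.stats.starttime = UTCDateTime(2020, 1, 8)
t0 = tr.stats.starttime

# Two requested start-times only 2 ms apart, neither on a sample:
for offset in (1.009, 1.011):
    cut = tr.slice(starttime=t0 + offset, endtime=t0 + offset + 2.0)
    # Default nearest-sample snapping can place the first sample of the
    # cut either before or after the requested time, so cuts requested
    # at almost the same moment can start on different samples.
    print(offset, cut.stats.starttime - t0, cut.stats.npts)
```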
I think #582 should fix your issue with lag-calc. It seems to work for me with your dataset and I have added a test to confirm that these part-sample trims work as expected. I will leave this open for a couple of weeks then close this issue if there isn't a response (unless I forget to close it!).
I just want to remark that I found issue (2) happening in template_gen.download_from_client(), here from line 567:
I am not changing it because I am guessing that, if this is the case for all the download modules, it makes more sense to leave it like this so they are all coherent with each other and no shifting would happen. Maybe it is supposed to be like that, I am not sure. Just wanted to point it out in case it needs a fix.
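As an aside, one way to sanitise lengths without introducing the shift (a hypothetical sketch, not EQcorrscan's code; `fix_length` and `desired_npts` are made-up names) is to drop surplus samples from the end of the trace so the starttime stays truthful:

```python
from obspy import Trace


def fix_length(tr: Trace, desired_npts: int) -> Trace:
    """Hypothetical helper: trim surplus samples from the *end* of a trace.

    Dropping from the end keeps the first sample aligned with
    tr.stats.starttime, so no one-sample shift is introduced.
    """
    if tr.stats.npts > desired_npts:
        tr.data = tr.data[:desired_npts]
    return tr
```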
obsplus version 0.3.0
eqcorrscan version 0.5.0
obspy version 1.4.1
numpy version 1.26.4
After noticing that the summed correlation detection value was not exactly the number of channels (each channel correlation should be 1) for a self-detection, I experimented to find out why. I took one event, built my tribe, ran detections with return_stream=True, processed the returned stream and used it to extract the self-detection stream, and found the following:
- Of 61 traces, 58 have the same values (traces.data) but they are consistently different in length: the templates start one time delta earlier and the self-detections are one sample longer.
To Reproduce
Once I have the Catalog and bank loaded, I do the following:
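The code block from the original report appears to have been lost from this page; a minimal sketch of the workflow described above (file names, processing parameters and thresholds are placeholders, not the reporter's actual values) would look roughly like this:

```python
from obspy import read
from eqcorrscan import Tribe

# Placeholders standing in for the reporter's waveforms and event metadata.
st = read("waveforms_2020-01-08.mseed")

# Build a one-event tribe from a QuakeML file and the raw waveforms.
tribe = Tribe().construct(
    method="from_meta_file", meta_file="event.xml", st=st,
    lowcut=2.0, highcut=8.0, samp_rate=50.0, filt_order=4,
    length=4.0, prepick=0.5, swin="all", process_len=86400)

# Run detection on the same data and keep the processed stream so the
# self-detection can be cut out and compared with the template.
party, processed = tribe.detect(
    stream=st, threshold=8.0, threshold_type="MAD", trig_int=6.0,
    daylong=True, return_stream=True)

# Compare each template trace with the matching slice of the processed
# stream around the self-detection time.
detection = party[0][0]
template = tribe[0]
for tr in template.st:
    matching = processed.select(id=tr.id)
    if matching:
        cut = matching[0].slice(starttime=detection.detect_time,
                                endtime=detection.detect_time + 4.0)
        print(tr.id, tr.stats.npts, cut.stats.npts,
              tr.stats.starttime - cut.stats.starttime)
```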
Expected behavior
I expected the self-detection and the template to be identical, and the correlation-coefficient sum to be exactly the number of channels.
Desktop (please complete the following information):
Additional context
I also noticed that the multi_processing step must have daylong=True for this to somehow work, otherwise the traces would be totally different.