-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catalog_to_dd.py: input and output of dt.cc #516
base: develop
Are you sure you want to change the base?
catalog_to_dd.py: input and output of dt.cc #516
Conversation
Correlations from existing_corr_file added to write_correlations & compute_differential_times. Default None skips new sections of code. 1. existing_corr_file read in with new function: _read_correlation_file returning a existing event pairs dictionary keyed to core event, with list of paired events. 2. existing_pairs dictionary used to remove events paired with core event from sub_catalog so they are not recalculated 3. existing_corr_file read in & output into output file, followed by newly calculated correlations. Added option of output_filename to write_correlations. Default "dt.cc" Works as is, could be more completely implemented. Things to do: - implement checking whether there are stations missing from existing correlations - speed up saving existing correlations output (do I need to write out every line individually?) - sort output by event numbers (interleaving existing with new) - check existing correlations for those which wouldn't be calculated otherwise?
Same changes as previous commit, but made to master version of code, rather than older one I was working with.
Comma error removed
with open(existing_corr_file, "r") as in_f: | ||
for line in in_f: | ||
f.write(line) | ||
## TODO: speed up saving existing correlations output (do I need to write out every line individually?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to just dump everything you can do something like:
with open(blah) as f:
f.write(["\n".join(l for l in in_f])
assuming that your lines don't already have a newline character at the end of them. If they do then you can just ''.join(in_f)
and write that string.
Thanks for this @erhmestel! Can you update your branch to track develop (press the "Update branch" button underneath all the test ticks, then pull the updated branch back to your local computer) please? This looks good, even if stickler doesn't think so. I like using PyCharm for this kind of big project and I set up a ruler at 80 characters long (I forget how to do this, but this SO answer has it. PyCharm is pretty good at removing trailing whitespace for you and you can set up code linters (flake8 is the one run by stickler I think) which will highlight non pep8 friendly code. I would go for cleaning that up, then work on writing a short test that tests this new behaviour. Once we have a test then we can play around with optimising. If there are any edge cases that you can think of that might result in odd behaviour (e.g. when no new events have been added, or if events are in the correlation file, but not in the catalog provided for the update, ...) then having tests for them would be helpful as well so that we can avoid code crashing as much as possible! I should also use this as an opportunity to better document:
If you have any suggestions of where those docs should go (e.g. where you would look for them) then let me know, otherwise we can put them in in all the relevant places. |
What does this PR do?
Editing catalog_to_dd.py to allow input of existing correlation measurements and selection of output filename. Defaults keep everything the same as it currently is.
existing_corr_file
option added towrite_correlations
&compute_differential_times
. DefaultNone
skips new sections of code.existing_corr_file
read in with new function:_read_correlation_file
returning a existing event pairs dictionary keyed to core event, with list of paired events.existing_pairs
dictionary used to remove events paired with core event fromsub_catalog
so they are not recalculatedexisting_corr_file
read in & output into output file, followed by newly calculated correlations.output_filename
towrite_correlations
. Default"dt.cc"
Why was it initiated? Any relevant Issues?
To prevent having to recalculate correlations each time new events are added & to allow the output of different correlation files without overwriting.
Not been extensively tested. In small amount of testing code works as is, but could be more completely implemented. Things to do:
PR Checklist
develop
base branch selected?CHANGES.md
.CONTRIBUTORS.md
.Boxes left unchecked when I don't know