how does resuming a batch output job work? #172
Comments
Were you using In short, if you are using batches you get If you are not using batches, we do not resume, b/c afaik, Rust zip utils can't append to zips and incomplete zipfiles are not readable. With current strategy, if we were to read Happy to modify if I'm missing something about Rust zip writing or you have other strategy suggestions.
I was using batch, but it didn't pick it up. Maybe I got something wrong. I'll give it a try again! For the bigger databases, I'm also thinking of doing a manual split of the input CSV to get to a small chunk size and then using snakemake on that. Animal genomes are all really big!
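For anyone wanting to try the manual split described above, here is a minimal sketch in Python. It assumes the input CSV has a single header row that every chunk should repeat; the function name `split_csv`, the chunk file naming, and the default of 50 rows per chunk are all made up for illustration and are not part of this project.

```python
import csv
from pathlib import Path


def split_csv(input_path, output_dir, rows_per_chunk=50):
    """Split a CSV into chunk files, repeating the header row in each one.

    Hypothetical helper: the name, the chunk naming scheme, and the default
    chunk size are illustrative assumptions, not part of the project.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_path).stem
    written = []

    def flush(header, rows):
        # Number chunks from 1 so the files sort in input order.
        chunk_path = output_dir / f"{stem}.chunk{len(written) + 1:04d}.csv"
        with open(chunk_path, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(rows)
        written.append(chunk_path)

    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            # Empty input: nothing to split.
            return written
        rows = []
        for row in reader:
            rows.append(row)
            if len(rows) == rows_per_chunk:
                flush(header, rows)
                rows = []
        if rows:
            # Write the final, possibly shorter, chunk.
            flush(header, rows)
    return written
```

Each chunk file could then be treated as one independent job in a workflow, so a job that dies only costs the work for its own chunk.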
I also think using the NCBI REST API links instead might help, especially since we could up the # of simultaneous downloads if providing an API key. I'll make an issue for that. It is much faster with simultaneous downloads, especially since genome sizes vary and the biggest ones take a lot of time.
Something died with OOM, and then resuming didn't work - it just restarted from scratch. Any tips or tricks?