-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[24.0] Do not copy purged outputs to object store #18342
[24.0] Do not copy purged outputs to object store #18342
Conversation
ed022ba
to
6be29de
Compare
This in particular needs a lot of new tests. We will also need to actively purge datasets in the model store import code, since users might have purged datasets while the job ran. Again, more tests needed.
6be29de
to
8e6e25d
Compare
# Users can purge outputs before the job completes, | ||
# in that case we don't want to copy the output to a purged path. | ||
# Static, non work_dir_output files are handled in job_finish code. | ||
return f'\nif [ -f "{source_file}" -a -f "{destination}" ] ; then cp "{source_file}" "{destination}" ; fi' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a wild check 😭. I hate it but I don't know the alternative. I know people love the disk object store and jobs writing directly to the object store 🤔.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit of an "optimization" .. I'm happy to restore the original version, the important thing is that we don't store data in purged datasets.
These are only written if the command actually ran, so we'd fail here for instance if the outputs are deleted before the job had a chance to run.
0467e96
to
94bd0c5
Compare
Add a sleep so we can delay output purging until command line is templated. If we don't sleep we generate a command line where the output path is just `''`. That needs another fix!
94bd0c5
to
03eca7c
Compare
This is a tricky one. There could be systematic issues here that an admin would want to fix, but it could also just be the case that a user forced a wrong datatype. Ideally we'd probably tag this as job execution issue ? It's not very different from a job error IMO.
I haven't tested extra_files handling yet, and I know that command line templating is broken if you purge the output dataset before the job is run and the command line is queued if you're not using But there's so many fixes in here and they have coverage, so let's get this in (if the tests pass) while I work on the remaining tests. |
25bca7c
to
f0e09cc
Compare
Fixes #18337. Needs a whole lot of tests still, this is just a rough pass at the most obvious cases where this might happen.
Tests to write:
extra_files* dynamic outputs * extended_metadataHow to test the changes?
(Select all options that apply)
License