Name WDL output files in a human-readable way #5046
Merged
This fixes #5008 by cramming WDL task names into the encoded Toil file URIs, along with the UUIDs that identify source directories. When we make a directory to hold the files from a given UUID, we will name it after the task (or workflow) that uploaded the files.
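As a rough illustration of the idea, here is a minimal sketch of how a task name and a source-directory UUID could travel inside an encoded file URI. The `examplefile:` scheme, the field order, and the `pack_uri`/`unpack_uri` helpers are all hypothetical, chosen for this example; they are not Toil's actual encoding or API.

```python
from typing import Tuple
from urllib.parse import quote, unquote

# Hypothetical scheme for illustration only; not the scheme Toil uses.
SCHEME = "examplefile:"


def pack_uri(file_id: str, task_path: str, dir_id: str, basename: str) -> str:
    """Encode a file ID, the uploading task's name, the source directory's
    UUID, and the file's base name into one URI string."""
    parts = [file_id, task_path, dir_id, basename]
    # Percent-encode every field (including any slashes) so that "/" can
    # safely act as the field separator.
    return SCHEME + "/".join(quote(part, safe="") for part in parts)


def unpack_uri(uri: str) -> Tuple[str, str, str, str]:
    """Recover (file_id, task_path, dir_id, basename) from a packed URI."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"Not a packed file URI: {uri}")
    parts = uri[len(SCHEME):].split("/")
    if len(parts) != 4:
        raise ValueError(f"Expected 4 fields, found {len(parts)}: {uri}")
    file_id, task_path, dir_id, basename = (unquote(part) for part in parts)
    return file_id, task_path, dir_id, basename
```

Because the task name rides along with the directory UUID, whatever code later downloads the file has enough information to pick a readable directory name without consulting any other state.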
If I run:
I get an output directory like this:
If a task uploads files from multiple directories, we will start adding deduplicating numbers onto the ends of the directories.
Weird cases like a scatter node in a workflow or a subworkflow uploading a file from a directory also referenced by the main workflow should work, but only one WDL task path gets to name the directory we create.
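The naming rules described above could be sketched roughly as follows. This is an assumption-laden illustration: the `DirectoryNamer` class and the `_1`, `_2` suffix format are hypothetical, and the PR's real implementation may number and store things differently.

```python
from typing import Dict, Set


class DirectoryNamer:
    """Hypothetical helper that maps source-directory UUIDs to readable
    output directory names based on the uploading task's name."""

    def __init__(self) -> None:
        # Directory UUID -> name already chosen for it.
        self._name_for_id: Dict[str, str] = {}
        # Every directory name handed out so far.
        self._used_names: Set[str] = set()

    def name_for(self, dir_id: str, task_path: str) -> str:
        # Only the first task path seen for a UUID gets to name the
        # directory; later paths reuse the existing name.
        if dir_id in self._name_for_id:
            return self._name_for_id[dir_id]
        candidate = task_path
        counter = 1
        # Append deduplicating numbers until the name is unused
        # (assumed suffix format).
        while candidate in self._used_names:
            candidate = f"{task_path}_{counter}"
            counter += 1
        self._name_for_id[dir_id] = candidate
        self._used_names.add(candidate)
        return candidate
```

For example, two different UUIDs uploaded by the same task get distinct directories, while a UUID that a subworkflow and the main workflow both reference keeps whichever name was assigned first.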
This adds some complexity to devirtualizing files, because now you need to keep track of this additional bit of state. We might want to roll it up into an object with the caches. I had to change the `share_files` function to a kwarg on the standard library constructor because I neither wanted nor needed to deal with what would happen if you tried to merge this new state.

Since we have one copy of the download logic for outputs and for task input files, tasks should also get nicer input file names for free now.
Changelog Entry
To be copied to the draft changelog by merger:
Reviewer Checklist
issues/XXXX-fix-the-thing in the Toil repo, or from an external repo.
camelCase that want to be in snake_case.
docs/running/{cliOptions,cwl,wdl}.rst
Merger Checklist