Implement cross product collection operation tools.
These are available in a list(n) x list(m) -> list(nxm) version (a cross product that produces two flat lists) and a list(n) x list(m) -> list(n):list(m) version (a cross product that produces two nested lists). After two lists have been run through one of these two tools, the result is two new lists that can be passed into another tool to perform all-against-all operations using Galaxy's normal collection mapping semantics.

The choice of which to use will depend on how you want to continue processing the all-against-all results after the next step in an analysis. My sense is that the flat version is "easier" to think about and pick through manually, and the nested version preserves more structure if additional collection operation tools will be used to filter or aggregate the results.

Some considerations:

Apply Rules? I do not believe the Apply Rules tool semantics would allow these operations, but the Apply Rules tool could certainly be used to convert the result of the flat version to the nested version or vice versa, so no metadata is really lost between the two versions. I think it is still worth including both versions though: they both have utility (both, for instance, are baked into CWL's workflow semantics - https://docs.sevenbridges.com/docs/about-parallelizing-tool-executions#nested-cross-product) and avoiding complex Apply Rules programs for simple workflows is probably ideal.

One Tool vs Two? Marius and I agree that a few simpler tools for these kinds of operations are better. The tool help can be more focused, and avoiding the conditional and conditional outputs makes the static analysis done, for instance, by the workflow editor simpler.
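The difference between the two versions can be sketched in plain Python. This is only an illustration of the identifier bookkeeping, not the code in `galaxy.tools`; the function names `cross_product_flat` and `cross_product_nested` are hypothetical, and collections are modelled as simple lists and dicts of identifiers rather than Galaxy model objects.

```python
from itertools import product


def cross_product_flat(ids_a, ids_b, join_identifier="_"):
    """Sketch of list(n) x list(m) -> list(nxm).

    Returns two flat lists of (new_identifier, source_identifier) pairs.
    Both outputs share the same joined identifiers, so a downstream tool
    mapped over them pairs every A element with every B element.
    """
    out_a, out_b = [], []
    for a, b in product(ids_a, ids_b):
        joined = f"{a}{join_identifier}{b}"
        out_a.append((joined, a))  # element content comes from input A
        out_b.append((joined, b))  # element content comes from input B
    return out_a, out_b


def cross_product_nested(ids_a, ids_b):
    """Sketch of list(n) x list(m) -> list(n):list(m).

    Returns two nested mappings keyed first by the A identifier and then
    by the B identifier; the leaf value names the source element.
    """
    out_a = {a: {b: a for b in ids_b} for a in ids_a}
    out_b = {a: {b: b for b in ids_b} for a in ids_a}
    return out_a, out_b


if __name__ == "__main__":
    flat_a, flat_b = cross_product_flat(["a1", "a2"], ["b1", "b2"])
    print(flat_a)
    print(flat_b)
    nested_a, nested_b = cross_product_nested(["a1", "a2"], ["b1", "b2"])
    print(nested_a)
    print(nested_b)
```

With inputs `a1, a2` and `b1, b2`, the identifiers produced here mirror the ones asserted in the tool tests in this commit: `a1_b1`, `a1_b2`, `a2_b1`, `a2_b2` for the flat version, and `a1 -> {b1, b2}`, `a2 -> {b1, b2}` for the nested version.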
Showing 8 changed files with 334 additions and 11 deletions.
@@ -0,0 +1,89 @@
<tool id="__CROSS_PRODUCT_FLAT__"
      name="Flat Cross Product"
      version="1.0.0">
    <description></description>
    <type class="CrossProductFlatCollectionTool" module="galaxy.tools" />
    <macros>
        <import>model_operation_macros.xml</import>
    </macros>
    <expand macro="uses_a_model_operation_action" />
    <edam_operations>
        <edam_operation>operation_3436</edam_operation> <!-- DataHandling -> Aggregation -->
    </edam_operations>
    <inputs>
        <param type="data_collection" name="input_a" collection_type="list" label="Input Collection A" />
        <param type="data_collection" name="input_b" collection_type="list" label="Input Collection B" />
        <expand macro="join_identifier" />
    </inputs>
    <outputs>
        <collection name="output_a" format_source="input_a" type="list" label="${on_string} (A files)" >
        </collection>
        <collection name="output_b" format_source="input_b" type="list" label="${on_string} (B files)" >
        </collection>
    </outputs>
    <tests>
        <test>
            <param name="input_a">
                <collection type="list">
                    <element name="a1" value="simple_line.txt" />
                    <element name="a2" value="simple_line_alternative.txt" />
                </collection>
            </param>
            <param name="input_b">
                <collection type="list">
                    <element name="b1" value="1.txt" />
                    <element name="b2" value="1.fasta" />
                </collection>
            </param>
            <param name="join_identifier" value="_" />
            <output_collection name="output_a" type="list">
                <element name="a1_b1">
                    <expand macro="assert_is_simple_line" />
                </element>
                <element name="a1_b2">
                    <expand macro="assert_is_simple_line" />
                </element>
                <element name="a2_b1">
                    <expand macro="assert_is_simple_line_alt" />
                </element>
                <element name="a2_b2">
                    <expand macro="assert_is_simple_line_alt" />
                </element>
            </output_collection>
            <output_collection name="output_b" type="list">
                <element name="a1_b1">
                    <expand macro="assert_is_1_dot_txt" />
                </element>
                <element name="a1_b2">
                    <expand macro="assert_is_1_dot_fasta" />
                </element>
                <element name="a2_b1">
                    <expand macro="assert_is_1_dot_txt" />
                </element>
                <element name="a2_b2">
                    <expand macro="assert_is_1_dot_fasta" />
                </element>
            </output_collection>
        </test>
    </tests>
    <help><![CDATA[

========
Synopsis
========

===========
Description
===========

----

.. class:: infomark

@QUOTA_USAGE_NOTE@

    ]]></help>
</tool>
@@ -0,0 +1,93 @@
<tool id="__CROSS_PRODUCT_NESTED__"
      name="Nested Cross Product"
      version="1.0.0">
    <description></description>
    <type class="CrossProductNestedCollectionTool" module="galaxy.tools" />
    <macros>
        <import>model_operation_macros.xml</import>
    </macros>
    <expand macro="uses_a_model_operation_action" />
    <expand macro="annotate_as_aggregation_operation" />
    <inputs>
        <param type="data_collection" name="input_a" collection_type="list" label="Input Collection A" />
        <param type="data_collection" name="input_b" collection_type="list" label="Input Collection B" />
    </inputs>
    <outputs>
        <collection name="output_a" format_source="input_a" type="list:list" label="${on_string} (A files)" >
        </collection>
        <collection name="output_b" format_source="input_b" type="list:list" label="${on_string} (B files)" >
        </collection>
    </outputs>
    <tests>
        <test>
            <param name="input_a">
                <collection type="list">
                    <element name="a1" value="simple_line.txt" />
                    <element name="a2" value="simple_line_alternative.txt" />
                </collection>
            </param>
            <param name="input_b">
                <collection type="list">
                    <element name="b1" value="1.txt" />
                    <element name="b2" value="1.fasta" />
                </collection>
            </param>
            <output_collection name="output_a" type="list:list">
                <element name="a1">
                    <element name="b1">
                        <expand macro="assert_is_simple_line" />
                    </element>
                    <element name="b2">
                        <expand macro="assert_is_simple_line" />
                    </element>
                </element>
                <element name="a2">
                    <element name="b1">
                        <expand macro="assert_is_simple_line_alt" />
                    </element>
                    <element name="b2">
                        <expand macro="assert_is_simple_line_alt" />
                    </element>
                </element>
            </output_collection>
            <output_collection name="output_b" type="list:list">
                <element name="a1">
                    <element name="b1">
                        <expand macro="assert_is_1_dot_txt" />
                    </element>
                    <element name="b2">
                        <expand macro="assert_is_1_dot_fasta" />
                    </element>
                </element>
                <element name="a2">
                    <element name="b1">
                        <expand macro="assert_is_1_dot_txt" />
                    </element>
                    <element name="b2">
                        <expand macro="assert_is_1_dot_fasta" />
                    </element>
                </element>
            </output_collection>
        </test>
    </tests>
    <help><![CDATA[

========
Synopsis
========

===========
Description
===========

----

.. class:: infomark

@QUOTA_USAGE_NOTE@

    ]]></help>
</tool>
@@ -0,0 +1,40 @@
<macros>
    <xml name="uses_a_model_operation_action">
        <action module="galaxy.tools.actions.model_operations"
                class="ModelOperationToolAction"/>
    </xml>
    <token name="@QUOTA_USAGE_NOTE@">This tool will create new history datasets copied from your input collections but your quota usage will not increase.</token>
    <xml name="annotate_as_aggregation_operation">
        <edam_operations>
            <edam_operation>operation_3436</edam_operation> <!-- DataHandling -> Aggregation -->
        </edam_operations>
    </xml>
    <xml name="join_identifier">
        <param type="select" name="join_identifier" label="Join collection identifiers using" help="Separator for merging dataset identifiers">
            <option value="_">underscore ( _ )</option>
            <option value=":">colon ( : )</option>
            <option value="-">dash ( - )</option>
        </param>
    </xml>
    <xml name="assert_is_simple_line">
        <assert_contents>
            <has_text_matching expression="^This is a line of text.\n$" />
        </assert_contents>
    </xml>
    <xml name="assert_is_simple_line_alt">
        <assert_contents>
            <has_text_matching expression="^This is a different line of text.\n$" />
        </assert_contents>
    </xml>
    <xml name="assert_is_1_dot_txt">
        <assert_contents>
            <has_text_matching expression="^chr1" />
            <has_text_matching expression="42287290" />
        </assert_contents>
    </xml>
    <xml name="assert_is_1_dot_fasta">
        <assert_contents>
            <has_text_matching expression="^>hg17\n" />
        </assert_contents>
    </xml>
</macros>
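As a rough way to reason about the assertion macros above, the following Python sketch applies the same expressions with `re.search`. It assumes `has_text_matching` behaves like an unanchored `re.search` over the whole dataset text, which is an assumption about Galaxy's test framework rather than something shown in this commit. The helper `passes` and the sample strings for `1.txt` and `1.fasta` are hypothetical stand-ins, not the real test data.

```python
import re

# Expressions copied from the macros above. XML attribute text is passed
# through unchanged, so the two characters "\n" are interpreted by the
# regex engine as a newline.
PATTERNS = {
    "assert_is_simple_line": [r"^This is a line of text.\n$"],
    "assert_is_simple_line_alt": [r"^This is a different line of text.\n$"],
    "assert_is_1_dot_txt": [r"^chr1", r"42287290"],
    "assert_is_1_dot_fasta": [r"^>hg17\n"],
}


def passes(macro_name, text):
    """Hypothetical helper: True if every expression of the macro is found."""
    return all(re.search(pattern, text) for pattern in PATTERNS[macro_name])


if __name__ == "__main__":
    print(passes("assert_is_simple_line", "This is a line of text.\n"))
    print(passes("assert_is_simple_line", "This is a different line of text.\n"))
    # Made-up sample content, only shaped to satisfy or fail the patterns.
    print(passes("assert_is_1_dot_txt", "chr1\t100\t42287290\n"))
    print(passes("assert_is_1_dot_fasta", ">hg17\nACGT\n"))
```

Under that assumption, the final `.` in the two "line of text" expressions is an unescaped regex wildcard, so it would also accept any other single character in that position. That is harmless for distinguishing the two test files, but `\.` would be stricter.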