Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMS place each in/output symlink in unique dir #555

Conversation

bernt-matthias
Copy link
Collaborator

@bernt-matthias bernt-matthias commented Feb 6, 2021

before each symlink was placed in a dir named by the name of the corresponding parameter. this still allows for conflicts if several
data sets have the same identifier.

solution: place symlinks in directory parameter_name/galaxy-dataset-id

usecase: ConsensusID takes the result of different search engines for the same sample. then the identifiers of the inputs are all the sample name which leads to a conflict.

question is how we can manage the tool version bump since we manually bumped some of the tools (bump.json). we could continue to bump manually, or just use the macro token and increase it to max bump + 1.

For most tools the change is just an additional import.. I could also change the converter such that the import is only added if needed.

@bernt-matthias bernt-matthias changed the title OpenMS place each in/output synlink in unique dir OpenMS place each in/output symlink in unique dir Feb 6, 2021
@bernt-matthias bernt-matthias force-pushed the topic/openms-unique-in-out branch 2 times, most recently from 31e113d to b4071d7 Compare February 7, 2021 11:10
@bernt-matthias bernt-matthias force-pushed the topic/openms-unique-in-out branch from 8dd7f29 to 8c188a0 Compare February 10, 2021 10:14
@@ -24,7 +24,8 @@ mkdir out &&
#if $adv_opts_cond.adv_opts_selector=='advanced':
#if $adv_opts_cond.reannotate_filenames:
mkdir adv_opts_cond.reannotate_filenames &&
${ ' '.join(["ln -s '%s' 'adv_opts_cond.reannotate_filenames/%s.%s' &&" % (_, re.sub('[^\w\-_]', '_', _.element_identifier), $gxy2omsext(_.ext)) for _ in $adv_opts_cond.reannotate_filenames if _]) }
mkdir ${' '.join(["'adv_opts_cond.reannotate_filenames/%s'" % (i) for i, f in enumerate($adv_opts_cond.reannotate_filenames) if f])} &&
${' '.join(["ln -s '%s' 'adv_opts_cond.reannotate_filenames/%s/%s.%s' && " % (f, i, re.sub('[^\w\-_]', '_', f.element_identifier), $gxy2omsext(f.ext)) for i, f in enumerate($adv_opts_cond.reannotate_filenames) if f])}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bernt-matthias a very general question, should we maybe have f.save_element_identifier that is doing this inside Galaxy, s that we can avoid all this sed/replace stuff in the wrappers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! This is used in so many tools. Started almost 2 years ago with this galaxyproject/galaxy#7937 .. somehow got distracted. We should try to get this in 21.05.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh gosh, complete forget about it ... 💯

@bernt-matthias bernt-matthias force-pushed the topic/openms-unique-in-out branch 2 times, most recently from 9853199 to 6a75bcb Compare February 20, 2021 18:01
@bernt-matthias
Copy link
Collaborator Author

Hi @bgruening , do you think we can merge soon (and if you like selectively push at least ConsensusID to the TS)?

The latest test fails for the online mascot online adapter (maybe we should set this on the skip list .. which is not possible yet galaxyproject/planemo-ci-action#13). I guess this is transient, but I could ask at the OpenMS gitter channel.

For an overview let me summarize the changes (The first can be critical - I need it in particular for ConsensusID. The last two are cosmetic):

  1. The symlink for each input dataset is placed in an individual directory: PARAMETER_NAME/INDEX, where the INDEX is a unique integer. This avoids clashes of inputs with the same element identifier. Apart from ConsensusID I do not know if this is relevant in realistic use cases .. but who knows.

The complete list of 35 affected tools

AccurateMassSearch.xml
AssayGeneratorMetabo.xml
CVInspector.xml
ConsensusID.xml
DecoyDatabase.xml
EICExtractor.xml
Epifany.xml
FeatureLinkerUnlabeled.xml
FeatureLinkerUnlabeledKD.xml
FeatureLinkerUnlabeledQT.xml
FileMerger.xml
GNPSExport.xml
IDMassAccuracy.xml
IDMerger.xml
MSSimulator.xml
MSstatsConverter.xml
MaRaClusterAdapter.xml
MapAlignerIdentification.xml
MapAlignerPoseClustering.xml
MapAlignerSpectrum.xml
MapAlignerTreeGuided.xml
OpenSwathAnalyzer.xml
OpenSwathChromatogramExtractor.xml
OpenSwathDIAPreScoring.xml
OpenSwathFeatureXMLToTSV.xml
OpenSwathRTNormalizer.xml
PSMFeatureExtractor.xml
PercolatorAdapter.xml
ProteinInference.xml
ProteinResolver.xml
QCMerger.xml
QualityControl.xml
SemanticValidator.xml
SpecLibSearcher.xml
SpectraSTSearchAdapter.xml
  1. The change from optional="true" to false for many parameters. The reason is:

OpenMS sets text, int, select, bool parameters that have a default as optional (required=False), the default value is set implicitly if no value is given. This is reasonable for the CLI because one certainly does not want the user to specify the default manually for all parameters.

For Galaxy tools setting these parameters as required leads to the equivalent behavior. Assuming required is better because it makes the implicit setting of parameters more transparent to the user (in Galaxy the default would be prefilled in the form and at least one option needs to be selected).

  1. I added a token to the validator macro such that the output also shows which parameter violates the validator: <expand macro="list_string_san" name="matrix"/>

@bgruening
Copy link
Member

That looks all pretty great! I would prefer if you only push ConsensusID to the TS if that is possible and wait for the other enhancements for a new OpenMS release. If possible.

@bgruening
Copy link
Member

Thanks a lot!!!

before each symlink was placed in a dir named by the name of the
corresponding parameter. this still allows for conflicts if several
data sets have the sem identifier.

solution: place symlinks in directory parameter_name/galaxy-dataset-id

usecase: ConsensusID takes the result of different search engines
for the same sample. then the identifiers of the inputs are all the
sample name which leads to a conflict.
for outputs with a type selector set the default format
and only have change_format for the others
nicer (less potential info leakage to the user) and simpler.
also no need to import os in the command section
OpenMS uses sets text, int, select, bool parameters that have a default
as optional (required=False), the default value is set implicitly if no
value is given.
This is reasonable for the CLI because one certainly does not want the
user to specify the default manually for all parameters.
For Galaxy tools setting these parameters as required leads to the
equivalent behavior. Assuming required is better because it makes
the implicit setting of parameters more transparent to the user
(in Galaxy the default would be prefilled in the form and at least
one option needs to be selected).
adapt test generation (get default from CTD file if missing in INI files)
@bernt-matthias bernt-matthias force-pushed the topic/openms-unique-in-out branch from 6a75bcb to 7585813 Compare February 21, 2021 18:09
@bernt-matthias
Copy link
Collaborator Author

I would prefer if you only push ConsensusID to the TS

Will do.

@bernt-matthias bernt-matthias merged commit 20912a4 into galaxyproteomics:master Feb 22, 2021
@bernt-matthias
Copy link
Collaborator Author

I cancelled the CI workflow and uploaded ConsensusID manuallyto the TS.

bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Mar 17, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Mar 18, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Mar 19, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Apr 5, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Oct 4, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Nov 8, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
bernt-matthias added a commit to bernt-matthias/tools-galaxyp that referenced this pull request Nov 24, 2022
with

galaxyproteomics#555
galaxyproteomics@fecd2cc
galaxyproteomics@ffdd5c5

optional (non file) parameters were rendered as required in the galaxy
xml

this commit tries to revert this, since it breaks the tests for Comet
Adapter, where the modification select parameters also allow to select
no parameter (and in the tests this is really done)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants