Dynamic Models for Tool Test Validation #18679
Conversation
I love it, and I think realistically the alternative tool syntax will allow more precise validation. One comment on the validation report (https://gist.github.com/jmchilton/e84f91cf2646057e5cd25781ba7d422b#file-gistfile1-txt-L76): that's not unqualified, and much shorter than writing

```xml
<repeat name="tables">
    <conditional name="input_opts">
        <repeat name="linefilters">
            <conditional name="filter">
                <param name="filter_type" value="something"/>
            </conditional>
        </repeat>
    </conditional>
</repeat>
```

I guess I wouldn't be too concerned if we can automatically rewrite this, but if not I would say that's a lot of tool author work for little gain (a cleaner dynamic test case XSD, I assume?). We have a helper that can turn this kind of string into a path tuple (galaxy/lib/galaxy/tools/parameters/wrapped.py, line 192 in 5f635a3).
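For illustration only - a rough sketch, not the actual helper in wrapped.py - of what turning such a flat test-parameter key into a path tuple might look like:

```python
def flat_key_to_path(flat_key: str) -> tuple:
    """Split a flat key like
    'tables_0|input_opts|linefilters_0|filter|filter_type' into
    ('tables', 0, 'input_opts', 'linefilters', 0, 'filter', 'filter_type').
    """
    path = []
    for part in flat_key.split("|"):
        name, sep, index = part.rpartition("_")
        if sep and index.isdigit():
            # a repeat instance: "<name>_<n>" contributes (name, n)
            path.extend((name, int(index)))
        else:
            path.append(part)
    return tuple(path)
```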
@mvdbeek That test looks fine to my eye - I don't know why it didn't validate. Probably a bug... I bet I didn't implement sections 😅😭. Let me check it out and get back to you. I did intend for those kinds of parameters to validate though. As for automatically changing the tools - I would be okay, I think, if we were certain we weren't going to lose comments or touch the rest of the tool. Do we have any examples of doing that well? I certainly never meant for…
Now revised with section support 😅. I added a test case that exhibits this failure. The updated validation results for that query_tabular tool look like this:
The remaining failure is basically point 3 above - select options should be selected by value and not display name.
I'm not sure which strawman I'm supposed to defend here 😅 I appreciate stricter typing; it will help the IUC. The IUC, and especially @bernt-matthias, have done a lot to improve linting and the overall quality of tools, which also makes our tools more sustainable. In the language-server plugin we autogenerate test sections, and if we can improve this and make it more stable that would be great. The only thing I'm worried about is the readability aspect. In the end, a human needs to review a tool, and here readability counts. So my only request would be to take readability seriously and find a good balance. E.g. if we agree that more people should use IDEs for Galaxy tool dev, a more verbose syntax would be OK if it improves readability, etc. If there are large syntactic changes, I think we should run them by https://github.com/galaxy-iuc/standards
I'm quite happy with the "reformat" option of the language server; maybe we can try it at this level. It seems to keep comments etc.
The situation to avoid is making it look like maintaining or developing a Galaxy tool is hard. At some point I bet we'll need to enforce a profile version that requires rewriting the test section. We should avoid putting this work onto the occasional contributor who just needs a feature flag added or a bug fixed; it needs to be handled by the tool maintainers. If we had an easy way to fix up trivial-ish linting things like a different test syntax (internally I would hope test cases in both syntaxes are parsed into the same pydantic models), it would be easy for tool maintainers to stay ahead of the curve.
Not in Galaxy, but CWL maintains comments. Could our parameter models hold comments?
😮 - I didn't expect LXML to preserve so much of the whitespace and comments. It would only mess with the material before and after the root element. If we do opt to write a tool upgrader - are there other features we think should be auto-upgraded?
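For reference, a minimal lxml round-trip sketch (my illustration; it assumes a file named tool.xml) showing how comments and whitespace inside the root element survive a parse-and-write cycle:

```python
from lxml import etree

# lxml keeps comments and whitespace by default; the fragile part is
# material outside the root element (e.g. the XML declaration and any
# leading comments).
parser = etree.XMLParser(remove_comments=False, remove_blank_text=False)
tree = etree.parse("tool.xml", parser)

# ... an upgrader would rewrite the <tests> section in place here ...

tree.write("tool.xml", xml_declaration=True, encoding="utf-8")
```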
stdio cruft that is redundant with a new profile version, unnecessary name parameters… I agree though this isn't something for right now, and it doesn't have to be you. It would make transitioning to a newer syntax a much lighter decision though.
Work scoped out in #18536.
Re: upgrading tools automatically. I've integrated this with the tool upgrade advisor (jmchilton@5820ed1) implemented separately as #18728. Once that commit becomes green I will include it in this PR along with #18728, if this PR is still open. I've outlined why I think upgrading automatically is more challenging than I thought, but also how all the semantic data included in #18728…
```python
# In an effort to squeeze all the ambiguity out of test cases - at some point Marius and John
# agreed tools should be using value_json for typed inputs to parameters, but John has come around on
# this now that we're validating the parameters as a whole on load. The models are ensuring only
# unambiguous test cases are being loaded.
```
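To make the trade-off concrete, a hypothetical illustration (the parameter name is invented; the attribute spellings are the ones discussed above):

```python
import json

# An XML attribute value is always a string in the source document:
#   <param name="threshold" value="7"/>       -> parsed as the string "7"
#   <param name="threshold" value_json="7"/>  -> json.loads("7"), an exact int
assert json.loads("7") == 7 and isinstance(json.loads("7"), int)

# With whole-test-case validation, the plain value="7" form is already
# unambiguous for a parameter the model knows to be an integer.
```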
💯 ... I think what I care about is that where exact types are possible (yaml, json) we should use them. Being able to accurately coerce types is such an important part of the whole tool state work IMO.
Awesome, this looks really nice!
- `24.2` or newer tools will not allow a test case to load at all if that test case does not validate.
- `24.2` or newer tools disable unqualified access to conditional and repeat parameters - these lead to ambiguities. Tool developers should be using `<conditional>` and `<repeat>` tags to avoid ambiguity or using the fully qualified state path (see @mvdbeek's first comment below).
- `24.2` or newer tools are more strongly typed - `select` parameters must be specified in the test parameter specification by option value and not option display, `boolean` parameters must be specified as booleans and not using the `truevalue`/`falsevalue`, and column parameters must be specified as integers (illustrated in the sketch after the next paragraph).

I imagine everyone is vaguely on board with validating the test cases immediately. It will catch many classes of tool test problems immediately and unquestionably speed up new tool development. I think what I've declared best practices and am validating vs. not validating is probably a lot more contentious. I will set up strawman arguments here and argue against them, but I will also try to be open to changing things around if people think my path on the particulars is wrong.
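To illustrate the typing rules above (all parameter names here are invented):

```python
# boolean - a boolean literal, not the tool's truevalue/falsevalue string:
#     <param name="all_fields" value="true"/>           accepted
#     <param name="all_fields" value="--all-fields"/>   rejected
#
# select - the option's value, not its display label
# (given <option value="bam">BAM format</option>):
#     <param name="format" value="bam"/>                accepted
#     <param name="format" value="BAM format"/>         rejected
#
# column - an integer:
#     <param name="col" value="2"/>                     accepted
#     <param name="col" value="c2"/>                    rejected
```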
In my head, there is going to be a pragmatic camp led by IUC folks (maybe @bgruening?) that have been burnt by ever-increasing linting stringency making it harder to upgrade older existing tools. I would argue that the increased type safety and simplicity will reduce potential problems with tools and make tools more uniform and safer to reason about. I think it also helps to simplify the toolchain and, ideally, to train developers in how they should use their tools with the API. The benefits are worth a little disruption compared to some of the more picky linting we've done in the past. I've also included utilities to help developers with the migration (see Command-line Validation below).
In my head, the other side is going to be the purist camp led by @mvdbeek that would like even stricter validation - further migration toward CWL purity. For instance, we can ensure we're loading integers right from the tool source using `value_json` instead of `value`. @mvdbeek and I had discussed making this the best practice at some point. Working through this project, I've come around on this though. By doing the validation and ensuring there is a single accepted representation for each parameter type (or combination of parameter type and whether it is multiple or not), there is no ambiguity any more. There is no reason to think `<parameter name="threshold" value="7">` might be a string after this PR - we now know it is an integer when validating the generated API request for the test. And all that logic is in galaxy-tool-util and available without a Galaxy runtime. There is a flag in the code to enable this requirement - at least at the warning level - but I don't think it should be switched on.

For hundreds of real-world examples to evaluate these models against, the tool test validation applied to the IUC looks like this: https://gist.github.com/jmchilton/e84f91cf2646057e5cd25781ba7d422b.
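A minimal sketch of the load-time idea, using pydantic's `create_model` as a stand-in for the PR's dynamic models (the parameter here is hypothetical):

```python
from pydantic import ValidationError, create_model

# A stand-in for a per-tool dynamic model with one integer parameter.
TestCaseModel = create_model("TestCaseModel", threshold=(int, ...))

# value="7" from the XML is unambiguous for an integer parameter ...
print(TestCaseModel(threshold="7").threshold)  # -> 7, coerced to int

# ... while a non-numeric string fails validation at load time.
try:
    TestCaseModel(threshold="high")
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # "int_parsing"
```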
Command-line Validation
Ideally, this will end up in Planemo after we've worked through the right path forward on best practices and what should validate and what shouldn't. But I have still included a little helper utility to check existing tools. This can be used to generate details about validation errors (and some classes of warnings for older tools) on the command line, as either plain text or JSON. The JSON output might be useful for IDE helpers, for instance.
- Run validation on the test cases and collect validation failures and type-related warnings.
- Run validation on the test cases and collect validation failures and type-related warnings, but use the `24.2` profile and ignore the tool's own profile. This gives some indication of whether the tool's test cases would be a blocker to upgrading the tool to the latest profile.
- Run validation but generate JSON - for CLI tooling and non-Python/non-Planemo tooling chains (IDE support).
I've run the tool on the IUC repository to produce this report: https://gist.github.com/jmchilton/e84f91cf2646057e5cd25781ba7d422b.
Limitations and Future Work
All of this is kind of... bleh in terms of implementation and utility, because the thing being validated isn't the developer's tool source but instead a weird JSON dictionary used by Galaxy to communicate the tool test to the client tooling. For this reason I've renamed the parameter model representation from `test_case` to `test_case_xml`. I don't think there is anything wrong with validating things at this level, but ideally we would be actually validating the source file contents also.

My next steps toward that end are to implement a model representation - maybe called `test_case_source` - that would be similar to this but targeting the tool source. It would be future-facing and do things like replace comma-separated lists with actual JSON lists. Then we would use that dynamic model to either: …

This would allow line-number-based issues, and since we can easily generate jsonschema for these dynamic models, there are many ways a toolchain could validate this part of the tool regardless of language and runtime (see the sketch below).
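Since pydantic models can emit JSON Schema directly, a sketch of that language-agnostic angle (again with a hypothetical one-parameter model; the real models would be generated from the tool's inputs):

```python
import json

from pydantic import create_model

# Hypothetical per-tool model; real models would be derived from the
# tool's <inputs> section.
TestCaseModel = create_model("TestCaseModel", threshold=(int, ...))

# Any toolchain that speaks JSON Schema could validate this part of the
# tool without a Python or Galaxy runtime.
print(json.dumps(TestCaseModel.model_json_schema(), indent=2))
```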
A note about included workflow models
I started on the workflow state work simultaneously, but the test case work advanced more rapidly given that it is easier and doesn't require a Tool Shed upgrade. I have stripped out all of the validation and conversion plumbing (it is its own commit in #17393 now), but there is still a bunch of validation tests setting up a shot at #18536 in this PR and I can't really decouple them from the test case work. I would prefer we defer any conversation about those test cases until I open a PR for the workflow state validation stuff. I promise to foreground all the examples and be very open to having them picked apart at that time if we can stay focused on the tool test stuff here. Is that okay?