Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CommandLineProgram interface #96

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

magicDGS
Copy link
Contributor

As suggested in broadinstitute/gatk#3621

@magicDGS
Copy link
Contributor Author

@cmnbroad - I think that this is a fair proposal. Can you please have a look? I can take the changes in the Picard/GATK repositorires after this is accepted, and then you can update the Picard integration in GATK....

@codecov-io
Copy link

codecov-io commented Sep 28, 2017

Codecov Report

Merging #96 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #96   +/-   ##
=========================================
  Coverage     75.68%   75.68%           
  Complexity      576      576           
=========================================
  Files            22       22           
  Lines          2159     2159           
  Branches        447      447           
=========================================
  Hits           1634     1634           
  Misses          350      350           
  Partials        175      175

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2320da...39716f2. Read the comment docs.

*
* @return result object after performing the work. May be {@code null}.
*/
public Object instanceMain(final String[] argv);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue we need to resolve is what the return type of this method should be. Picard currently uses int, but GATK uses Object. I think int makes more sense, but that will require changes to GATK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the GATK way more, because it allows to return to the caller some results. That's useful for combining tools if necessary, by instantiating one and use the returned value for the next.
In addition, if the instanceMain returns an exit status, then it is tool-specific how it fails (command line error, validation of parameters, user exception, processing issues, unexpected errors, network connection). I like the idea to catch all the exceptions in the main class, or let the caller decide what to do with each exception.

I understand that returning an Object without catching exceptions may be a large change in Picard, but I can make the changes if we are going in that direction. Anyway, Picard will change to the new parser at some point, and it is a good idea to try to normalize somehow how they handle exceptions and return values.

Maybe an idea for now is to have two methods, one throwing exceptions (called by GATK at the moment) and another one returning the exit status. Another idea is to create an abstract main here to encapsulate some common behaviour in Picard/GATK main, and then port both of them to the new abstraction. Honestly, I am not sure what is the best way to go here...

* <li>Return the result of the tool.</li>
* </ul>
*
* <p>Note: Most exceptions should be cached by the caller.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you meant to say "catched" ? In which case the past participle is actually "caught". So I think this should be "Most exceptions should be caught by the caller." ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, my English is really bad, I meant "catched", and it was too early to realize that I was even misspelling it. Changed.

Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few comments

* </li>
* <li>
* Mark the class with {@link CommandLineProgramProperties} to include information for the
* tool. Users may consider to annotate with {@link BetaFeature} tools tat are in experimental
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: tools tat

*
* <p>It should be considered as a marker interface to identify runnable Command Line Program
* classes in different toolkits, to aggregate them when looking for them. In addition, it includes
* the method {@link #instanceMain(String[])} to be call to parse the arguments and run the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're moving this down we might want to consider renaming it? maybe runCommandLineProgram

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change the name here, then it requires changes in Picard too, no?

*
* @return result object after performing the work. May be {@code null}.
*/
public Object instanceMain(final String[] argv);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to change this from Object to int since it seems like it's clearly meant to be a return code. That would mean some changes to gatk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said above, maybe it is good to have a method to be called and return an object result to combine different tools. For example: a program that processes reads and return intervals (written down); if it returns the list of intervals, the tool can be chained into another tool to make pipelines, such as process that intervals and perform some operation on them.

On the other hand, it is also nice that the caller decides what to do with exceptions instead of letting each tool handle them. But I am happy with a return code if that's the way to proceed here. Just let me know your opinion before changing the interface...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants