Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CommandLineProgram interface #96

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
package org.broadinstitute.barclay.argparser;

/**
* Interface class to facilitate implementing command-line programs.
*
* <p>It should be considered as a marker interface to identify runnable Command Line Program
* classes in different toolkits, to aggregate them when looking for them. In addition, it includes
* the method {@link #instanceMain(String[])} to be call to parse the arguments and run the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're moving this down we might want to consider renaming it? maybe runCommandLineProgram

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change the name here, then it requires changes in Picard too, no?

* actual work.
*
* <p>Tool classes may want to extend this interface and {@link CommandLinePluginProvider} to
* include plugins in their command line.
*
* <p>Usage example:
*
* <ul>
* <li>
* Extend the class. Users may want to extend also {@link CommandLinePluginProvider} to
* include plugins in the command line.
* </li>
* <li>
* Mark the class with {@link CommandLineProgramProperties} to include information for the
* tool. Users may consider to annotate with {@link BetaFeature} tools that are in experimental
* phase.
* </li>
* <li>
* Include data members marked with {@link Argument}, {@link ArgumentCollection},
* {@link PositionalArguments} or {@link TaggedArgument} to include in the command line.
* This members could be also annotated with {@link Hidden} or {@link Advanced} to tweak
* how they are display.
* </li>
* <li>
* Implement {@link #instanceMain(String[])} to use {@link CommandLineArgumentParser} to
* parse the arguments and use the data members to perform the work.
* </li>
* <<li>
* In a Main class, look for all the classes implementing {@link CommandLineProgram} (e.g.,
* using {@link ClassFinder}). Matching the class name with a provided String, the user can
* select the tool to be run, and the Main class can call {@link #instanceMain(String[])}
* with the provided arguments.
* </li>
* </ul>
*
* <p>Note: Barclay does not use or enforce this class for any purpose.
*
* @author Daniel Gomez-Sanchez (magicDGS)
*/
public interface CommandLineProgram {

/**
* Implements the work to be done with the tool.
*
* <p>This method should implement:
*
* <ul>
* <li>Command line parsing using {@link CommandLineParser}</li>
* <li>Run the actual work of the tool using data members populated by the parser.</li>
* <li>Return the result of the tool.</li>
* </ul>
*
* <p>Note: Most exceptions should be caught by the caller.
*
* @param argv arguments to pass to the {@link CommandLineParser}.
*
* @return result object after performing the work. May be {@code null}.
*/
public Object instanceMain(final String[] argv);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to change this from Object to int since it seems like it's clearly meant to be a return code. That would mean some changes to gatk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said above, maybe it is good to have a method to be called and return an object result to combine different tools. For example: a program that processes reads and return intervals (written down); if it returns the list of intervals, the tool can be chained into another tool to make pipelines, such as process that intervals and perform some operation on them.

On the other hand, it is also nice that the caller decides what to do with exceptions instead of letting each tool handle them. But I am happy with a return code if that's the way to proceed here. Just let me know your opinion before changing the interface...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue we need to resolve is what the return type of this method should be. Picard currently uses int, but GATK uses Object. I think int makes more sense, but that will require changes to GATK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the GATK way more, because it allows to return to the caller some results. That's useful for combining tools if necessary, by instantiating one and use the returned value for the next.
In addition, if the instanceMain returns an exit status, then it is tool-specific how it fails (command line error, validation of parameters, user exception, processing issues, unexpected errors, network connection). I like the idea to catch all the exceptions in the main class, or let the caller decide what to do with each exception.

I understand that returning an Object without catching exceptions may be a large change in Picard, but I can make the changes if we are going in that direction. Anyway, Picard will change to the new parser at some point, and it is a good idea to try to normalize somehow how they handle exceptions and return values.

Maybe an idea for now is to have two methods, one throwing exceptions (called by GATK at the moment) and another one returning the exit status. Another idea is to create an abstract main here to encapsulate some common behaviour in Picard/GATK main, and then port both of them to the new abstraction. Honestly, I am not sure what is the best way to go here...

}