
Launching MSFragger

Fengchao edited this page Oct 21, 2020 · 14 revisions

Complete workflows can be run with FragPipe. To run MSFragger searches on the command line:

We always recommend upgrading to the most recent release of MSFragger. To read Thermo or Bruker raw files, select the latest .zip release.

Windows users:

Place the MSFragger.jar file in your working directory and modify the parameter file (.params) to reference your protein database.

To read Thermo .raw or Bruker .d formats directly, make sure the ext folder (from the de-compressed .zip release of MSFragger) is in the same directory as the JAR file.

Determine the amount of system memory that you would like to make available to MSFragger. This will be specified by the Java maximum heap size parameter -Xmx (e.g. -Xmx3700m for 3700 MB or -Xmx32g for 32 GB).

To perform a search either use a parameter file:

  • java -Xmx32g -jar MSFragger.jar <parameter file> <list of mzML/mzXML files>

Parameter file templates (closed, open, and semi-enzymatic) can be generated by running java -jar MSFragger.jar --config.

Or specify options on the command line:

  • java -Xmx32g -jar MSFragger.jar <options> <list of mzML/mzXML files>

The -Xmx flag sets the maximum memory assigned to the Java virtual machine and is needed to ensure that MSFragger has enough memory to perform the search efficiently. In the examples above, -Xmx32g sets the maximum to 32 GB; change this value to suit your computer configuration. We recommend allocating at least 8 GB for standard tryptic digestions.
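As a rough illustration of how the -Xmx value is written, the Python sketch below formats the flag and checks it against the 8 GB recommendation. The helper names heap_flag and meets_recommended_minimum are hypothetical and not part of MSFragger. The sketch assumes the JVM convention that the g suffix means 1024 times the m suffix.

```python
# Hypothetical helpers for illustration; they only build strings and never start Java.
RECOMMENDED_MIN_MB = 8 * 1024  # 8 GB, the minimum suggested for standard tryptic digestions


def heap_flag(amount, unit="g"):
    """Format a JVM maximum-heap option such as -Xmx32g or -Xmx3700m."""
    if unit not in ("m", "g"):
        raise ValueError("unit must be 'm' (megabytes) or 'g' (gigabytes)")
    if int(amount) != amount or amount <= 0:
        raise ValueError("amount must be a positive whole number")
    return f"-Xmx{int(amount)}{unit}"


def meets_recommended_minimum(amount, unit="g"):
    """Return True if the heap size is at least the recommended 8 GB."""
    megabytes = amount * 1024 if unit == "g" else amount
    return megabytes >= RECOMMENDED_MIN_MB


print(heap_flag(32), meets_recommended_minimum(32))                # -Xmx32g True
print(heap_flag(3700, "m"), meets_recommended_minimum(3700, "m"))  # -Xmx3700m False
```

The flag must appear before -jar on the command line, as in the commands above, so that Java reads it as a JVM option rather than passing it to MSFragger.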

Detailed command line options can be displayed with:

  • java -jar MSFragger.jar --help

Linux users:

A FragPipe-equivalent shell script can be found here.

Performance notes

Batch processing

MSFragger allows multiple MS/MS input files to be processed in a batch. Passing multiple files to MSFragger at once allows it to reuse the fragment index for subsequent MS/MS runs. This is particularly important for narrow-window searches, where searching a file may take only fractions of a second, so rebuilding the index for every file would otherwise dominate the run time.

On computers or clusters with many processor cores, we strongly recommend setting MSFragger to process files sequentially with all available processor cores rather than running multiple instances of MSFragger in parallel (each assigned a smaller number of cores). This reduces initialization times and allows the fragment index to be reused, while also reducing overall memory requirements.
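The batch recommendation can be sketched as a small Python helper that gathers every spectrum file into one invocation instead of launching one process per file. The function name single_batch_command and all file names are hypothetical. The argument order follows the parameter-file form shown earlier, and the sketch only builds the argument list without starting Java.

```python
SPECTRUM_SUFFIXES = (".mzml", ".mzxml")


def single_batch_command(jar, params, paths, heap="32g"):
    """Build ONE java invocation that searches every mzML/mzXML file in paths."""
    # Keep only spectrum files and sort them so the run order is reproducible.
    spectra = sorted(p for p in paths if p.lower().endswith(SPECTRUM_SUFFIXES))
    if not spectra:
        raise ValueError("no mzML/mzXML files given")
    # JVM options (-Xmx...) go before -jar; the parameter file and spectra follow the JAR.
    return ["java", f"-Xmx{heap}", "-jar", jar, params, *spectra]


# Hypothetical file names, for illustration only.
files = ["run_02.mzML", "run_01.mzML", "notes.txt", "run_03.mzXML"]
cmd = single_batch_command("MSFragger.jar", "fragger.params", files)
print(" ".join(cmd))
# java -Xmx32g -jar MSFragger.jar fragger.params run_01.mzML run_02.mzML run_03.mzXML
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the search. Because all files share a single process, the fragment index is built once and reused for each run.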