
Glossary of Terminology

Robert Millikin edited this page Apr 3, 2020 · 17 revisions

All Scores -- The scores of all the theoretical peptides that were searched against the spectrum. This is only reported if e-Value calculations are specified in the search task.

Base Sequence -- The peptide amino acid sequence without modifications.

Base Sequences Mapped

Best Peptide Notch QValue

Best Peptide Score

Contaminant -- Specifies if the peptide's parent protein is from a contaminant database, "Y", or not "N".

Cumulative Decoy -- The target/decoy approach for determination of FDR yields lists of peptides and proteins matching either target or decoy. These commingled lists are sorted by score. The top scoring decoy match is labeled as 1. Each additional match to decoy is incremented by one. The total count of decoy matches scoring at or above a particular score at any point in the list is reported as Cumulative Decoy. Cumulative Decoy divided by Cumulative Target is the FDR.

Cumulative Decoy Notch -- The cumulative number of decoys specific to the specified notch.

Cumulative Target -- The target/decoy approach for determination of FDR yields lists of peptides and proteins matching either target or decoy. These commingled lists are sorted by score. The top scoring target match is labeled as 1. Each additional match to target is incremented by one. The total count of target matches scoring at or above a particular score at any point in the list is reported as Cumulative Target. Cumulative Decoy divided by Cumulative Target is the FDR.
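
The counting described for Cumulative Target and Cumulative Decoy can be sketched as a running tally over the score-sorted list. This is an illustrative sketch, not MetaMorpheus code: the function name `cumulative_counts` and the `(score, is_decoy)` tuple layout are assumptions made for the example, and ties in score are not given any special handling.

```python
def cumulative_counts(matches):
    """matches: list of (score, is_decoy) tuples, in any order.

    Returns one (score, cumulative_target, cumulative_decoy, ratio) tuple
    per match, ordered from the highest score to the lowest.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    rows = []
    cum_target = 0
    cum_decoy = 0
    for score, is_decoy in ranked:
        if is_decoy:
            cum_decoy += 1
        else:
            cum_target += 1
        # Guard against division by zero when a decoy outranks every target.
        ratio = cum_decoy / cum_target if cum_target else float("inf")
        rows.append((score, cum_target, cum_decoy, ratio))
    return rows


# Hypothetical scores, for illustration only.
example = [(12.3, False), (11.8, False), (10.1, True), (9.7, False), (8.2, True)]
for row in cumulative_counts(example):
    print(row)
```

Each printed row carries the two cumulative counts at that point in the list, followed by their ratio (cumulative decoys divided by cumulative targets).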

Cumulative Target Notch -- The cumulative number of targets specific to the specified notch.

Decoy -- Specifies if the peptide is a decoy peptide "Y", or not "N".

Decoy/Contaminant/Target -- Each peptide spectral match, unique peptide and protein is assigned as decoy (D)/contaminant (C)/or target (T). The preference in assignment is D>C>T.
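
The D>C>T preference can be read as a simple priority check. The sketch below is a hypothetical helper (`classify` and its two boolean flags are assumptions for illustration, not part of MetaMorpheus) showing that a match flagged as both decoy and contaminant would be labeled D.

```python
def classify(is_decoy, is_contaminant):
    """Return 'D', 'C', or 'T' using the preference D > C > T."""
    if is_decoy:
        return "D"  # decoy outranks everything else
    if is_contaminant:
        return "C"  # contaminant outranks target
    return "T"


print(classify(is_decoy=True, is_contaminant=True))
print(classify(is_decoy=False, is_contaminant=True))
print(classify(is_decoy=False, is_contaminant=False))
```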

Delta Score -- The MetaMorpheus score difference between the reported peptide and the next highest scoring peptide. If the next highest scoring peptide has the same score, both peptides are reported in the same row (ambiguity) and the next lower-scoring peptide is used for the delta score instead. Thus, a delta score of 0 is not possible.
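
One way to picture the Delta Score rule is to skip any candidates tied with the best score before taking the difference. This is a minimal sketch under that reading; `delta_score` is a hypothetical helper, and returning `None` when there is no lower-scoring candidate is a choice made for the example, not documented MetaMorpheus behavior.

```python
def delta_score(candidate_scores):
    """Difference between the best score and the next strictly lower score."""
    best = max(candidate_scores)
    # Candidates tied with the best score are treated as ambiguous and skipped.
    lower = [s for s in candidate_scores if s < best]
    if not lower:
        return None  # no runner-up to compare against (sketch-only convention)
    return best - max(lower)


print(delta_score([10.5, 10.5, 7.25]))  # 3.25
```

Because tied candidates are skipped, the result under this reading is always greater than zero whenever it is defined.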

Different Peak Matches

eScore -- An expectation score for the PSM (high is good). This is only reported if e-Value calculations are specified in the search task.

Essential Sequence -- The full sequence containing only database-defined modifications, without fixed/variable modifications.

eValue -- The expectation value for the PSM (low is good). This is only reported if e-Value calculations are specified in the search task.

File Name -- The filename and path that contained the scan used in the identification.

Full Sequence -- The complete peptide sequence containing all variable and localized modifications.

Full Sequences Mapped

Gene -- The gene name associated with the protein.

Gene Name -- The gene name associated with the identified peptide's parent protein.

Identified Sequence Variations -- If the search was conducted using a database containing annotated sequence variants, this column displays the sequence variant that is identified by the PSM or peptide. To be considered identified, the variant must satisfy one of two conditions. The first condition is that the peptide and variant intersect, creating a base sequence different from the one that would be observed by digestion of the canonical, non-variant protein. The second condition is that the variant creates a proteolytic cleavage site, which results in a peptide sequence that would not exist in the non-variant protein.

Improvement Possible -- The increase in the MetaMorpheus score produced by localization of the modification to the position specified in the Full Sequence.

Intensity

Localized Scores -- If there is no ambiguity (only one peptide was assigned), then there is an attempt to localize the mass difference between the experimental and theoretical precursor masses. This mass difference is "placed" on each possible amino acid, and the resulting peptide score is calculated and reported in this column. Each reported score represents an amino acid (N-to-C) that the mass difference was "localized" to.

Mass Diff (Da) -- The absolute mass difference between the observed and theoretical precursor mass. (Calculated as observed-theoretical).

MassDiff (ppm) -- The ppm mass difference between the observed and theoretical precursor mass. (Calculated as observed-theoretical).
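
The two mass difference columns can be related with a short calculation. In this sketch the function names are hypothetical, and dividing by the theoretical mass for the ppm value is an assumption (the glossary only states that the difference is observed minus theoretical).

```python
def mass_diff_da(observed_mass, theoretical_mass):
    """Signed mass difference in daltons (observed - theoretical)."""
    return observed_mass - theoretical_mass


def mass_diff_ppm(observed_mass, theoretical_mass):
    """Signed mass difference in parts per million.

    Assumption: the theoretical mass is used as the reference (denominator).
    """
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


# Hypothetical precursor masses, for illustration only.
print(mass_diff_da(1000.002, 1000.0))
print(mass_diff_ppm(1000.002, 1000.0))
```

A positive value means the observed precursor is heavier than the theoretical peptide; a negative value means it is lighter.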

Matched Ion Counts -- The number of product ions found for each series.

Matched Ion Series -- The found product ions and their respective charges.

Matched Ion Mass Diff (Da) -- The absolute mass differences between the observed and theoretical product ion masses. (Calculated as observed-theoretical). The order matches that of Matched Ion Series.

Matched Ion Mass Diff (Ppm) -- The ppm mass differences between the observed and theoretical product ion masses. (Calculated as observed-theoretical). The order matches that of Matched Ion Series.

Matched Ion Mass-To-Charge Ratios -- The theoretical m/z values that were matched to the observed spectrum.

Matched Ion Intensities -- The observed intensities for the matched product ions.

Missed Cleavages -- Digestion of proteins into peptides by proteases (e.g. trypsin) is occasionally incomplete. Users may specify the number of allowed missed cleavages to compensate for incomplete digestion of proteins. This expands the search space. Commonly, two missed cleavages are allowed. However, observation of missed cleavages should be expected as a rare event. For non-specific digestions, the number of missed cleavages is often reported as (the peptide length - 1).
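
As a rough illustration of how missed cleavages can be counted for a tryptic peptide, the sketch below counts internal K and R residues. It is a simplification under stated assumptions: cleavage is taken to occur after every K or R, with no exception for a following proline, and both function names are hypothetical.

```python
def count_missed_cleavages(peptide):
    """Count K/R residues that are not at the peptide's C-terminus.

    Assumes a simplified trypsin rule: cleave after every K or R.
    """
    internal = peptide[:-1]  # the C-terminal residue is the expected cleavage site
    return sum(1 for residue in internal if residue in "KR")


def nonspecific_missed_cleavages(peptide):
    """Value often reported for non-specific digestion: peptide length - 1."""
    return len(peptide) - 1


print(count_missed_cleavages("PEPTIDEK"))        # 0
print(count_missed_cleavages("PEKTIDER"))        # 1
print(nonspecific_missed_cleavages("PEPTIDEK"))  # 7
```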

Modification Info List

Mods -- The name(s) of the modification(s) on the peptide.

Mods Chemical Formulas -- The chemical formula(s) of the identified modification(s).

Mods Combined Chemical Formula -- The aggregated chemical formula of all identified modifications.

MS2 Retention Time

Next Amino Acid -- Amino acid in the protein that is next in line on the C-terminal end.

Notch -- A narrow mass window in which there is an allowed mass difference between the experimentally observed peptide and the best matching theoretical peptide. The reported value is an arbitrary number that signifies the notch's category.

Num Charge States Observed -- The number of unique charge states a precursor peptide was found to exist as. Only MS1 evidence is required for an observation.

Num Experimental Peaks -- The number of experimental peaks (post-peak trimming) in the MS2 scan.

Num Variable Mods -- The number of variable modifications matched to a peptide.

Number of Peptides -- (Protein Groups Output) -- Number of unique+shared peptides observed that match to the specified protein group.

Number of Proteins in Group -- (Protein Groups Output) -- The number of proteins in the protein group. Multiple proteins are associated with a peptide identification when parsimony cannot distinguish between the options.

Number of PSMs -- (Protein Groups Output) -- The number of peptide spectral matches with a Q-Value <0.01 observed for all peptides assigned to the protein group.

Number of Unique Peptides -- (Protein Groups Output) -- Number of unique peptides for the protein group. See Unique Peptides definition.

Organism

Organism Name -- The database specified organism that the peptide's parent protein originated from.

Peak Apex Mass Error (ppm)

Peak Charge

Peak Detection Type

Peak intensity

Peak MZ

Peak RT Apex

Peak RT End

Peak RT Start

Peak Split Valley RT

Peptide Description -- A brief statement regarding the peptide's digestion.

Peptide Monoisotopic Mass -- The mass of the peptide calculated from atoms in their most abundant isotopic form (12C, 16O, 14N, etc.). This is the uncharged (neutral) mass.

Precursor Charge -- The charge of the isolated precursor peptide.

Precursor Mass -- The neutral (uncharged) mass of the peptide.

Precursor MZ -- The mass-to-charge ratio of the isolated precursor peptide. This is not necessarily the selected MZ for isolation.
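
Precursor Charge, Precursor Mass, and Precursor MZ are related by the standard conversion for positively charged (protonated) ions. The sketch below shows that relationship; the function names are hypothetical and the code is not taken from MetaMorpheus.

```python
PROTON_MASS = 1.007276466  # mass of a proton in daltons


def to_mz(neutral_mass, charge):
    """m/z of a positively charged ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def to_neutral_mass(mz, charge):
    """Neutral (uncharged) mass recovered from an observed m/z and charge."""
    return mz * charge - charge * PROTON_MASS


# Hypothetical peptide of neutral mass 1000 Da observed at charge 2.
mz = to_mz(1000.0, 2)
print(mz)
print(to_neutral_mass(mz, 2))
```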

Precursor Scan Number -- The scan number of the most recent MS1 scan.

Previous Amino Acid -- The amino acid in the protein preceding the specified peptide.

Protein Accession -- The accession number of the protein as specified in the protein database.

Protein Cumulative Decoy

Protein Cumulative Target

Protein Decoy/Contaminant/Target

Protein Full Name -- The full name of the protein as specified in the protein database.

Protein Group

Protein Name -- The full name of the peptide's parent protein.

Protein QValue

Protein Unmodified Mass

PSMs Mapped

QValue -- The q-value for the identification, calculated as the number of cumulative decoys (false positives) divided by the number of cumulative targets (true positives).

QValue Notch -- The notch specific q-value for the identification, calculated as the number of cumulative notch decoys (false positives) divided by the number of cumulative notch targets (true positives).

Scan Number -- The scan number is specified in the header of each scan. The scan number reported usually contains the MS2 data used in the peptide spectral match. It is possible for multiple co-isolated peptides to be matched to the same scan number.

Scan Retention Time -- The experimental time (in minutes) that the scan was acquired.

Score -- The MetaMorpheus score is incremented by one for each matching b- and y-ion (c- and z-dot ions are used for ETD). The number after the decimal is the fraction of total peak intensity from the MS2 scan that can be assigned to the particular peptide spectral match.
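
The two parts of the score can be illustrated with a small calculation: the integer part counts matched fragment ions, and the decimal part is the matched share of the scan's intensity. This is a sketch of the description above, not the MetaMorpheus implementation; `sketch_score` and its arguments are hypothetical.

```python
def sketch_score(matched_intensities, total_intensity):
    """Matched-ion count plus the fraction of total MS2 intensity matched.

    matched_intensities: intensities of the peaks matched to fragment ions.
    total_intensity: summed intensity of every peak in the MS2 scan.
    """
    matched_count = len(matched_intensities)
    matched_fraction = sum(matched_intensities) / total_intensity
    return matched_count + matched_fraction


# Three matched ions accounting for half of the scan's intensity.
print(sketch_score([100.0, 50.0, 50.0], 400.0))  # 3.5
```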

Sequence Coverage -- (Protein Groups Output) -- Displays amino acids in the protein observed in any peptide spectral match with a Q value <0.01 for each protein in the group, with the "|" character as the delimiter. Lowercase residues were not observed. Uppercase residues were observed.

Sequence Coverage % -- (Protein Groups Output) -- The fraction of amino acids in the protein observed in any peptide spectral match with a Q value <0.01.
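
The uppercase/lowercase convention and the coverage fraction can be sketched for a single protein as follows. The function names and the one-based, inclusive `(start, end)` span format are assumptions for illustration; filtering PSMs by Q value and joining several proteins with "|" are left out.

```python
def coverage_string(protein, peptide_spans):
    """Uppercase observed residues, lowercase unobserved ones.

    peptide_spans: one-based, inclusive (start, end) positions of
    identified peptides within the protein.
    """
    observed = [False] * len(protein)
    for start, end in peptide_spans:
        for i in range(start - 1, end):
            observed[i] = True
    return "".join(
        aa.upper() if seen else aa.lower()
        for aa, seen in zip(protein, observed)
    )


def coverage_fraction(coverage):
    """Fraction of residues that are uppercase (observed)."""
    return sum(1 for aa in coverage if aa.isupper()) / len(coverage)


# Hypothetical protein with two identified peptides.
cov = coverage_string("MKTAYIAKQR", [(2, 4), (8, 10)])
print(cov)                     # mKTAyiaKQR
print(coverage_fraction(cov))  # 0.6
```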

Sequence Coverage with Mods

Sequence Variations -- Sequence variants identified.

Shared Peptides -- (Protein Groups Output) -- Peptides that are shared between multiple proteins in the protein database(s) used for the search are listed.

Splice Sites -- If the search was conducted using a database that contains annotated splice sites, this column contains splice sites which the PSM or peptide crossed.

Start and End Residues In Protein -- The one-based amino acid positions of the peptide in the parent protein(s).
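
One-based positions differ from the zero-based indices most programming languages use, so a short sketch may help. `residue_positions` is a hypothetical helper that locates a peptide in a protein by plain substring search, which is an assumption of the example rather than how MetaMorpheus maps peptides.

```python
def residue_positions(protein, peptide):
    """Return one-based, inclusive (start, end) positions of every occurrence."""
    positions = []
    start = protein.find(peptide)
    while start != -1:
        # Convert the zero-based index to one-based; the end is inclusive.
        positions.append((start + 1, start + len(peptide)))
        start = protein.find(peptide, start + 1)
    return positions


print(residue_positions("MKTAYIAKQR", "TAY"))  # [(3, 5)]
```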

Theoretical MZ

Theoreticals Searched -- The number of theoretical peptides searched against the spectrum. This is only reported if e-Value calculations are specified in the search task.

Total Ion Current -- The total ion current of the MS2 spectrum. This is the sum of intensities from every MS2 peak. These intensities can come from fragmentation of multiple precursors depending on the selectivity for fragmentation (aka isolation width) and crowdedness of the MS1 spectrum.

Unique Peptides -- (Protein Groups Output) -- Peptides that are unique to the listed protein (they can only come from that one protein, based on the database in silico digestion). Currently, peptides that are unique to the group are not listed here; i.e., a protein group with >1 protein will always have 0 unique peptides because they are shared between all proteins in the group.

Write Excel Compatible TSVs -- Microsoft Excel only allows 32000 characters in a single cell. If the cell contains more than 32000 characters, the cell wraps to the next line, which can result in "garbled"-looking output. You can make MetaMorpheus write Excel-compatible output by going to Settings -> Close MetaMorpheus and open global settings file -> change "WriteExcelCompatibleTSVs" from "false" to "true". Note that by enabling this setting, if the line is too long for Excel, "Too long for Excel" will be written instead of the expected output.
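
The effect of the setting can be pictured as a length check before writing. This is only a sketch: `excel_safe` is a hypothetical helper, and applying the check per cell rather than per line is an assumption, since the exact rule MetaMorpheus uses is not spelled out here.

```python
EXCEL_CELL_LIMIT = 32000  # character limit quoted in the entry above


def excel_safe(value, limit=EXCEL_CELL_LIMIT):
    """Return the value unchanged, or a placeholder if it exceeds the limit."""
    if len(value) > limit:
        return "Too long for Excel"
    return value


print(excel_safe("PEPTIDEK"))
print(excel_safe("A" * 40000))
```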