From f5c524d8dd5d516a325231cd2b9221d16b7db436 Mon Sep 17 00:00:00 2001
From: skschum
Date: Mon, 13 May 2019 09:30:06 -0400
Subject: [PATCH] Update README.md

---
 README.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 7f7b14d..cd2bec6 100644
--- a/README.md
+++ b/README.md
@@ -2,22 +2,21 @@

 ## Package Overview and References

-The MFAssignR package was designed for multi-element molecular formula (MF) assignment of ultrahigh resolution mass spectrometry measurements. A number of tools for internal mass recalibration, MF assignment, signal-to-noise evaluation, and unambiguous MF assignments are provided. This package contains MFAssign(), MFAssign_RMD(), MFAssignCHO(), MFAssignCHO_RMD(), MFAssignAll(), MFAssignAll_MSMS(), SNplot(), HistNoise(), KMDNoise(), RecalList(), Recal(), and IsoFiltR() described in the sections below. Note, the functions with “RMD” were designed to be run within an R Markdown file and are otherwise identical to the corresponding non-”RMD” versions. To learn more, please see the section titled “Semi-Automated MFAssignR Functions”. User caution with the function parameter settings and output evaluation is required; thus, several function outputs are provided to assist the user with these evaluations.
+The MFAssignR package was designed for multi-element molecular formula (MF) assignment of ultrahigh resolution mass spectrometry measurements. A number of tools for internal mass recalibration, MF assignment, signal-to-noise evaluation, and unambiguous MF assignments are provided. This package contains MFAssign(), MFAssign_RMD(), MFAssignCHO(), MFAssignCHO_RMD(), SNplot(), HistNoise(), KMDNoise(), RecalList(), Recal(), Recal_2(), RecalX(), Recal_2X(), and IsoFiltR() described in the sections below. Note, the functions with “RMD” were designed to be run within an R Markdown file and are otherwise identical to the corresponding non-”RMD” versions. To learn more, please see the section titled “Semi-Automated MFAssignR Functions”. User caution with the function parameter settings and output evaluation is required; thus, several function outputs are provided to assist the user with these evaluations.

 ## Molecular Formula (MF) Assignment

-The MF assignment algorithm in MFAssign was adapted from the low mass moiety CHOFIT assignment algorithm developed by Green and Perdue (2015). In total there are 4 versions of MF Assign, including MFAssign(), MFAssignCHO(), MFAssignAll(), and MFAssignAll_MSMS(). Where MFAssign(), MFAssignAll(), and MFAssignAll_MSMS() include external nested loops to assign additional heteroatoms, as described in Green and Perdue (2015) while MFAssignCHO() does not. Briefly, the CHOFIT algorithm uses low mass moieties such as CH4O-1 and C4O-3 to move around in the O/C and H/C space to assign MF with C, H, and O (CHO MF). These low mass moieties efficiently assign CHO MF without conventional loops. Additional combinatorial assignments with various heteroatoms are made using nested loops that subtract the mass of a heteroatom from the measured ion mass, creating a CHO “core” mass, which can then be assigned using the low mass moiety CHOFIT approach. This is further explained in Green and Perdue (2015) and Perdue and Green (2015).
+The MF assignment algorithm in MFAssign was adapted from the low mass moiety CHOFIT assignment algorithm developed by Green and Perdue (2015). In total there are 2 versions of MF Assign, including MFAssign() and MFAssignCHO(). MFAssign() includes external nested loops to assign additional heteroatoms, as described in Green and Perdue (2015) while MFAssignCHO() does not. Briefly, the CHOFIT algorithm uses low mass moieties such as CH4O-1 and C4O-3 to move around in the O/C and H/C space to assign MF with C, H, and O (CHO MF). These low mass moieties efficiently assign CHO MF without conventional loops. Additional combinatorial assignments with various heteroatoms are made using nested loops that subtract the mass of a heteroatom from the measured ion mass, creating a CHO “core” mass, which can then be assigned using the low mass moiety CHOFIT approach. This is further explained in Green and Perdue (2015) and Perdue and Green (2015).

 ### MFAssign()
 Using the low mass moiety and combinatorial assignment approach, MFAssign() can be used to assign MF with 12C, 1H, and 16O and a variety of heteroatoms and isotopes, including 2H, 13C, 14N, 15N, 31P, 32S, 34S, 35Cl, 37Cl,and 19F. It can also assign Na+ adducts, which are common in positive ion mode. Due to the increasing number of chemically reasonable MF with the increasing number of possible elements and increasing molecular weight, the output will provide a list of ambiguous and unambiguous MF. Advanced Kendrick mass and z* sorting tools are used to reduce the number of ambiguous MF in MFAssign(). First, Kendrick mass defect (KMD) and z* values are calculated with a CH2 Kendrick base to sort the measured masses into CH2 homologous series (Stenson et al., 2003). The function then selects 1 to 3 members of each CH2 homologous series with masses below the user defined cutoff and attempts to assign MF. The ambiguous MF are then returned to the unassigned list. Then, the unambiguous MF are used as seeds for additional assignments using CH2, O, H2, H2O, and CH2O MF extensions (Kujawinski and Behn, 2006). To do the formula extensions the KMD and z* values for each of these bases are calculated and then used to assign MF through the addition or subtraction of the series bases. MFAssign() (and MFAssignCHO()) tracks how many different “paths” can be used to assign each MF and if a single mass has multiple MF, the function will choose the MF that has the largest number of paths that intercept with it. For example, if a single mass has two possible MF and one has 20 potential “paths” to it, while the other has 4, the function will choose the MF with 20 paths. Work is ongoing to track these paths and the removed MF in the data frame output of these functions. Overall, the multi-path MF extension approach greatly reduces the number of ambiguous assignments and provides an increased level of confidence in the final MF list because the MF are related to unambiguous MF assigned below the user defined cut point. An additional step to decrease the number of ambiguous and/or incorrect sulfur assignments was also added. This step requires that for a sulfur containing compound to act as a seed it must be unambiguous and have a matching 34S peak, when both monoisotopic and isotopic mass lists from the IsoFiltR() function are are assigned MF. This has been implemented for all versions of the MFAssign functions.

+To allow for more ambiguity in the formula assignment there is the "Ambig" parameter which can be turned "on" or "off". This option turns off the path choosing step for formula assignment, described above, which allows for more assignments for each mass to be kept. Additionally, the "MSMS" parameter is present, which can help to assign molecular formulas in a data set that is not very continuous with respect to homologous series, such as MS/MS data. What it does is remove the pre-filtering of masses below the DeNovo threshold, meaning that all masses below that point will be assigned directly. This causes the function to run somewhat slower, but can help to get better assignments. These parameters replace the MFAssignAll() and MFAssignMSMS() functions from previous versions (<= v.0.0.2).
+
 ### MFAssignCHO()
 MFAssignCHO() is a simplified version of MFAssign() used only to assign MF with CHO elements. MFAssignCHO() runs faster than MFAssign() and is best used as a preliminary MF assignment step prior to the selection of recalibrant ions in conjunction with MFRecalList() and MFRecalCheck(), which are described below.

-### MFAssignAll() and MFAssignAll_MSMS()
-MFAssignAll() uses the low mass moiety and combinatorial assignment approach with a simplified MF extension approach. However, only CH2 and H2O formula extensions are used for MF assignment. This function results in a significantly higher number of ambiguous MF and was intended to be used after MFAssign() or on short mass lists without a complex mixture. MFAssignAll_MSMS() is a further simplified version of MFAssignAll(), which runs somewhat slower, but is more effective for assigning small mass lists with very few homologous series relationships as can be observed in MS/MS data.
-
 ## Isotope Filtering
 The IsoFiltR() function can identify many of the 13C and 34S isotope masses, which when removed from the mass list can lower the number of peaks assigned with an incorrect MF. This function operates on a two column data frame using the same structure as the MFAssign() function.