diff --git a/MFAssignR/vignettes/MFAssignR Vignette.Rmd b/MFAssignR/vignettes/MFAssignR Vignette.Rmd index d1bab7e..2ab0844 100644 --- a/MFAssignR/vignettes/MFAssignR Vignette.Rmd +++ b/MFAssignR/vignettes/MFAssignR Vignette.Rmd @@ -41,7 +41,7 @@ The MF assignment algorithm in MFAssign was adapted from the low mass moiety CHO ###MFAssign() Using the low mass moiety and combinatorial assignment approach, MFAssign() can be used to assign MF with 12C, 1H, and 16O and a variety of heteroatoms and isotopes, including 2H, 13C, 14N, 15N, 31P, 32S, 34S, 35Cl, 37Cl,and 19F. It can also assign Na+ adducts, which are common in positive ion mode. Due to the increasing number of chemically reasonable MF with the increasing number of possible elements and increasing molecular weight, the output will provide a list of ambiguous and unambiguous MF. -Advanced Kendrick mass and z* sorting tools are used to reduce the number of ambiguous MF in MFAssign(). First, Kendrick mass defect (KMD) and z* values are calculated with a CH2 Kendrick base to sort the measured masses into CH2 homologous series (Stenson et al., 2003). The function then selects 1 to 3 members of each CH2 homologous series with masses below the user defined cutoff and attempts to assign MF. The ambiguous MF are then returned to the unassigned list. Then, the unambiguous MF are used as seeds for additional assignments using CH2, O, H2, H2O, and CH2O MF extensions (Kujawinski and Behn, 2006). To do the formula extensions the KMD and z* values for each of these bases are calculated and then used to assign MF through the addition or subtraction of the series bases. MFAssign() (and MFAssignCHO()) tracks how many different “paths” can be used to assign each MF and if a single mass has multiple MF, the function will choose the MF that has the largest number of paths that intercept with it. 
For example, if a single mass has two possible MF and one has 20 potential “paths” to it, while the other has 4, the function will choose the MF with 20 paths. Work is ongoing to track these paths and the removed MF in the data frame output of these functions. Overall, the multi-path MF extension approach greatly reduces the number of ambiguous assignments and provides an increased level of confidence in the final MF list because the MF are related to unambiguous MF assigned below the user defined cut point. An additional step to decrease the number of ambiguous and/or incorrect sulfur assignments was also added. This step requires that for a sulfur containing compound to act as a seed it must be unambiguous and have a matching 34S peak, when both monoisotopic and isotopic mass lists from the IsoFiltR() function are are assigned MF. This has been implemented for all versions of the MFAssign functions. +Advanced Kendrick mass and z* sorting tools are used to reduce the number of ambiguous MF in MFAssign(). First, Kendrick mass defect (KMD) and z* values are calculated with a CH2 Kendrick base to sort the measured masses into CH2 homologous series (Stenson et al., 2003). The function then selects 1 to 3 members of each CH2 homologous series with masses below the user defined cutoff and attempts to assign MF. The ambiguous MF are then returned to the unassigned list. Then, the unambiguous MF are used as seeds for additional assignments using CH2, O, H2, H2O, and CH2O MF extensions (Kujawinski and Behn, 2006). To do the formula extensions the KMD and z* values for each of these bases are calculated and then used to assign MF through the addition or subtraction of the series bases. MFAssign() (and MFAssignCHO()) tracks how many different “paths” can be used to assign each MF and if a single mass has multiple MF, the function will choose the MF that has the largest number of paths that intercept with it. 
For example, if a single mass has two possible MF and one has 20 potential “paths” to it, while the other has 4, the function will choose the MF with 20 paths. Work is ongoing to track these paths and the removed MF in the data frame output of these functions. Overall, the multi-path MF extension approach greatly reduces the number of ambiguous assignments and provides an increased level of confidence in the final MF list because the MF are related to unambiguous MF assigned below the user defined cut point. An additional step to decrease the number of ambiguous and/or incorrect sulfur assignments was also added. This step requires that for a sulfur containing compound to act as a seed it must be unambiguous and have a matching 34S peak, when both monoisotopic and isotopic mass lists from the IsoFiltR() function are assigned MF. This has been implemented for all versions of the MFAssign functions as the "SulfCheck" parameter, which can be turned "on" or "off". To allow for more ambiguity in the formula assignment, the "Ambig" parameter can also be turned "on" or "off". When enabled, this option skips the path choosing step described above, so more candidate assignments are kept for each mass. Additionally, the "MSMS" parameter can help to assign molecular formulas in a data set that is not very continuous with respect to homologous series, such as MS/MS data. It removes the pre-filtering of masses below the DeNovo threshold, so all masses below that point are assigned directly. This makes the function run somewhat slower, but can improve the assignments. These parameters replace the MFAssignAll() and MFAssignMSMS() functions from previous versions (<= v.0.0.3).