Gptmd approach update #2419
base: master
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2419      +/-   ##
==========================================
+ Coverage   93.67%   93.89%   +0.21%
==========================================
  Files         141      141
  Lines       21925    22034     +109
  Branches     3007     3020      +13
==========================================
+ Hits        20539    20688     +149
+ Misses        934      902      -32
+ Partials      452      444       -8
…taMorpheus into gptmdApproachUpdate
@@ -62,18 +79,13 @@ protected override MyTaskResults RunSpecific(string OutputFolder, List<DbForTask
ProseCreatedWhileRunning.Append("precursor mass tolerance(s) = {" + tempSearchMode.ToProseString() + "}; ");

ProseCreatedWhileRunning.Append("product mass tolerance = " + CommonParameters.ProductMassTolerance + ". ");
ProseCreatedWhileRunning.Append("The combined search database contained " + proteinList.Count(p => !p.IsDecoy) + " non-decoy protein entries including " + proteinList.Where(p => p.IsContaminant).Count() + " contaminant sequences. ");
Is there a reason we got rid of this prose line?

// if a variant protein and the mod is on the variant, index to the variant protein sequence
if (modIsOnVariant)
if (CommonParameters.DissociationType == DissociationType.Autodetect)
On this line, if the dissociation type is Autodetect, we fragment with all possible fragmentation types. On line 120, if it is Autodetect, we fragment with the type indicated in the scan header. What is the reason for the two different approaches?
@@ -217,13 +217,13 @@ public static void TestModificationInfoListInProteinGroupsOutput()
int totalNumberOfMods = proteins.Sum(p => p.OneBasedPossibleLocalizedModifications.Count + p.SequenceVariations.Sum(sv => sv.OneBasedModifications.Count));

//tests that modifications are being done correctly
Assert.AreEqual(0, totalNumberOfMods);
Changing an expected value directly under the comment "tests that modifications are being done correctly" is worrisome.
GPTMD is promiscuous in adding potential modifications to the XML database. This PR restricts the candidate modifications that are added to those that produce the highest score for each possible PTM. The high-level details of the new algorithm are as follows:
For bottom-up:
Six Mann A549 files searched with a human FASTA.
Old method added 200513 mods; new method added 128449 mods.
Old method: 102324 PSMs; new: 103546.
Old: 39283 peptides; new: 39277.
Old: 6042 proteins; new: 6012.
For top-down:
14 fractions x 2 technical replicates of Jurkat top-down files from the Sean Dai paper.
Old method added 19188 mods; new method added 11013 mods.
Old method: 23688 PSMs; new: 24022.
Old: 904 proteoforms; new: 899.
Old: 279 proteins; new: 273.
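The selection rule stated in the description (keep only the candidate modifications that produce the highest score for each possible PTM) could be sketched roughly as below. This is an illustration only, not the code in this PR: the `CandidateMod` record, the `GptmdSketch` class, the `KeepBestScoring` method, and the choice to group candidates by scan and modification are all assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; not a MetaMorpheus class.
// One candidate placement of a modification, with the score it produced.
public record CandidateMod(int ScanId, string ModId, int OneBasedPosition, double Score);

public static class GptmdSketch
{
    // Assumption: "each possible PTM" is modeled as a (scan, modification) pair.
    // Within each group, keep only the placements that reach the group's top score.
    // Ties are all kept, since they are equally well supported.
    public static List<CandidateMod> KeepBestScoring(IEnumerable<CandidateMod> candidates)
    {
        return candidates
            .GroupBy(c => (c.ScanId, c.ModId))
            .SelectMany(group =>
            {
                double best = group.Max(c => c.Score);
                return group.Where(c => c.Score == best);
            })
            .ToList();
    }
}
```

With three candidates for one scan (two placements of the same modification with different scores, plus one placement of another modification), this sketch would keep two entries: the better-scoring placement of the first modification and the single placement of the second.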
Additional updates: