fix parsing PSMs and complete protein names in XTandem #83

Merged: 10 commits, Jul 10, 2024

@julianu julianu commented May 3, 2024

[edited after adding fix for PSM parsing]

  1. Because XTandem tends to abbreviate protein names in the protein "label" tag, read the protein name from the "note" tag instead.

  2. Although XTandem saves only the highest-scoring PSMs per spectrum, there can still be more than one PSM, with different peptidoforms, if the scores are exactly the same. This is not an extremely rare case, especially with nearly identical peptides (think of a single AA flip in the sequence). This fix combines the identifications that share a peptidoform into one new PSM, with only the relevant proteins assigned to each PSM. Previously, the parser produced protein-to-peptide matches that did not occur in the databases used by XTandem.

  3. As a result, the remark that only one protein per peptide/PSM is parsed no longer seems to hold.
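
The two parsing changes described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the XML fragment is hand-written and heavily simplified, the helper names `protein_name` and `group_by_peptidoform` are hypothetical, and the bare `seq` attribute stands in for the peptidoform (a real parser would also take modifications into account).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written, simplified fragment for illustration only; real XTandem
# output carries many more attributes and elements.
EXAMPLE_XML = """
<group>
  <protein label="sp|P1|PROT1_HUMAN Prot...">
    <note label="description">sp|P1|PROT1_HUMAN Protein one</note>
    <peptide><domain seq="PEPTIDE" hyperscore="30.1"/></peptide>
  </protein>
  <protein label="sp|P2|PROT2_HUMAN Prot...">
    <note label="description">sp|P2|PROT2_HUMAN Protein two</note>
    <peptide><domain seq="PEPTIDE" hyperscore="30.1"/></peptide>
  </protein>
  <protein label="sp|P3|PROT3_HUMAN">
    <peptide><domain seq="PEPTDIE" hyperscore="30.1"/></peptide>
  </protein>
</group>
"""


def protein_name(protein):
    """Prefer the full name in <note>; fall back to the 'label' attribute."""
    note = protein.find("note")
    if note is not None and note.text:
        return note.text.strip()
    return protein.get("label", "")


def group_by_peptidoform(group):
    """Map each peptidoform of one spectrum to the proteins reported for it."""
    proteins_by_peptidoform = defaultdict(list)
    for protein in group.iter("protein"):
        name = protein_name(protein)
        for domain in protein.iter("domain"):
            # Assumption: the plain sequence identifies the peptidoform here.
            peptidoform = domain.get("seq")
            if name not in proteins_by_peptidoform[peptidoform]:
                proteins_by_peptidoform[peptidoform].append(name)
    return dict(proteins_by_peptidoform)


psms = group_by_peptidoform(ET.fromstring(EXAMPLE_XML.strip()))
for peptidoform, proteins in psms.items():
    print(peptidoform, proteins)
```

In this sketch the two equally scored peptidoforms end up as separate entries, each carrying only the proteins that were reported for it, instead of one entry that mixes all three proteins.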

@julianu changed the title from "small fix for complete protein names in XTandem" to "fix to parse PSMs and complete protein names in XTandem" on May 7, 2024
@julianu changed the title from "fix to parse PSMs and complete protein names in XTandem" to "fix parsing PSMs and complete protein names in XTandem" on May 7, 2024
julianu commented May 7, 2024

I updated the comment for the initial PR, as there were some further additions to it.

@RalfG RalfG requested a review from paretje July 4, 2024 11:44
codecov bot commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 15.38462% with 11 lines in your changes missing coverage. Please review.

Project coverage is 63.97%. Comparing base (6e51896) to head (5d01b6f).
Report is 2 commits behind head on main.

Files Patch % Lines
psm_utils/io/xtandem.py 15.38% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #83      +/-   ##
==========================================
- Coverage   64.12%   63.97%   -0.16%     
==========================================
  Files          26       26              
  Lines        2492     2498       +6     
==========================================
  Hits         1598     1598              
- Misses        894      900       +6     
Flag Coverage Δ
unittests 63.97% <15.38%> (-0.16%) ⬇️

@paretje paretje left a comment

Looks good to me, thank you for your contribution. I added a test case and made a slight change.

@paretje paretje merged commit beaa5c9 into compomics:main Jul 10, 2024
5 checks passed
@RalfG RalfG added this to the v0.9.1 milestone Jul 17, 2024