Ignoring weird tRNA from Aragorn #245

Merged
merged 5 commits into from
Jul 3, 2024

JeanMainguy
Member

In rare cases, Aragorn predicts RNA genes with negative coordinates.

For example, in contig NZ_CSQN01000072, it predicts the following tRNA:

>NZ_CSQN01000072.1
1 gene found
1   tRNA-Ile                      c[-3,86]	34  	(gat)

Previously, start and stop positions were not validated, so such erroneous coordinates were passed through unchanged. When written to the pangenome file, which stores coordinates as unsigned integers, the -3 wrapped around to 4294967293 (that is, 2^32 - 3).
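The wraparound can be reproduced in a few lines. This is only a sketch, assuming the coordinate column is an unsigned 32-bit integer (which is what the value 4294967293 suggests); the helper name `as_uint32` is hypothetical and is not part of PPanGGOLiN.

```python
def as_uint32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value (two's-complement wraparound)."""
    # Masking with 2**32 - 1 keeps only the low 32 bits, which is what happens
    # when a negative number is stored in an unsigned 32-bit field.
    return value & 0xFFFFFFFF


print(as_uint32(-3))  # 4294967293, i.e. 2**32 - 3
print(as_uint32(86))  # 86, valid coordinates are unaffected
```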

Now, with the updates made in the development branch (particularly in PR #206, which includes management of joined coordinates), extensive checks are performed on coordinates.

It has been decided that PPanGGOLiN should ignore genes predicted by Aragorn with invalid coordinates. tRNAs are only predicted so that any CDS overlapping them can be removed; they have no other use in PPanGGOLiN.

This PR addresses the issue in two scenarios:

  1. If Aragorn predicts a gene with negative coordinates, this gene is ignored and not written to the pangenome file.
  2. If a user uses a pangenome file created with version 2.0.5 or earlier that contains invalid tRNA annotations, these genes are ignored when reading the annotations to maintain compatibility between versions.
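The idea behind both scenarios can be sketched as a parse-then-filter step. This is a minimal illustration and not the code from this PR: the names `RnaPrediction`, `parse_prediction` and `has_valid_coordinates` are hypothetical, the regular expression only covers location fields shaped like the one in the example above, and it assumes coordinates are 1-based so that anything below 1 is invalid.

```python
import re
from typing import NamedTuple, Optional


class RnaPrediction(NamedTuple):
    name: str
    start: int
    stop: int
    strand: str


# Location field as seen in the example output: "c[-3,86]" (complement) or "[120,195]".
LOCATION = re.compile(r"^(c?)\[(-?\d+),(-?\d+)\]$")


def parse_prediction(line: str) -> Optional[RnaPrediction]:
    """Parse one gene line of Aragorn output; return None for any other line."""
    fields = line.split()
    if len(fields) < 3:
        return None
    match = LOCATION.match(fields[2])
    if match is None:
        return None
    complement, start, stop = match.groups()
    strand = "-" if complement else "+"
    return RnaPrediction(fields[1], int(start), int(stop), strand)


def has_valid_coordinates(pred: RnaPrediction) -> bool:
    """Assumption: coordinates are 1-based, so both ends must be at least 1."""
    return pred.start >= 1 and pred.stop >= 1


output = [
    ">NZ_CSQN01000072.1",
    "1 gene found",
    "1   tRNA-Ile                      c[-3,86]\t34  \t(gat)",
]

predictions = [p for p in map(parse_prediction, output) if p is not None]
kept = [p for p in predictions if has_valid_coordinates(p)]
ignored = [p for p in predictions if not has_valid_coordinates(p)]
print(f"kept {len(kept)}, ignored {len(ignored)}")
```

The same predicate could be applied when reading annotations from an older pangenome file, where an invalid tRNA would show up with a wrapped-around coordinate rather than a negative one; detecting that case would need an additional check against the contig length, which is not shown here.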

@JeanMainguy JeanMainguy changed the base branch from master to dev July 1, 2024 17:59
@JeanMainguy
Member Author

Here is the contig file to reproduce the Aragorn gene prediction with negative coordinates, using the following command:

aragorn -t -gcbact -l -w NZ_CSQN01000072.fna.txt

NZ_CSQN01000072.fna.txt

@JeanMainguy JeanMainguy marked this pull request as ready for review July 2, 2024 08:04
@JeanMainguy JeanMainguy merged commit fd495cc into dev Jul 3, 2024
4 checks passed
@JeanMainguy JeanMainguy deleted the fix_weird_trna branch July 24, 2024 15:34