What are the principles for manual curation when treating tandem insertions separately? #8

Open
TangShan99 opened this issue Nov 10, 2022 · 1 comment

Comments

@TangShan99

Hi, I have a question.
I ran into the tandem insertion problem that has been discussed in previous issues. In my results, many pI genes sit next to each other, and the different predicted sequences start and end at different sites, so I can't tell which boundary is the true one. Could you give more detail on the principles for manual curation when treating tandem insertions separately?

@simroux
Owner

simroux commented Nov 10, 2022

Hi,

I don't know of any efficient approach that would resolve tandem insertions. One thing you can try is to extract the whole region (i.e. the multiple tandem insertions) and get a dot plot (e.g. via a self-blast on NCBI). This may help you identify repeat regions that may signal the boundaries of individual insertions. But that assumes these repeats are still present and intact, while my feeling is that these tandem insertions often include some (partially) decayed prophages.
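As a rough local stand-in for the self-blast dot plot, the sketch below lists exact direct repeats in an extracted region by indexing its k-mers. This is an illustration only, not part of the tool: the function names `self_dot_plot` and `direct_repeats` and the `k` parameter are hypothetical, and because it only finds exact matches, decayed repeats with mismatches will show up fragmented or not at all (a real BLAST dot plot tolerates those better). It is also meant for small toy regions; low-complexity sequence will produce many dots.

```python
from collections import defaultdict


def self_dot_plot(seq, k=12):
    """Return sorted (i, j) pairs, i < j, where the k-mer at i equals the k-mer at j."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    dots = []
    for pos_list in positions.values():
        for a in range(len(pos_list)):
            for b in range(a + 1, len(pos_list)):
                dots.append((pos_list[a], pos_list[b]))
    return sorted(dots)


def direct_repeats(seq, k=12):
    """Merge dots lying on the same diagonal into (start1, start2, length) repeats."""
    dots = set(self_dot_plot(seq, k))
    repeats = []
    for i, j in sorted(dots):
        if (i - 1, j - 1) in dots:
            continue  # this dot continues a run that started earlier
        length = k
        # Extend along the diagonal while the next k-mer pair also matches.
        while (i + length - k + 1, j + length - k + 1) in dots:
            length += 1
        repeats.append((i, j, length))
    return repeats


# Toy region (made up): a 10 bp repeat flanking a 15 bp unique stretch.
repeat = "ATGCCGTAGG"
region = "GA" + repeat + "CTTCAGTCAACTGAT" + repeat + "TC"
print(direct_repeats(region, k=6))  # [(2, 27, 10)]
```

Each tuple gives the 0-based start of the two repeat copies and their shared length; pairs of repeats bracketing one pI gene would be candidate boundaries to inspect by hand.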

The other option is to use these regions for what they are, i.e. "hotspots" of inovirus insertions from which one cannot robustly/easily identify individual genome units. You can still count the number of distinct pI to get an approximate number of inoviruses in the region, for instance, but the gene content of each individual inovirus genome is much harder to establish.
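Counting distinct pI in a hotspot can be sketched as below. The input format (a list of `(start, end, name)` tuples), the `"pI"` label, and the function name `count_distinct_pI` are all assumptions made for illustration, not the output format of any particular tool. The sketch also assumes that overlapping pI hits are the same gene reported twice, so it merges them before counting.

```python
def count_distinct_pI(genes, region_start, region_end, label="pI"):
    """Count non-overlapping genes named `label` fully inside the region.

    `genes` is a hypothetical list of (start, end, name) tuples.
    Overlapping hits are treated as one gene (an assumption).
    """
    hits = sorted(
        (start, end)
        for start, end, name in genes
        if name == label and start >= region_start and end <= region_end
    )
    count = 0
    last_end = None
    for start, end in hits:
        if last_end is None or start > last_end:
            count += 1          # a new, separate pI gene
            last_end = end
        else:
            last_end = max(last_end, end)  # overlaps the previous hit
    return count


# Toy annotations (made up) for a 10 kb hotspot.
genes = [
    (100, 1300, "pI"),
    (1200, 1400, "pI"),     # overlaps the first hit, counted once
    (5000, 6200, "pI"),
    (7000, 7500, "other"),
    (20000, 21000, "pI"),   # outside the region
]
print(count_distinct_pI(genes, 0, 10000))  # 2
```

The count is only an approximate number of inoviruses in the region, since decayed prophages may have lost their pI or carry only a fragment of it.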

Hope that helps, and sorry not to have a real solution there!
Best,
Simon


2 participants