Renamed contigs? #39

Open
JBuongio opened this issue Sep 22, 2022 · 4 comments

Comments

@JBuongio

JBuongio commented Sep 22, 2022

Hi Arkadiy,
Thank you for all that you do! I am noticing that FeGenie renumbers/renames contigs within individual MAGs' depth files. For those of us who cross-reference output from different tools, like anvi'o, would it be possible to retain the original contig names? For example, if I'm looking at output about c_000000001 from anvi'o and I use the same input fastas for FeGenie analysis, it would be really great to have that same c_000000001 contig be tied to the same MAG that anvi'o analyzed. Does this make sense? I double-checked that the input MAG fastas for FeGenie were those made by the SUMMARIZE program within anvi'o.
Thank you!!
Best,
Joy

@JBuongio
Author

I should say also, the contig names in the geneSummary-clusters.csv file and the depth files don't match each other, making it a little hard to check if a contig that has been identified to contain a gene cluster is covered in the metagenome/transcriptome (although the .csv does tell you which MAG it came from). Thanks! =)

@Arkadiy-Garber
Owner

Hey Joy,

Thanks for the note - hope you are well! :)

I was not aware that FeGenie renames contigs. As far as I remember, that is not the intended behavior. Could you please attach or send me an example of a .depth file where the contig names are changed, perhaps with the geneSummary.csv file for reference?

Thanks!
Arkadiy

@JBuongio
Author

JBuongio commented Sep 23, 2022

Thank you! I've placed files that should help in the FeGenie_troubleshooting Google Drive folder that we have used before: geneSummary.csv, geneSummary-clusters.csv, the bin fastas, and the depth files for each bin.
You'll notice that while the geneSummary files have contig names that match the original bin fasta contig names, the contig names in the depth files all restart at c_000000000001. When I grep this contig name in the bin_dir, it appears only in MAG-19, as expected:

`(fegenie) $ grep "c_000000000001" *fa`

`Lentisphaerae_bacterium_MAG-19-contigs.fa:>c_000000000001`
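
The same check can be run for every contig at once rather than one grep at a time. Below is a minimal Python sketch, not part of FeGenie: the function names are hypothetical, and it assumes each depth file is tab-separated with the contig name in the first column, which has not been confirmed in this thread. It lists the depth-file contig names that do not occur among the FASTA headers of the corresponding bin.

```python
import io


def fasta_contigs(handle):
    """Collect contig names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def depth_contigs(handle, has_header=False):
    """Collect contig names from a depth file.

    ASSUMPTION: tab-separated, contig name in the first column.
    Adjust this if the real layout differs.
    """
    names = set()
    for i, line in enumerate(handle):
        if has_header and i == 0:
            continue  # skip a column-title row if the file has one
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        names.add(line.split("\t")[0])
    return names


def unmatched_contigs(depth_names, fasta_names):
    """Return depth-file contig names that are absent from the bin FASTA, sorted."""
    return sorted(depth_names - fasta_names)


if __name__ == "__main__":
    # Made-up in-memory data for illustration only; real use would open the
    # bin FASTA and its depth file with open(path).
    fasta = io.StringIO(">c_000000000042\nACGT\n>c_000000000043\nGGCC\n")
    depth = io.StringIO("c_000000000001\t12.5\nc_000000000002\t3.0\n")
    print(unmatched_contigs(depth_contigs(depth), fasta_contigs(fasta)))
```

An empty result for a bin would mean every contig name in its depth file is also a header in its fasta; a long list would be consistent with the renumbering described above.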

Thank you so much for your time.

@Arkadiy-Garber
Owner

Hey Joy! So sorry for the delay in getting to this - been a busy month for me, but I'll have some time coming up to get to this issue.

Thanks for your patience!
Arkadiy
