Are antismash versions important? #83

Open
ayaloglu opened this issue May 28, 2024 · 3 comments

Comments

@ayaloglu

Can I use input files generated with different antiSMASH versions for a BiG-SLiCE analysis?

@ZhangZF1102

Yes, it is very important

@MWMullowney

Can you expand on this? I am working with @ayaloglu and here is our situation (cross-posted from the antiSMASH discussion): We are working on a project to compare about 100 genomes to the antiSMASH dataset included in the BiG-SLiCE paper. It's a massive dataset that was generated using antiSMASH version 5.1.1 on genomes from NCBI. I have looked up the changelog for antiSMASH between 5.1.1 and 7.1, which is quite informative. Still, I'm not clear on how much my comparison results (using BiG-SLiCE and potentially other tools) will be biased by the updates in the software (newly detected BGCs, updates to detection rules, etc.) rather than actual differences in the datasets. I don't have the resources or time to rerun the entire NCBI dataset in antiSMASH 7 so that it matches my data, and I don't think regressing to antiSMASH 5 to reanalyze my data to match the BiG-SLiCE dataset is a good idea. Does anyone have a sense for what portion of newly detected BGCs in my dataset might be solely from the updates to antiSMASH rather than from the fact that my genomes might have unique BGCs? Thanks for your thoughts!

@ZhangZF1102

In my tests, the antiSMASH version is quite important. I tested antiSMASH V6.1 and V7.1: the number of BGCs identified by V7 was about 20% higher than the number identified by V6, and the BGCs generated by V7 could not be rerun with the V6 software. I did not test whether BGCs generated by V6 could be rerun with V7.

If you want to compare your BGCs with the BiG-FAM database (I guess that is your purpose), you should use antiSMASH V6 and BiG-SLiCE V1.1, because with antiSMASH V7 and BiG-SLiCE V1.1 you may encounter the error "cannot find hmm library". To compare your BGCs with MIBiG, you can use antiSMASH V7.
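
Since mixing versions is the concern here, it may help to check which antiSMASH version produced each input file before running the comparison. The following is only a rough sketch, not part of antiSMASH or BiG-SLiCE. It assumes the GenBank files carry a structured comment block delimited by `##antiSMASH-Data-START##` and `##antiSMASH-Data-END##` with a `Version :: x.y.z` line; please verify that layout on your own files. The function names `extract_antismash_version` and `group_by_version` are made up for this example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: the version is recorded as "Version :: x.y.z" inside the
# antiSMASH structured comment. Check this against your own files.
VERSION_RE = re.compile(r"^\s*Version\s+::\s+(\S+)", re.MULTILINE)


def extract_antismash_version(gbk_text):
    """Return the version string from the antiSMASH comment block, or None."""
    start = gbk_text.find("##antiSMASH-Data-START##")
    if start == -1:
        return None
    end = gbk_text.find("##antiSMASH-Data-END##", start)
    block = gbk_text[start:end if end != -1 else len(gbk_text)]
    match = VERSION_RE.search(block)
    return match.group(1) if match else None


def group_by_version(folder):
    """Map each detected version (or "unknown") to the .gbk file names found."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*.gbk")):
        version = extract_antismash_version(path.read_text(errors="replace"))
        groups[version or "unknown"].append(path.name)
    return dict(groups)
```

Calling `group_by_version("path/to/your/input")` returns a dictionary keyed by version. If it has more than one key, the input set mixes antiSMASH versions and the caveats above apply.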
