Steps to implement checkm2 quality report to dRep #220
Comments
Thanks for this, @mpdoane2. When I get a chance, I'll add this to the documentation and cite you / this issue.
Hi, thanks for the information for converting the checkm output to a csv! I tried running drep with the --genomeInfo flag and the csv, but when I check the log for the job, it is still running checkm. Is there another flag I need to add? Thanks!
Hi @achenderson - my guess is that there is a mismatch between the "genome" names provided in the Best,
That's it! Thank you :)
Hi there, I think I've run into this issue, but as I have ~60k MAGs, CheckM is taking a long time and the Bdb.csv file is not present. As an example, my genome files look like /scratch/usr/SBsP_T2_sr_metabat2_refined.002.fna and the corresponding cell in the .csv is /scratch/usr/SBsP_T2_sr_metabat2_refined.002, but it is still running CheckM. Is it possibly due to the '.' before 002? Are the full paths unnecessary? Or is the .fna extension an issue? Thanks,
Hi @CJREID - are you running CheckM within dRep, or are you running CheckM2 outside of dRep?
Hi Matt, I ran CheckM2 outside of dRep and formatted it as described above for dRep. It worked once I added the .fna extension to the names in the genomeInfo file. I was confused because the help message says this file must contain Thanks,
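The mismatch described in this exchange can be checked before starting a long run. The following is a minimal sketch, not part of dRep: the function name check_genome_names is hypothetical, and it assumes that dRep compares the "genome" column against each genome file's base name including its extension, which is inferred from this thread rather than confirmed by the dRep documentation.

```shell
# Report genome files whose base name does not appear in the "genome" column
# (column 1) of a --genomeInfo CSV. Hypothetical helper, not part of dRep.
# Usage: check_genome_names genomeInfo.csv bins/*.fna
check_genome_names() {
    csv=$1
    shift
    # Emit the base name of every genome file, one per line, then compare
    # against the CSV in a single awk pass.
    for f in "$@"; do
        basename "$f"
    done |
    awk -F',' '
        # First input (the CSV): remember every genome name, skipping the header.
        NR == FNR { if (FNR > 1) seen[$1] = 1; next }
        # Second input (stdin): flag any file name that the CSV does not list.
        !($0 in seen) { print "missing from genomeInfo: " $0; bad = 1 }
        END { exit bad + 0 }
    ' "$csv" -
}
```

Called as check_genome_names new_file_name.csv bins/*.fna, it prints one line per unmatched file and returns a non-zero status if any are found. An entry such as SBsP_T2_sr_metabat2_refined.002 without the .fna extension would be reported under this assumption.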
Hi @CJREID - thanks for the update and for the suggestion. I'll update this in the next version of dRep. Best,
A CheckM2 quality report can be used with dRep. You will need to convert the tab-separated CheckM2 quality report to a .csv file containing its first three columns using:
awk -F'\t' 'BEGIN {OFS=","} {print $1, $2, $3}' quality_report.tsv > new_file_name.csv
In the new file, change the header row to: genome,completeness,contamination
The dRep command for using the CheckM2 output instead of checkm_genome, which is currently the default, is:
dRep dereplicate output --genomeInfo new_file_name.csv -g bins/*.fna
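The conversion and header change above can be combined into one step. This is a rough sketch rather than an official dRep or CheckM2 tool: the function name checkm2_to_drep is hypothetical, it assumes the first three columns of quality_report.tsv are the genome name, completeness, and contamination (as the awk command above does), and it appends a file extension (default .fna, an assumption) because a later comment in this issue reports that the names only matched once the extension was added.

```shell
# Convert a CheckM2 quality_report.tsv into a CSV for dRep --genomeInfo.
# Hypothetical helper, not part of dRep or CheckM2.
# Usage: checkm2_to_drep quality_report.tsv new_file_name.csv [extension]
checkm2_to_drep() {
    in_tsv=$1
    out_csv=$2
    ext=${3:-.fna}   # assumed extension of the genome files passed with -g
    awk -F'\t' -v ext="$ext" '
        BEGIN { OFS = "," }
        # Replace the CheckM2 header row with the headings given above.
        NR == 1 { print "genome", "completeness", "contamination"; next }
        # Keep the first three columns and append the extension to the name.
        { print $1 ext, $2, $3 }
    ' "$in_tsv" > "$out_csv"
}
```

For example, checkm2_to_drep quality_report.tsv new_file_name.csv writes a file that can be passed to --genomeInfo in the command above. Pass a different third argument (for example .fa) if the genome files use another extension.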
Just thought I would write it out in case others were facing similar issues.
-Mike