ERROR: Something wrong with maxbin binning, terminating... #12

Open
comingkms opened this issue Jun 20, 2024 · 9 comments

@comingkms

Hi,

Installation went without issue, but I got the error below when running your test dataset. Thanks,

nanophase meta -l '/home/Downloads/lr.fa.gz' -t 12 -o nanophase-out
[2024-06-20 14:48:58] INFO: nanophase (meta) starts
[2024-06-20 14:48:58] INFO: Command line: /home/comingkms/anaconda3/envs/nanophase/bin/nanophase meta -l /home/comingkms/Downloads/lr.fa.gz -t 12 -o nanophase-out
[2024-06-20 14:48:58] INFO: long_read_only model was selected, only Nanopore long reads will be used
[2024-06-20 14:48:58] CHECK: Nanopore long-read (fa.gz) file has been found
[2024-06-20 14:48:58] CHECK: Check software availability and locations
[2024-06-20 14:48:59] INFO: The following packages have been found
#package location
nanophase /home/comingkms/anaconda3/envs/nanophase/bin/nanophase
flye /home/comingkms/anaconda3/envs/nanophase/bin/flye
metabat2 /home/comingkms/anaconda3/envs/nanophase/bin/metabat2
maxbin2 /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl
SemiBin /home/comingkms/anaconda3/envs/nanophase/bin/SemiBin
metawrap /home/comingkms/anaconda3/envs/nanophase/bin/metawrap
checkm /home/comingkms/anaconda3/envs/nanophase/bin/checkm
racon /home/comingkms/anaconda3/envs/nanophase/bin/racon
medaka /home/comingkms/anaconda3/envs/nanophase/bin/medaka
polypolish /home/comingkms/anaconda3/envs/nanophase/bin/polypolish
POLCA /home/comingkms/anaconda3/envs/nanophase/bin/polca.sh
bwa /home/comingkms/anaconda3/envs/nanophase/bin/bwa
seqtk /home/comingkms/anaconda3/envs/nanophase/bin/seqtk
minimap2 /home/comingkms/anaconda3/envs/nanophase/bin/minimap2
BBMap /home/comingkms/anaconda3/envs/nanophase/bin/BBMap
parallel /home/comingkms/anaconda3/envs/nanophase/bin/parallel
perl /home/comingkms/anaconda3/envs/nanophase/bin/perl
samtools /home/comingkms/anaconda3/envs/nanophase/bin/samtools
gtdbtk /home/comingkms/anaconda3/envs/nanophase/bin/gtdbtk
fastANI /home/comingkms/anaconda3/envs/nanophase/bin/fastANI
All required packages have been found in the environment. If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :)
[2024-06-20 14:48:59] TASK: Long-read assembly starts (be patient)
[2024-06-20 14:55:31] DONE: long-read assembly finished successfully: detailed log file is nanophase-out/01-LongAssemblies/flye.log
[2024-06-20 14:55:31] TASK: Initial binning::metabat2 binning starts
[2024-06-20 14:55:32] DONE: Initial binning::metabat2 binning finished successfully
MetaBAT 2 (v2.12.1) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200.
1 bins (2028309 bases in total) formed.
[2024-06-20 14:55:32] TASK: Initial binning::maxbin2 binning starts
Can't load '/home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Encode/Encode.so' for module Encode: /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/auto/Encode/Encode.so: undefined symbol: Perl__is_utf8_char_helper at /home/comingkms/anaconda3/envs/nanophase/lib/perl5/core_perl/XSLoader.pm line 93.
at /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/Encode.pm line 12.
BEGIN failed--compilation aborted at /home/comingkms/perl5/lib/perl5/x86_64-linux-thread-multi/Encode.pm line 13.
Compilation failed in require at /home/comingkms/anaconda3/envs/nanophase/lib/perl5/site_perl/LWP/UserAgent.pm line 1073.
Compilation failed in require at /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl line 4.
BEGIN failed--compilation aborted at /home/comingkms/anaconda3/envs/nanophase/bin/run_MaxBin.pl line 4.
mv: cannot stat 'nanophase-out/02-LongBins/INITIAL_BINNING/maxbin2/bin*fasta': No such file or directory
[2024-06-20 14:55:32] ERROR: Something wrong with maxbin binning, terminating...

@Hydro3639
Owner

Hi,

May I know which version of nanophase you installed? And for the long-read dataset (lr.fa.gz), are you using SRR17913199? If not, I would suggest using that one.

@comingkms
Author

nanophase v=0.2.3
I used the long-read dataset from https://github.com/example-data/np-example mentioned in your README. Will try SRR17913199.

Thanks

@Hydro3639
Owner

Besides the dataset, I also noticed in your log file a mismatch between the version of Perl and the version of the Encode module. Consider reinstalling the Encode module in the nanophase env so that it is properly compiled against your current Perl version (a command you may refer to: cpan -f -i Encode).
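
For anyone hitting the same traceback: the log shows Encode.so being loaded from /home/comingkms/perl5 (a per-user module directory) while XSLoader.pm comes from the nanophase env. That pattern usually appears when PERL5LIB points at a local::lib directory. Below is a minimal sketch for checking this, assuming that is the cause here; `without_local_lib` is a hypothetical helper name, not part of nanophase.

```shell
# Run a command with the local::lib-related variables removed from its
# environment, so the conda env's perl only sees its own modules.
# Assumption: the stale Encode.so is picked up via PERL5LIB.
without_local_lib() {
    env -u PERL5LIB -u PERL_LOCAL_LIB_ROOT -u PERL_MB_OPT -u PERL_MM_OPT "$@"
}

# Usage (inside the activated nanophase env):
#   without_local_lib perl -MEncode -e 'print $INC{"Encode.pm"}, "\n"'
#   without_local_lib nanophase meta -l lr.fa.gz -t 12 -o nanophase-out
```

If the first usage line prints a path inside the nanophase env rather than ~/perl5, the per-user directory was likely shadowing the env's own Encode module.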

@comingkms
Author

Yes, I've realized that, but I still had the SemiBin issue that has been mentioned before, which is caused by the simple dataset. I'll try your SRR17913199. Should I remove the host genome first to improve bacterial assemblies?

Thanks,

@Hydro3639
Owner

Glad you have resolved it! Yes, due to the limitations of the simple dataset, nanophase will exit during the semibin stage. That's why I provided the whole mock dataset: SRR17913199. Just give it a try.

I would suggest removing host genomes before the bacterial assembly; it makes the assembly process easier and faster and lowers the potential contamination.
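
One common way to do the host removal is to map the reads to the host reference and keep only the unmapped ones. This is a rough sketch using minimap2 and samtools (both are listed in the nanophase env above), not something nanophase does for you; `remove_host_reads`, `THREADS` and all file names are placeholders.

```shell
# Keep only reads that do NOT map to the host reference.
# Arguments: host reference FASTA, long reads (fa/fa.gz), output fa.gz.
remove_host_reads() {
    host_ref=$1
    reads=$2
    out=$3
    # Fail early if either input is missing or unreadable.
    if [ ! -r "$host_ref" ] || [ ! -r "$reads" ]; then
        echo "remove_host_reads: cannot read input files" >&2
        return 1
    fi
    # map-ont: Nanopore preset; -f 4 keeps records flagged as unmapped.
    minimap2 -t "${THREADS:-4}" -ax map-ont "$host_ref" "$reads" \
        | samtools fasta -f 4 - \
        | gzip > "$out"
}

# Usage (hypothetical file names):
#   remove_host_reads host.fa lr.fa.gz lr.nohost.fa.gz
#   nanophase meta -l lr.nohost.fa.gz -t 12 -o nanophase-out
```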

Best

@comingkms
Author

Using SRR17913199, I got the following error: "ERROR: Something wrong with medaka polishing, please also check nanophase-out/03-Polishing/medaka/medaka.polish.log, terminating..." Please check the attached log, which indicates an out-of-memory issue. How much GPU memory is needed? Mine is an RTX 3090 (24 GB).
medaka.polish.log

@Hydro3639
Owner

I don't have much experience running Medaka polishing with a GPU. If it is a GPU memory issue, you might consider lowering the number of threads to 2 (-t 2). This way, nanophase will run only one Medaka polishing job at a time, reducing the GPU memory requirement. Alternatively, you can use the CPU for polishing by setting export CUDA_VISIBLE_DEVICES=""
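
The two options can be sketched as follows. `run_on_cpu` is a hypothetical wrapper name; it hides the GPU from one command only, instead of exporting the variable for the whole shell session.

```shell
# Run a single command with no CUDA device visible, so GPU-aware tools
# such as Medaka fall back to the CPU for that command only.
run_on_cpu() {
    env CUDA_VISIBLE_DEVICES= "$@"
}

# Option 1: stay on the GPU, but with one polishing job at a time
#   nanophase meta -l lr.fa.gz -t 2 -o nanophase-out
# Option 2: polish on the CPU and keep the higher thread count
#   run_on_cpu nanophase meta -l lr.fa.gz -t 12 -o nanophase-out
```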

@comingkms
Author

Finally, I could complete the run using SRR17913199 with some modifications:

  1. Medaka polishing with GPU: as you suggested, I had to lower the number of threads to 2.
  2. pplacer uses a massive amount of memory (around 90 GB). Please add --scratch_dir nanophase-out/03-Polishing/Final-bins/tmp_1

Thanks,

@Hydro3639
Owner

Hi, thank you for the suggestion regarding the memory usage of pplacer. I agree that pplacer requires substantial memory, and using the --scratch_dir flag can reduce the memory load during this step.

The main reason we didn't include this flag in the default settings is that, for many environmental samples with higher complexity than the mock dataset, the most memory-intensive step is actually the long-read assembly with Flye. Although Flye is quite memory-efficient, it remains the primary memory-consuming process. I may consider adding an optional flag for --scratch_dir in the future to give users more flexibility in managing memory usage, but it is not a high-priority task at the moment.
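
Until such an option exists, the classification can be rerun by hand with the flag. The sketch below assumes the pplacer step in question is the one run by GTDB-Tk's classify_wf, which accepts --scratch_dir and --pplacer_cpus. The helper name, the bin extension and the directory layout are assumptions, and newer GTDB-Tk releases may require additional arguments. The helper only prints the command so it can be inspected before running.

```shell
# Print (not run) a GTDB-Tk command that lets pplacer write to disk.
# Arguments: directory of final bins, output directory, scratch directory.
print_gtdbtk_cmd() {
    bins_dir=$1
    out_dir=$2
    scratch_dir=$3
    # --pplacer_cpus 1 also limits memory; --extension fasta is an assumption.
    printf 'gtdbtk classify_wf --genome_dir %s --out_dir %s --extension fasta --pplacer_cpus 1 --scratch_dir %s\n' \
        "$bins_dir" "$out_dir" "$scratch_dir"
}

# Usage (scratch path taken from the comment above; other paths hypothetical):
#   print_gtdbtk_cmd final-bins gtdbtk-out nanophase-out/03-Polishing/Final-bins/tmp_1
```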
