ZeroDivisionError while running BESST #30

Open
cimendes opened this issue May 11, 2021 · 1 comment

Comments

@cimendes
Contributor

Greetings,

I've been using gatb-minia-pipeline for a while now in a de novo assembly software benchmark, available on GitHub. So far it has run without issue on real data, in this case the ZymoBIOMICS Microbial Community Standard with even and log distributions.

Recently I generated a mock community from the ZymoBIOMICS complete genomes, and gatb-minia-pipeline consistently fails on the samples generated without an error model (perfect reads matching the reference). The mock read data is available here.

I use the following command to run this assembler:

gatb -1 ${fastq_pair[0]} -2 ${fastq_pair[1]} --kmer-sizes ${kmer_list} -o ${sample_id}_GATBMiniaPipeline --no-error-correction (available here)

The parameters used are:

    gatbkmer = '21,61,101,141,181'
    gatb_besst_iter = 10000
    GATB_error_correction = false

This is the end of the stdout I get when running gatb-minia-pipeline:

pe1_path: subENN_1.fq.gz
pe2_path: subENN_2.fq.gz
genome_path: subENN_GATBMiniaPipeline_k181.contigs.fa
output_path: subENN_GATBMiniaPipeline.lib_0
tmp_path: /mnt/beegfs/scratch/ONEIDA/cimendes/LMAS/work/18/993d693d0c79e5aef093045d4571d9/BESST_tmp
bwa path: bwa
number of threads: 8
Remove temp SAM and BAM files: No
Use bwa aln and sampe instead of bwa mem: No
Start processing.
Aligning with bwa mem.
Temp directory: /mnt/beegfs/scratch/ONEIDA/cimendes/LMAS/work/18/993d693d0c79e5aef093045d4571d9/BESST_tmp
Output path:    subENN_GATBMiniaPipeline.lib_0
Stderr file:    subENN_GATBMiniaPipeline.lib_0.bwa.1
Make bwa index... Done.
Align with bwa mem... Done.
Time elapsed for bwa index and mem:  0:02:10.200315
Convert SAM to BAM... Done.
Time elapsed for SAM to BAM conversion: 0:01:16.767939
Sort BAM... Done.
Time elapsed for BAM sorting: 0:01:08.141072
Index BAM... Done.
Time elapsed for BAM indexing: 0:00:03.765598
Remove temp files... Done.
Time elapsed for temp files removing: 0:00:00.008573
Processing is finished.
(2021-05-10 20:20:32) Execution of 'python BESST/runBESST'. Command line:
     /NGStools/gatb-minia-pipeline/tools/memused python /NGStools/gatb-minia-pipeline/BESST/runBESST -c subENN_GATBMiniaPipeline_k181.contigs.fa -f subENN_GATBMiniaPipeline.lib_0.bam -o subENN_GATBMiniaPipeline_besst --orientation fr --iter 10000
Number of initial contigs: 2028
Traceback (most recent call last):
  File "/NGStools/gatb-minia-pipeline/BESST/runBESST", line 401, in <module>
    main(args)
  File "/NGStools/gatb-minia-pipeline/BESST/runBESST", line 160, in main
    libmetrics.get_metrics(bam_file, param, Information)
  File "/NGStools/gatb-minia-pipeline/BESST/BESST/libmetrics.py", line 317, in get_metrics
    mean_isize = sum(filtered_list) / n
ZeroDivisionError: float division by zero
maximal memory used: 182 MB
(2021-05-10 20:20:40) Execution of 'python BESST/runBESST' failed. Command line:
     /NGStools/gatb-minia-pipeline/tools/memused python /NGStools/gatb-minia-pipeline/BESST/runBESST -c subENN_GATBMiniaPipeline_k181.contigs.fa -f subENN_GATBMiniaPipeline.lib_0.bam -o subENN_GATBMiniaPipeline_besst --orientation fr --iter 10000

The pipeline is being run in a Docker container built from the latest version of the master branch: cimendes/gatb-minia-pipeline:31.07.2020-1

Thank you for your help in understanding this error! Is there anything you suggest I do?

Best,

Inês

@rchikhi
Member

rchikhi commented May 29, 2021

Hi Inês, apologies for the slow response. I can suggest the following quick fix: add the --no-scaffolding flag to the gatb executable. This will skip BESST, which for metagenome paired-end data may not be such a bad thing.
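
For context on the traceback above: the failing line in libmetrics.py divides sum(filtered_list) by n, and the "float division by zero" message means n was zero at that point. Below is a minimal sketch of how such a mean could be guarded. It is not BESST's actual code: the function name mean_insert_size is hypothetical, and the assumption that n is the length of filtered_list is inferred only from the traceback.

```python
def mean_insert_size(filtered_list):
    """Return the mean of the filtered insert sizes, or None if the list is empty.

    Hypothetical helper for illustration; assumes n is the number of
    filtered insert-size observations, as the traceback suggests.
    """
    n = float(len(filtered_list))
    if n == 0:
        # Nothing survived filtering, so there is no mean to report.
        return None
    return sum(filtered_list) / n


# Example: an empty list no longer raises ZeroDivisionError.
print(mean_insert_size([]))          # None
print(mean_insert_size([300, 500]))  # 400.0
```

With a guard like this, a caller could print a clear message and skip scaffolding instead of crashing, which is in effect what --no-scaffolding achieves from the command line.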
