QUESTION: What's the largest genome that end-users have assembled with RAVEN? #36

cement-head · 2021-02-17T14:58:49Z

What's the largest genome that end-users have assembled with RAVEN?
Did you use the GPU version (built for CUDA/GPU)?
What were your options, if any?
How long did it take?
Approximately how big was your computer?

rvaser · 2021-02-17T15:26:26Z

Here is the preprint: https://www.biorxiv.org/content/10.1101/2020.08.07.242461v1. Note, though, that the benchmark used version 1.1.10, and versions 1.3.0 and later use far less memory. We should update the preprint soon. Answers:

I think about 3 Gbp (haploid), but I'm not sure.
We did not benchmark with CUDA enabled.
No additional options, only number of threads.
Depends on coverage, see preprint.
1TB RAM/128 cores (run on 64 threads).

cement-head · 2021-03-03T15:41:55Z

Okay, we just did a 6.0 Gbp beastie; but RAVEN gave us just over 7.0 Gbp.

It took five days with 2 TB of ECC RAM, 124 threads, and two CUDA GPUs (RTX TITANs, used for polishing; -c=100).

Given that the assembly is a little large, I'm wondering if I should change any of these three parameters, and whether or not you'd have some recommendations?

    -m, --match <int>
      default: 3
      score for matching bases
    -n, --mismatch <int>
      default: -5
      score for mismatching bases
    -g, --gap <int>
      default: -4
      gap penalty (must be negative)
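
For readers unfamiliar with these three options: they read as the scores of a linear-gap alignment. As a rough illustration only (this is not RAVEN's code, and the `alignment_score` helper is hypothetical), the sketch below scores an already-aligned pair of strings with the defaults listed above, where `-` marks a gap:

```python
def alignment_score(a, b, match=3, mismatch=-5, gap=-4):
    """Score two pre-aligned strings of equal length under a linear gap model.

    Defaults mirror the -m/-n/-g defaults quoted above. A '-' in either
    string is treated as a gap column.
    """
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap          # gap column
        elif x == y:
            score += match        # identical bases
        else:
            score += mismatch     # substitution
    return score


# Four matches, one mismatch, one gap: 4*3 + (-5) + (-4)
print(alignment_score("ACGT-A", "ACCTGA"))
```

With the defaults, a single mismatch (-5) costs more than a single gap (-4), so making `-n` more negative only widens that difference.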

cement-head · 2021-03-03T15:43:27Z

Also, would increasing the rounds of polishing (RACON) drastically improve the assembly?

cement-head · 2021-03-04T15:14:40Z

Okay, a BUSCO analysis reports only 0.1% Complete. Something is wrong. Would you suggest increasing the mismatch penalty?

rvaser · 2021-03-07T08:50:00Z

Can you print the assembly statistics (length/#contigs/NX/NGX)? Which sequencing technology are you using? What is the sequencing depth? The BUSCO score is abysmal, not sure if changing alignment parameters will help. Running more than 2 iterations of Racon will not increase the accuracy by much either.

Sorry for my late reply!
Best regards,
Robert

P.S. You can also paste here the log Raven created.

cement-head · 2021-03-10T14:26:38Z

The technology is PacBio SII CLR, with a raw-read N50 of >36 kbp.

The coverage is about 70x.

Q: Would adjusting the -m, -n, -g parameters improve assembly?

What file is the RAVEN logfile?

Here's the QUAST analysis; the # of contigs is good-ish, but the N50 isn't the greatest:

Assembly                    raven_asm 
# contigs (>= 0 bp)         25505     
# contigs (>= 1000 bp)      25505     
# contigs (>= 5000 bp)      25505     
# contigs (>= 10000 bp)     25504     
# contigs (>= 25000 bp)     25504     
# contigs (>= 50000 bp)     25473     
Total length (>= 0 bp)      7048262437
Total length (>= 1000 bp)   7048262437
Total length (>= 5000 bp)   7048262437
Total length (>= 10000 bp)  7048257309
Total length (>= 25000 bp)  7048257309
Total length (>= 50000 bp)  7046876721
# contigs                   25505     
Largest contig              3296975   
Total length                7048262437
GC (%)                      43.05     
N50                         337254    
N75                         208232    
L50                         6350      
L75                         13031     
# N's per 100 kbp           0.00
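
For anyone comparing these numbers against the NX/NGX statistics requested above, here is a minimal sketch of how such values are commonly computed from a list of contig lengths. The `nx_lx` helper and its parameters are hypothetical and are not part of QUAST or RAVEN; passing `genome_size` switches from NX (relative to assembly length) to NGX (relative to an assumed genome size, e.g. the expected 6.0 Gbp).

```python
def nx_lx(lengths, fraction=0.5, genome_size=None):
    """Return (Nx, Lx) for the given contig lengths.

    Nx is the length of the contig at which the running total of
    contigs, sorted longest first, reaches `fraction` of the reference
    total. Lx is the number of contigs needed to get there. If
    `genome_size` is given, it is used as the reference total (NGx/LGx).
    """
    total = genome_size if genome_size is not None else sum(lengths)
    target = total * fraction
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    return None, None  # the contigs never reach the target


toy = [10, 8, 6, 4, 2]                    # toy contig lengths, total 30
print(nx_lx(toy))                         # N50 / L50
print(nx_lx(toy, genome_size=40))         # NG50 / LG50 for an assumed genome of 40
```

Because the assembly here totals about 7.05 Gbp against an expected 6.0 Gbp (roughly 17% larger), NG50 would be at least as large as the N50 reported above.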

rvaser · 2021-03-11T02:59:51Z

The log is written to stderr. I am not sure if changing alignment parameters will help at all. The assembly is quite fragmented, which might be the reason for the bad BUSCO performance.
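
Since the log goes to stderr, it has to be redirected to keep it. A minimal sketch, assuming the assembly itself is written to stdout and that `-t` sets the thread count; the `run_and_capture_log` helper and all file names are hypothetical:

```python
import subprocess


def run_and_capture_log(cmd, out_path, log_path):
    """Run `cmd`, saving stdout to `out_path` and stderr to `log_path`.

    Returns the process exit code.
    """
    with open(out_path, "wb") as out, open(log_path, "wb") as log:
        return subprocess.run(cmd, stdout=out, stderr=log).returncode


# Hypothetical invocation (file names and thread count are placeholders):
# run_and_capture_log(["raven", "-t", "64", "reads.fastq.gz"],
#                     "assembly.fasta", "raven.log")
```

The file passed as `log_path` is then what can be pasted into the thread.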
