Expected Behavior
A penguin assembly with few or no SNPs relative to the reads used to assemble it.
Current Behavior
A high --max-iterations value results in SNPs that are not supported by the reads used during assembly.
Steps to Reproduce (for bugs)
Observable on most samples we tested with --max-iterations 15.
It also happens on some of the contigs from the benchmark rhinovirus data here: https://github.com/AnnSeidel/penguin-analysis/tree/main/benchmarking/rhinovirus-3-mixture (screenshots attached).
Context
We noticed that for larger viral genomes a high --max-iterations value helped generate a more contiguous assembly. We were surprised to find a large number of SNPs that were not supported by the reads when we aligned them back to this assembly.
For some assemblies there does not seem to be a magic number for --max-iterations: either the assembly will not be in one piece, or it will have SNPs. Since these SNPs are not supported by the reads, we could use a tool like pilon to correct the assembly, but it may be easier/better to fix this within penguin to avoid this caveat to assembly accuracy.
See below for the rhinovirus assembly with default settings (few/no SNPs) and with --max-iterations 15 (many SNPs), also zoomed in.
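To make "not supported by the reads" concrete, here is a minimal sketch of how such positions could be counted, assuming per-position base counts have already been extracted from the read-to-assembly alignment (for example from a pileup). The function name, the input layout, and the `min_depth` / `min_support` thresholds are all hypothetical choices for illustration; none of this is part of penguin or pilon.

```python
from collections import Counter


def unsupported_positions(contig, pileup, min_depth=5, min_support=0.1):
    """Return 0-based contig positions whose base is backed by too few reads.

    contig      -- assembled sequence as a string
    pileup      -- one Counter per contig position mapping base -> read count
                   (hypothetical layout, assumed to be built beforehand)
    min_depth   -- positions with fewer aligned reads are skipped (assumed cutoff)
    min_support -- minimum fraction of reads that must agree with the contig base
    """
    flagged = []
    for pos, base in enumerate(contig):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to judge this position
        if counts.get(base, 0) / depth < min_support:
            flagged.append(pos)
    return flagged


# Toy data: position 2 carries a G in the contig, but 19 of 20 reads show T.
contig = "ACGTAC"
pileup = [
    Counter(A=20),
    Counter(C=18, T=1),
    Counter(T=19, G=1),
    Counter(T=20),
    Counter(A=2),   # below min_depth, so not evaluated
    Counter(C=20),
]
print(unsupported_positions(contig, pileup))
```

Comparing the number of flagged positions between a default run and a --max-iterations 15 run on the same reads would give a rough measure of the effect shown in the screenshots.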