Decide what to do with coverage reporting in presence of large deletions. #1193

Open
Donaim opened this issue Nov 1, 2024 · 1 comment

Donaim commented Nov 1, 2024

Decide what to do with coverage reporting in presence of large deletions.

Currently, we can have the following two cases:

  1. A query aligned as 100M600D100M somewhere in the reference: coverage values for the big deletion in the middle are missing (the reference region is not covered by the query).
  2. A query aligned as 100M599D100M somewhere in the reference: coverage values for the big deletion in the middle are present (the reference region is covered by the query).

The threshold of 600 deleted bases is somewhat arbitrary.

We would like to develop a better decision procedure for what to report as "coverage".
Possibly one that looks at the individual reads (from the FASTQ files) to see whether reads actually spanned the big deletion, or whether the query is two separate consensus sequences "stitched" together.
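
To make the current behaviour concrete, here is a minimal sketch (not the project's actual implementation; the function name and the strict less-than comparison are assumptions inferred from the 599D/600D examples above) of how a fixed gap threshold decides which reference positions a single CIGAR alignment "covers":

```python
import re

MAX_GAP_SIZE = 600  # current threshold; deletions of this size or larger break coverage

def covered_reference_positions(cigar: str, ref_start: int = 0) -> set:
    """Return the reference positions reported as covered by one alignment."""
    covered = set()
    pos = ref_start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=X":
            # Aligned bases consume the reference and are covered.
            covered.update(range(pos, pos + length))
            pos += length
        elif op in "DN":
            # Deletions/skips consume the reference; only deletions shorter
            # than MAX_GAP_SIZE are still reported as covered.
            if op == "D" and length < MAX_GAP_SIZE:
                covered.update(range(pos, pos + length))
            pos += length
        # Insertions (I), clips (S/H) and padding (P) do not consume the reference.
    return covered
```

A smarter decision procedure could replace the fixed threshold here with a check of how many individual reads actually span the gap.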

Donaim added this to the far future milestone Nov 1, 2024

Donaim commented Nov 1, 2024

The current threshold is defined here:

MAX_GAP_SIZE = 600 # TODO: make this smaller?
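
With the strict less-than reading assumed in the sketch above, the two example alignments from the description behave as reported:

```python
# Hypothetical check of the two CIGAR strings from the issue, using the
# covered_reference_positions() sketch above.
print(len(covered_reference_positions("100M600D100M")))  # 200: the 600-base gap is left uncovered
print(len(covered_reference_positions("100M599D100M")))  # 799: the 599-base gap still counts as covered
```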
