How to get a reduced BAM file to visualize breakpoints of a chromosomal inversion? #4660

apfuentes · 2024-11-15T14:16:12Z

Hi!

Thank you for the very nice talk on JBrowse2 during BGA24.

During the talk it was mentioned that there is a way to get a reduced BAM file for a particular region of interest (retaining only informative reads) to visualize a chromosomal inversion. This approach would save us storage space as we would like to avoid adding whole genome BAM files to JBrowse.

How could such a BAM file be generated?

Thanks :-)

cmdcolin (Maintainer) · 2024-11-15T20:53:14Z

thanks for the kind words and for attending the BGA talk! I'm glad you followed up.

here are some random ideas

filtering long read BAM files for informative reads

one idea to filter for informative reads is to get all reads that show a signature of being split. for long reads, one such signature is the "SA" tag, which lists the other ("supplementary") alignments of a split read. note that filtering by flag 2048 is not sufficient to get all of these reads: flag 2048 marks only the supplementary pieces, so it will miss the "primary" alignment of the split

my short guide on this https://cmdcolin.github.io/posts/2022-02-06-sv-sam#what-is-the-sa-tag (<-- that specific section discusses SA a bit, but the whole article has some more of my thoughts on the matter)
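
As a rough illustration of the SA-tag idea (a minimal sketch, not part of any JBrowse or samtools API), the Python below keeps SAM header lines plus every alignment that carries an `SA:Z:` tag. The function names `has_sa_tag` and `filter_split_reads` are made up for this example, and the input is assumed to be SAM text such as the output of `samtools view -h`. Recent samtools releases also appear to offer a tag filter directly (`samtools view -d SA`); check that against your installed version.

```python
def has_sa_tag(sam_line: str) -> bool:
    """Return True if a SAM alignment line carries an SA:Z: tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields start at column 12.
    return any(field.startswith("SA:Z:") for field in fields[11:])


def filter_split_reads(lines):
    """Yield header lines unchanged, plus only the alignments that have an SA tag.

    Both the primary and the supplementary pieces of a split read carry SA,
    so this keeps the primary alignment that a flag-2048 filter would drop.
    """
    for line in lines:
        if line.startswith("@") or has_sa_tag(line):
            yield line


# Tiny in-memory example (hypothetical reads, for illustration only)
example = [
    "@SQ\tSN:chr1\tLN:100000\n",
    "r1\t0\tchr1\t100\t60\t50M50S\t*\t0\t0\t*\t*\tSA:Z:chr1,5000,-,50S50M,60,0;\n",
    "r1\t2064\tchr1\t5000\t60\t50H50M\t*\t0\t0\t*\t*\tSA:Z:chr1,100,+,50M50S,60,0;\n",
    "r2\t0\tchr1\t200\t60\t100M\t*\t0\t0\t*\t*\tNM:i:0\n",
]
kept = list(filter_split_reads(example))
```

In practice this could sit between `samtools view -h in.bam <region>` and `samtools view -b -o reduced.bam -`, reading stdin and writing stdout, so that only the region of interest is ever decompressed.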

filtering short read BAM files for informative reads

for short reads, you can keep only the reads that are NOT flagged "read mapped in proper pair" (flag 2). this gives you all the "badly paired" reads, whose mates are often too far apart or in the wrong orientation. flag explanations: https://broadinstitute.github.io/picard/explain-flags.html
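
The flag test can be sketched in a few lines of Python. This is only an illustration: the name `is_discordant` is hypothetical, and the extra choice to drop reads whose own or mate alignment is unmapped is an assumption of this sketch rather than something stated above. The bit values themselves come from the SAM specification.

```python
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # aligner considers the pair properly mapped
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def is_discordant(flag: int) -> bool:
    """True for paired reads where both mates are mapped but not as a proper pair."""
    if not flag & PAIRED:
        return False
    # Assumption of this sketch: pairs with an unmapped end are not useful here.
    if flag & (UNMAPPED | MATE_UNMAPPED):
        return False
    return not flag & PROPER_PAIR
```

The same selection should be expressible on the command line as `samtools view -b -f 1 -F 14 in.bam <region>`, since 14 = 2 + 4 + 8 excludes proper pairs and pairs with an unmapped end.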

random other ideas: creating "coverage" files for the above filtered BAM files

by creating a simple BigWig file of the coverage of the filtered BAMs, you can quickly spot and zoom in on the regions where the filtered reads pile up, and then load the actual alignments underneath these regions for more info
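
One possible way to get such a coverage track, sketched under assumptions: count filtered alignments per fixed-size bin and emit bedGraph lines, which a tool such as UCSC's `bedGraphToBigWig` can then convert to BigWig. The function names and the `(chrom, start, end)` input format (0-based, half-open) are choices made for this example; extracting those intervals from the BAM is left out.

```python
from collections import Counter


def binned_coverage(alignments, bin_size=1000):
    """Count how many alignments overlap each fixed-size bin.

    alignments: iterable of (chrom, start, end), 0-based half-open, end > start.
    """
    counts = Counter()
    for chrom, start, end in alignments:
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            counts[(chrom, b)] += 1
    return counts


def to_bedgraph(counts, bin_size=1000):
    """Format bin counts as sorted bedGraph lines: chrom, start, end, value."""
    return [
        f"{chrom}\t{b * bin_size}\t{(b + 1) * bin_size}\t{n}"
        for (chrom, b), n in sorted(counts.items())
    ]


# Hypothetical intervals, for illustration only
demo = [("chr1", 100, 200), ("chr1", 900, 1100), ("chr1", 2500, 2600)]
bedgraph_lines = to_bedgraph(binned_coverage(demo))
```

Note that the last bin of a chromosome may extend past the chromosome end, so its end coordinate would need clipping to the chromosome length before conversion to BigWig.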

creating a "contact" matrix for read pairing

maybe even better than the above, you can create a "contact matrix" from the read pairing

sort of similar to the previous discussion about the LD matrix, this can also be done to visualize pairing for large SVs like inversions. one place I saw a figure like this was in a paper on butterflies

part B of this figure shows this

this method of a sort of contact matrix for SVs has also been used in programs like Cue for deep learning of SVs
https://github.com/PopicLab/cue
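
To make the contact-matrix idea concrete, here is a minimal sketch, assuming the read pairs have already been reduced to `(pos, mate_pos)` coordinates on one chromosome, with each pair listed once. The function name `contact_matrix` and its parameters are hypothetical, and this is not how Cue or the cited paper builds its figures.

```python
def contact_matrix(pairs, region_start, region_end, bin_size):
    """Build a symmetric bin-by-bin count matrix from (pos, mate_pos) pairs.

    Pairs with either end outside [region_start, region_end) are skipped.
    """
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos, mate_pos in pairs:
        in_region = (region_start <= pos < region_end
                     and region_start <= mate_pos < region_end)
        if not in_region:
            continue
        i = (pos - region_start) // bin_size
        j = (mate_pos - region_start) // bin_size
        matrix[i][j] += 1
        if i != j:
            # Mirror off-diagonal counts so the matrix stays symmetric.
            matrix[j][i] += 1
    return matrix


# Hypothetical pairs, for illustration only
demo_pairs = [(100, 150), (100, 950), (120, 900), (5000, 100)]
demo_matrix = contact_matrix(demo_pairs, region_start=0, region_end=1000, bin_size=250)
```

Normally paired reads pile up on the diagonal, so off-diagonal clusters (here, bins 0 and 3) are the cells worth inspecting near a suspected inversion breakpoint.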

random other things: plotting population genetic statistics like Fst, linkage disequilibrium, and recombination rates

I know that plotting population genetic statistics over the genome, in circumstances where such population genetic approaches are relevant, can give some insight into weird structural variants like inversions. Hopi Hoekstra's work comes to mind https://www.biorxiv.org/content/10.1101/2022.05.25.493470v1.full.pdf

Conclusion

This is actually a great topic, and I'd be happy to hear more if you find any results. I often look at papers for visual inspiration and bring those visualization types to life in the JBrowse UI! Part of the challenge is that it sometimes involves extra analysis steps or workflows that have to be done outside of JBrowse itself and are not very common yet. Hopefully, as more people do it, it can be workflow-ized and made available to everyone!
