Filter BAM file examples
Buys de Barbanson edited this page Jan 11, 2021 · 6 revisions
Extracting a subset of reads can be done using bamFilter.py.
In the filter expression, r is a reference to a pysam.AlignedSegment; see https://pysam.readthedocs.io/en/latest/api.html#pysam.AlignedSegment for all available attributes.
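As a rough illustration of what such a filter expression does, the sketch below evaluates an expression once per read with the read bound to the name r. It does not use pysam or bamFilter.py: FakeRead is a hypothetical stand-in that mimics only has_tag, get_tag and is_duplicate, and keep_read is a hypothetical helper. Evaluating the expression with eval is an assumption made for illustration; the actual mechanism inside bamFilter.py may differ.

```python
# Hypothetical stand-in for pysam.AlignedSegment, exposing only the
# attributes used by the filter expressions on this page.
class FakeRead:
    def __init__(self, tags, is_duplicate=False):
        self.tags = dict(tags)
        self.is_duplicate = is_duplicate

    def has_tag(self, tag):
        return tag in self.tags

    def get_tag(self, tag):
        # Like pysam, a lookup of a missing tag raises KeyError.
        return self.tags[tag]


def keep_read(expression, read):
    """Evaluate a filter expression with the read bound to the name r."""
    # Assumption: the expression is plain Python evaluated per read.
    return bool(eval(expression, {}, {"r": read}))


expression = "r.get_tag('SM') in ['my_experiment_cell_315','my_experiment_cell_314']"
reads = [
    FakeRead({"SM": "my_experiment_cell_315"}),
    FakeRead({"SM": "my_experiment_cell_001"}),
]

# Only reads for which the expression is true are kept.
kept = [read for read in reads if keep_read(expression, read)]
```

Because get_tag raises KeyError when a tag is absent, expressions that read a tag which may be missing are usually guarded with has_tag first, as in the examples on this page.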
The following command extracts reads from the cells with sample names (SM) my_experiment_cell_315 and my_experiment_cell_314 and writes them to cell_315_and_314.bam.
bamFilter.py input.bam "r.get_tag('SM') in ['my_experiment_cell_315','my_experiment_cell_314']" -o cell_315_and_314.bam
Extract all transcriptome reads from a mixed type file:
bamFilter.py input.bam "r.has_tag('dt') and r.get_tag('dt')=='RNA'" -o transcriptome.bam
Extract all DNA reads from a mixed type file:
bamFilter.py input.bam "r.has_tag('dt') and r.get_tag('dt')=='DNA'" -o genome.bam
Extract all reads with an assigned cut site:
bamFilter.py input.bam "r.has_tag('DS')" -o with_cut_site.bam
Extract all unique reads with an assigned cut site:
bamFilter.py input.bam "r.has_tag('DS') and not r.is_duplicate" -o with_cut_site_unique.bam
Extract all unmethylated RNA reads from a mixed type file:
bamFilter.py input.bam "r.has_tag('dt') and r.get_tag('dt')=='RNA' and (not r.has_tag('MC') or r.get_tag('MC')==0)" -o transcriptome_without_methylation.bam
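In Python, and binds more tightly than or, so the MC test in a filter like this one needs to be grouped in parentheses; without them, any read with MC equal to 0 passes regardless of its dt tag. The sketch below shows the difference. It assumes the expression is evaluated as plain Python with the read bound to r, and FakeRead is a hypothetical stand-in for pysam.AlignedSegment, not part of pysam or bamFilter.py.

```python
# Hypothetical stand-in for pysam.AlignedSegment with only the tag methods.
class FakeRead:
    def __init__(self, tags):
        self.tags = dict(tags)

    def has_tag(self, tag):
        return tag in self.tags

    def get_tag(self, tag):
        # A missing tag raises KeyError, as it does in pysam.
        return self.tags[tag]


# Parsed as (A and B and not C) or D: the trailing MC test stands alone.
ungrouped = ("r.has_tag('dt') and r.get_tag('dt')=='RNA' "
             "and not r.has_tag('MC') or r.get_tag('MC')==0")
# Parsed as A and B and (not C or D): the MC test only applies to RNA reads.
grouped = ("r.has_tag('dt') and r.get_tag('dt')=='RNA' "
           "and (not r.has_tag('MC') or r.get_tag('MC')==0)")


def keep_read(expression, read):
    # Assumption: the filter is evaluated as Python with the read named r.
    return bool(eval(expression, {}, {"r": read}))


dna_unmethylated = FakeRead({"dt": "DNA", "MC": 0})
rna_untagged = FakeRead({"dt": "RNA"})
rna_methylated = FakeRead({"dt": "RNA", "MC": 3})

for name, read in [("DNA, MC=0", dna_unmethylated),
                   ("RNA, no MC", rna_untagged),
                   ("RNA, MC=3", rna_methylated)]:
    print(name, keep_read(ungrouped, read), keep_read(grouped, read))
```

With the ungrouped form the DNA read with MC equal to 0 is kept, which is not what the filter intends; the grouped form rejects it while still keeping RNA reads that have no MC tag or an MC value of 0.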