# Coverage files

## Allele base counts

Filename: `allele_base_coverage.json`

This file contains per-base coverage counts for alleles. Consider a read which maps exactly to the PRG and overlaps some number of allele bases. The count of each overlapped allele base is incremented for every such read. The file stores a separate count for each allele base.

#### Example

The following example consists of two sites. The first site consists of two alleles and the second site consists of three alleles.

```json
{
    "allele_base_counts": [
        [
            [0, 0, 0],
            [1, 1, 0]
        ],
        [
            [0, 0, 1],
            [2, 2, 0],
            [2, 2, 0, 1, 3]
        ]
    ]
}
```

The third allele of the second site consists of five bases and therefore has five counts:

```python
import json

# Load the example above from disk
with open("allele_base_coverage.json") as handle:
    data = json.load(handle)

sites = data["allele_base_counts"]
first_site = sites[0]
second_site = sites[1]
assert second_site[2] == [2, 2, 0, 1, 3]
```

## Grouped allele counts

Filename: `grouped_allele_counts.json`

Consider a single read which maps exactly to the PRG multiple times. Let's refer to each distinct mapping of a single read as a "mapping instance". When two different mapping instances of the read overlap a common site, the overlapped alleles are grouped together. Mapping coverage counts are then aggregated for each allele group.

#### Example

```json
{
    "grouped_allele_counts": {
        "site_counts": [
            {
                "0": 10,
                "1": 3,
                "14": 10
            },
            {
                "3": 30,
                "2": 2,
                "14": 1
            }
        ],
        "allele_groups": {
            "0": [0, 2],
            "1": [0, 2, 3],
            "2": [0, 2, 4],
            "3": [2, 5],
            "14": [7, 8]
        }
    }
}
```

The general structure is:

```json
{
    "grouped_allele_counts": {
        "site_counts": [
            {
                "<allele_group_id>": <count>,
                ...
            },
            ...
        ],
        "allele_groups": {
            "<allele_group_id>": [<allele_id>, ...],
            ...
        }
    }
}
```

The allele IDs and count of each allele group at a site can be looked up as follows:

```python
import json

with open("grouped_allele_counts.json") as handle:
    data = json.load(handle)

grouped_allele_counts = data["grouped_allele_counts"]
sites = grouped_allele_counts["site_counts"]
allele_groups = grouped_allele_counts["allele_groups"]

site = sites[0]
# Each site maps an allele group ID to its coverage count
for allele_group_id, count in site.items():
    allele_ids = allele_groups[allele_group_id]
    print(allele_ids, count)
```

## Allele sum coverage

Filename: `allele_sum_coverage`

This file contains coverage information for each allele within the PRG.
Each row (line) represents a variant site within the PRG. Each column (space-separated within a single line) holds the coverage count of one allele.

#### Example

```
0 0 0
1 0
0 3
```

This example describes the coverage information for three sites. The first site consists of three alleles. The second and third sites each consist of two alleles. Read mapping instances have overlapped the first allele of the second site once (hence: 1). Similarly, read mapping instances have overlapped the second allele of the third site three times.
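
The row and column layout described above can be read into a nested list with a few lines of Python. This is a minimal sketch, not part of the tool itself: the function name `parse_allele_sum_coverage` is hypothetical, and skipping blank lines is an assumption about how stray whitespace should be handled.

```python
from typing import List


def parse_allele_sum_coverage(text: str) -> List[List[int]]:
    """Return one list of allele coverage counts per site (hypothetical helper)."""
    sites = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Assumption: blank lines carry no site and are ignored
            continue
        # Columns are space separated; each column is one allele's count
        sites.append([int(count) for count in line.split()])
    return sites


# The example file contents shown above, one site per line
example = "0 0 0\n1 0\n0 3\n"
coverage = parse_allele_sum_coverage(example)

# First allele of the second site, then second allele of the third site
print(coverage[1][0], coverage[2][1])
```

Indexing is zero-based, so `coverage[1][0]` refers to the first allele of the second site.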