Showing 1 changed file with 30 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
Indexing into a graph is one of the most challenging parts of working with graph genomes. As an example, take the following | ||
reference genome with variants found in a sample: | ||
|
||
```mermaid | ||
flowchart | ||
subgraph sample-1 | ||
ATC --> GAA | ||
GAA --> TTGCATG | ||
ATC ---|deletion|TTGCATG | ||
TTGCATG ---|insertion-1|AAA --> ACATACA | ||
TTGCATG ---|insertion-2| CAAGA --> ACATACA | ||
end | ||
subgraph reference-genome | ||
ATCGAATTGCATGACATACA | ||
end | ||
``` | ||
|
||
Now, suppose we want to engineer sample-1 by inserting sequence at the `CATA` position. How | ||
can this position be addressed in a linear coordinate space? Because variants give each chromosome's path a different | ||
length, there is no common reference frame for this position. To address this, the following | ||
conventions are used. | ||
|
||
* If the region is not altered, the base genome can be referenced explicitly. For example, within sample-1, `CATA` can | ||
be referenced as position 15-19 (1-based start, exclusive end), as that region is not impacted by any variants. | ||
* If the organism has a single copy of the genetic material, sample-1 can be accessed assuming all variants have been | ||
incorporated. Thus, if the reference genome has an `A` at position 100 and there is a 1 base pair deletion at position 90, | ||
that `A` can be referenced as position 99 in sample-1. | ||
* To refer to positions within alterations, where no unambiguous coordinate system exists, `accessions` can be used | ||
to provide named regions for modifications. Coordinates within an accession are relative to the accession itself. | ||
Therefore, if we named the `CAAGA` insertion `insertion-2`, the name `insertion-2:3-5` would refer to the bases `AG`.
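
The three conventions above can be sketched in a few lines of Python. This is an illustration only, not the project's API: the names `parse_range`, `fetch`, and `shift_position` are hypothetical, and so is the `(start, deleted_len, inserted_len)` variant tuple. The sketch assumes ranges are written with a 1-based start and an exclusive end, which is the reading under which `15-19` yields `CATA` and `insertion-2:3-5` yields `AG`.

```python
REFERENCE = "ATCGAATTGCATGACATACA"


def parse_range(coords):
    """Turn 'start-end' (assumed 1-based start, exclusive end) into a slice."""
    start, end = (int(part) for part in coords.split("-"))
    return slice(start - 1, end - 1)


def fetch(spec, accessions, reference=REFERENCE):
    """Resolve 'start-end' against the reference, or 'name:start-end'
    against a named accession (coordinates relative to the accession)."""
    if ":" in spec:
        name, coords = spec.split(":", 1)
        sequence = accessions[name]
    else:
        coords = spec
        sequence = reference
    return sequence[parse_range(coords)]


def shift_position(ref_pos, variants):
    """Map a 1-based reference position into a haploid sample's coordinates.

    Each variant is a hypothetical (start, deleted_len, inserted_len) tuple,
    where start is the 1-based reference position of the first affected base.
    Only variants that end at or before ref_pos shift it.
    """
    offset = 0
    for start, deleted_len, inserted_len in variants:
        if start <= ref_pos < start + deleted_len:
            raise ValueError("position lies inside an altered region")
        if start + deleted_len <= ref_pos:
            offset += inserted_len - deleted_len
    return ref_pos + offset


accessions = {"insertion-2": "CAAGA"}
unaltered = fetch("15-19", accessions)             # unaltered reference region
named = fetch("insertion-2:3-5", accessions)       # accession-relative range
shifted = shift_position(100, [(90, 1, 0)])        # 1 bp deletion at position 90
```

Raising an error for positions that fall inside an altered region reflects the third convention: those positions have no unambiguous linear coordinate, so they would need to be addressed through an accession instead.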