Proposal for coordinate use
Chris7 committed Oct 24, 2024
1 parent dd8450c commit 7c9f9fa
Showing 1 changed file with 30 additions and 0 deletions.
30 changes: 30 additions & 0 deletions docs/coordinates.md
@@ -0,0 +1,30 @@
Indexing into a graph is one of the most challenging parts of working with graph genomes. As an example, take the following
reference genome with variants found in a sample:

```mermaid
flowchart
subgraph sample-1
ATC --> GAA
GAA --> TTGCATG
ATC ---|deletion|TTGCATG
TTGCATG ---|insertion-1|AAA --> ACATACA
TTGCATG ---|insertion-2| CAAGA --> ACATACA
end
subgraph reference-genome
ATCGAATTGCATGACATACA
end
```

Now, suppose we want to engineer sample-1 and insert a sequence at the `CATA` position. How can this position be
addressed in a linear coordinate space? Because variants make each chromosome's path a different length, the paths
share no common reference frame for this position. To address this, the following conventions are used.
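
To make the problem concrete, the following Python sketch spells out the reference and two of the paths through
sample-1, then reports where `CATA` begins on each. The path labels and the `paths` dictionary are illustrative only
and not part of any tool. Offsets are plain 0-based Python string indices, not a proposed coordinate convention.

```python
# Segments are taken from the diagram above; the path labels are hypothetical.
paths = {
    "reference": ["ATC", "GAA", "TTGCATG", "ACATACA"],
    "insertion-1": ["ATC", "GAA", "TTGCATG", "AAA", "ACATACA"],
    "deletion + insertion-1": ["ATC", "TTGCATG", "AAA", "ACATACA"],
}

# Join each path into a linear sequence and locate CATA on it.
offsets = {}
for name, segments in paths.items():
    sequence = "".join(segments)
    offsets[name] = (len(sequence), sequence.find("CATA"))

for name, (length, start) in offsets.items():
    print(f"{name}: length={length}, CATA starts at offset {start}")
```

The same four bases do not land at one fixed offset across paths, so a bare linear coordinate cannot identify them
without also saying which path it is measured along.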

* If the region is not altered, the base genome can be referenced explicitly. For example, within sample-1, `CATA` can
be referenced as positions 15-19, as that region is not impacted by any variants.
* If the organism has a single copy of the genetic material, sample-1 can be addressed as a linear sequence with all of
its variants incorporated. Thus, if the reference genome has an `A` at position 100 and there is a 1 basepair deletion
at position 90, that `A` can be referenced in sample-1 as position 99.
* To refer to positions within alterations, where no unambiguous coordinate system exists, `accessions` can be used
to provide named regions for modifications. Coordinates within an accession are relative to the accession itself.
Therefore, if we named the `CAAGA` insertion `insertion-2`, the name `insertion-2:3-5` would refer to the bases `AG`.
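
The three conventions can be sketched in Python as follows. This is a minimal illustration, not an implementation
from this project: `shift_position`, `resolve`, and the `ACCESSIONS` registry are hypothetical names. The text does not
state a coordinate convention, so the sketch assumes a 1-based start with an exclusive end, which is the reading that
reproduces both examples above (`CATA` at 15-19 and `AG` at `insertion-2:3-5`).

```python
REFERENCE = "ATCGAATTGCATGACATACA"

# Hypothetical registry mapping accession names to their sequences.
ACCESSIONS = {"insertion-2": "CAAGA"}


def shift_position(ref_pos, variants):
    """Map a reference position onto a haploid sample's linear sequence.

    `variants` is a list of (position, length_change) pairs, with a negative
    length_change for deletions. Only variants upstream of `ref_pos` shift it.
    Simplification: variants that overlap `ref_pos` itself are not handled.
    """
    return ref_pos + sum(delta for pos, delta in variants if pos < ref_pos)


def resolve(region):
    """Return the bases named by 'start-end' or 'accession:start-end'.

    Assumption: start is 1-based and end is exclusive.
    """
    name, _, span = region.rpartition(":")
    start, end = (int(part) for part in span.split("-"))
    # No accession name means the coordinates refer to the base genome.
    sequence = ACCESSIONS[name] if name else REFERENCE
    return sequence[start - 1:end - 1]


print(resolve("15-19"))                   # unaltered region, base genome
print(shift_position(100, [(90, -1)]))    # 1 bp deletion upstream of position 100
print(resolve("insertion-2:3-5"))         # coordinates relative to the accession
```

Keeping accession coordinates relative to the accession means they remain valid regardless of which other variants
are present on the same path.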
