forked from astheeggeggs/lshmm
Showing 1 changed file with 15 additions and 7 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,27 @@ | ||
# lshmm | ||
This is a Python library for prototyping and testing implementations of algorithms using the Li & Stephens (2003) HMM. | ||
**lshmm** is a Python library for prototyping, experimenting with, and testing implementations of algorithms using the Li & Stephens (2003) Hidden Markov Model. | ||
|
||
## Usage | ||
|
||
### Inputs | ||
Reference panel contains sample and/or (partial) ancestral haplotypes. | ||
#### Data | ||
* Sample and/or ancestral haplotypes comprising a reference panel. | ||
* Query haplotypes. | ||
|
||
### Demo | ||
Forwards algorithm | ||
Backwards algorithm | ||
In haploid mode, the alleles in haplotypes can be represented by any integer value (besides `-1` and `-2`, which are special values). In diploid mode, the genotypes (encoded as allele dosages) can be `0` (homozygous for the reference allele), `1` (heterozygous), or `2` (homozygous for the alternative allele). Currently, multiallelic sites are supported only in haploid mode, not in diploid mode. | ||
|
||
Note that there are two special values, `NONCOPY` and `MISSING`. `NONCOPY` (or `-2`) represents a non-copiable state and can be found only in partial ancestral haplotypes in the reference panel. `MISSING` (or `-1`) represents missing data and can be found only in query haplotypes. | ||
|
||
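The encoding rules above can be illustrated with a minimal sketch. It is not lshmm's API: the function name `check_haploid_inputs` and the `(sites, haplotypes)` array layout are assumptions made for this example; only the values `-1` and `-2` come from the README text.

```python
import numpy as np

# Special values as described in the README text.
MISSING = -1   # missing data, query only
NONCOPY = -2   # non-copiable state, reference panel only


def check_haploid_inputs(ref_panel, query):
    """Validate a haploid panel/query pair (hypothetical helper).

    Assumes ref_panel has shape (num_sites, num_haplotypes) and query
    has shape (num_sites,). Returns the number of copiable haplotypes
    at each site.
    """
    ref_panel = np.asarray(ref_panel)
    query = np.asarray(query)
    if ref_panel.ndim != 2 or query.ndim != 1:
        raise ValueError("expected a 2-D panel and a 1-D query")
    if ref_panel.shape[0] != query.shape[0]:
        raise ValueError("panel and query must cover the same sites")
    if np.any(ref_panel == MISSING):
        raise ValueError("MISSING may appear only in the query")
    if np.any(query == NONCOPY):
        raise ValueError("NONCOPY may appear only in the reference panel")
    # A haplotype can be copied at a site only if it is not NONCOPY there.
    return np.sum(ref_panel != NONCOPY, axis=1)


# Three sites, three haplotypes; the last haplotype is a partial ancestor.
ref_panel = np.array([
    [0, 1, NONCOPY],
    [1, 1, 0],
    [2, 0, NONCOPY],
])
query = np.array([0, MISSING, 2])
copiable_per_site = check_haploid_inputs(ref_panel, query)
```

Here `copiable_per_site` counts the haplotypes available for copying at each site, which is lower wherever a partial ancestral haplotype is `NONCOPY`.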
#### Parameters | ||
* Per-site recombination probabilities. | ||
* Per-site mutation probabilities. | ||
|
||
### Algorithms | ||
Viterbi algorithm | ||
Log-likelihood evaluation of a copying path | ||
|
||
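As a rough illustration of log-likelihood evaluation of a copying path, the sketch below scores one path under a textbook haploid Li & Stephens model. It is not lshmm's implementation: the function name `path_log_likelihood`, the `(sites, haplotypes)` layout, the uniform initial distribution, and the unscaled biallelic emission probabilities are all assumptions, and `NONCOPY` states are not handled.

```python
import math

import numpy as np

MISSING = -1  # missing data in the query, as described in the README


def path_log_likelihood(ref_panel, query, path, r, mu):
    """Log-likelihood of copying `path` through `ref_panel` (a sketch).

    ref_panel: (num_sites, num_haps) integer alleles.
    query:     (num_sites,) integer alleles, possibly MISSING.
    path:      (num_sites,) index of the haplotype copied at each site.
    r, mu:     per-site recombination and mutation probabilities.
    Assumes 0 < mu < 1 and 0 < r[site] <= 1 wherever the path switches.
    """
    ref_panel = np.asarray(ref_panel)
    num_sites, num_haps = ref_panel.shape
    # Uniform prior over the haplotype copied at the first site.
    log_lik = math.log(1.0 / num_haps)
    for site in range(num_sites):
        hap = path[site]
        if site > 0:
            switch = r[site] / num_haps
            if hap == path[site - 1]:
                # Stay: no recombination, or recombine back to the same haplotype.
                log_lik += math.log((1.0 - r[site]) + switch)
            else:
                log_lik += math.log(switch)
        if query[site] != MISSING:
            is_match = ref_panel[site, hap] == query[site]
            log_lik += math.log(1.0 - mu[site] if is_match else mu[site])
        # A MISSING query allele contributes an emission probability of 1.
    return log_lik


ref_panel = np.array([[0, 1], [0, 1]])
log_lik = path_log_likelihood(
    ref_panel, query=[0, 1], path=[0, 1], r=[0.0, 0.2], mu=[0.1, 0.1]
)
```

In this example the path switches haplotype once and matches the query at both sites, so the likelihood is the product 0.5 × 0.9 × 0.1 × 0.9.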
### Features | ||
* Scaling of mutation rate by the number of distinct alleles per site. | ||
* Non-copiable allelic state in the reference panel (`NONCOPY`). | ||
* Missing allelic state in the query (`MISSING`). | ||
* Non-copiable state in the reference panel (`NONCOPY`). | ||
* Missing state in the query (`MISSING`). | ||
* Multiallelic sites. |
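
One way to read "scaling of mutation rate by the number of distinct alleles per site" is that the mismatch probability is split evenly across the alternative alleles. The sketch below follows that reading; it is an assumption rather than lshmm's confirmed behaviour, and the function names are hypothetical.

```python
import numpy as np

MISSING = -1
NONCOPY = -2


def count_distinct_alleles(ref_panel):
    """Count distinct alleles per site, ignoring the special values.

    Assumes ref_panel has shape (num_sites, num_haps).
    """
    ref_panel = np.asarray(ref_panel)
    counts = []
    for site_alleles in ref_panel:
        real = site_alleles[(site_alleles != MISSING) & (site_alleles != NONCOPY)]
        counts.append(len(np.unique(real)))
    return np.array(counts)


def scaled_emission_probs(mu, num_alleles):
    """Return (match, mismatch) emission probabilities per site.

    Assumed convention: with a distinct alleles at a site, each of the
    a - 1 mismatching alleles gets mu / (a - 1) and a match gets 1 - mu.
    Invariant sites (a <= 1) get mismatch 0 and match 1.
    """
    mu = np.asarray(mu, dtype=float)
    num_alleles = np.asarray(num_alleles)
    variable = num_alleles > 1
    mismatch = np.divide(
        mu, num_alleles - 1, out=np.zeros_like(mu), where=variable
    )
    match = np.where(variable, 1.0 - mu, 1.0)
    return match, mismatch


ref_panel = np.array([
    [0, 1, 2],   # triallelic site
    [0, 0, 1],   # biallelic site
    [0, 0, 0],   # invariant site
])
num_alleles = count_distinct_alleles(ref_panel)
match, mismatch = scaled_emission_probs([0.3, 0.3, 0.3], num_alleles)
```

Under this convention the emission probabilities at each site sum to one over all alleles observed there, regardless of how many alleles the site has.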