Future work: weighting HMM matching by mutational probability #431

Open
hyanwong opened this issue Dec 4, 2024 · 2 comments

Comments

@hyanwong
Contributor

hyanwong commented Dec 4, 2024

This is too complex to implement for the final paper, but it would probably be quite easy to weight the HMM to account for the probability of different sorts of mutation. I imagine we would keep the current overall weighting of e.g. 5 mutations to 1 recombination, but weight the individual mutation types so that some count for more than one unit and some for less, with the mean being 1. Then I think we wouldn't have to tweak the cutoffs again.
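
To make the "mean of 1" idea concrete, here is a minimal sketch rather than anything from the codebase. It assumes (my assumption, not stated above) that a mutation's weight is proportional to the negative log of its estimated probability, and that "mean" is the frequency-weighted mean over observed mutations. The function name `mutation_weights` and the counts are hypothetical and purely illustrative.

```python
import math


def mutation_weights(counts):
    """Return a weight per mutation type with a frequency-weighted mean of 1.

    Assumption: weight is proportional to -log(probability), so rare
    mutation types cost more than one unit and common types cost less.
    """
    total = sum(counts.values())
    probs = {k: c / total for k, c in counts.items() if c > 0}
    raw = {k: -math.log(p) for k, p in probs.items()}
    mean_raw = sum(probs[k] * raw[k] for k in probs)
    if mean_raw == 0:
        # Only one mutation type was observed: fall back to unit weights.
        return {k: 1.0 for k in probs}
    return {k: r / mean_raw for k, r in raw.items()}


# Illustrative counts only (not taken from the paper): C>T is 40x as
# common as G>C or C>G.
example_counts = {
    ("C", "T"): 400,
    ("G", "T"): 100,
    ("A", "G"): 50,
    ("G", "C"): 10,
    ("C", "G"): 10,
}
weights = mutation_weights(example_counts)
```

Because the average cost of an observed mutation stays at one unit under this normalisation, the existing 5:1 mutation-to-recombination weighting would be unchanged on average, which is the property that should let the cutoffs stay as they are.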

We could do this iteratively, using the find_problematic ARG as a first pass to estimate the probabilities of each type of SNP mutation, then weighting the HMM using those probabilities.
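
The first-pass estimation step could look something like the sketch below. It assumes the mutations in the first-pass ARG have already been extracted as (parent state, derived state) pairs; that extraction is not shown. The function name `estimate_mutation_probs` and the pseudocount are hypothetical choices for illustration.

```python
from collections import Counter
from itertools import permutations


def estimate_mutation_probs(transitions, alphabet="ACGT", pseudocount=1.0):
    """Estimate a probability for each ordered SNP mutation type.

    `transitions` is an iterable of (parent_state, derived_state) pairs.
    Pairs involving states outside `alphabet` (e.g. gaps) are ignored.
    The pseudocount keeps unobserved types at a non-zero probability.
    """
    counts = Counter(
        (a, b)
        for a, b in transitions
        if a != b and a in alphabet and b in alphabet
    )
    types = list(permutations(alphabet, 2))  # the 12 ordered SNP types
    total = sum(counts[t] + pseudocount for t in types)
    return {t: (counts[t] + pseudocount) / total for t in types}


# Illustrative first-pass mutations only; the gap state is skipped.
first_pass = [("C", "T")] * 40 + [("A", "G")] * 5 + [("G", "C")] + [("C", "-")]
probs = estimate_mutation_probs(first_pass)
```

The resulting probabilities could then feed a second matching pass, and the cycle repeated if the estimates change noticeably.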

I was motivated to think of this because of the large range of probabilities of the different SARS-CoV-2 mutation types in https://academic.oup.com/mbe/article/40/4/msad085/7113660:

[Image: Screenshot 2024-12-04 at 12 40 59]

We see 40x more C->T mutations than e.g. G->C or C->G.

@jeromekelleher
Owner

I think this would have to go into the HMM implementation itself to work properly, but yes, definitely a worthwhile refinement for the future.
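
As a rough illustration of what "in the HMM implementation itself" could mean, here is a toy Li-Stephens-style Viterbi in which the emission probability depends on the specific parent-to-derived change. This is a sketch under stated assumptions, not the tsinfer, sc2ts, or lshmm implementation: the names `viterbi_path`, `make_emission`, and `uniform_q`, and all parameter values, are hypothetical.

```python
import math


def uniform_q(alphabet="ACGT"):
    """q[a][b]: probability of derived allele b given a mutation away from a."""
    return {a: {b: 1 / 3 for b in alphabet if b != a} for a in alphabet}


def make_emission(mu, q):
    """Emission probability: match with 1 - mu, else mu * q[ref][obs]."""
    def emission(ref, obs):
        return 1 - mu if ref == obs else mu * q[ref][obs]
    return emission


def viterbi_path(panel, query, rho, emission):
    """Most likely copying path of `query` through the haplotypes in `panel`."""
    n, m = len(panel), len(query)
    log_stay = math.log(1 - rho + rho / n)
    log_switch = math.log(rho / n)
    V = [math.log(1 / n) + math.log(emission(h[0], query[0])) for h in panel]
    back = []
    for s in range(1, m):
        best_prev = max(range(n), key=lambda j: V[j])
        new_V, pointers = [], []
        for j in range(n):
            stay = V[j] + log_stay
            switch = V[best_prev] + log_switch
            score, ptr = (stay, j) if stay >= switch else (switch, best_prev)
            new_V.append(score + math.log(emission(panel[j][s], query[s])))
            pointers.append(ptr)
        V = new_V
        back.append(pointers)
    j = max(range(n), key=lambda j: V[j])
    path = [j]
    for pointers in reversed(back):  # trace back from the last site
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


panel = ["CCCC", "TCCG"]
query = "TCCC"  # one C>T away from panel[0], one G>C away from panel[1]

# Case 1: C>T is common and G>C is rare (illustrative values only).
q_ct = uniform_q()
q_ct["C"] = {"T": 0.9, "A": 0.05, "G": 0.05}
q_ct["G"] = {"A": 0.5, "T": 0.45, "C": 0.05}
path_ct = viterbi_path(panel, query, rho=1e-3, emission=make_emission(1e-2, q_ct))

# Case 2: the reverse, G>C common and C>T rare.
q_gc = uniform_q()
q_gc["C"] = {"T": 0.05, "A": 0.5, "G": 0.45}
q_gc["G"] = {"C": 0.9, "A": 0.05, "T": 0.05}
path_gc = viterbi_path(panel, query, rho=1e-3, emission=make_emission(1e-2, q_gc))
```

With an unweighted mismatch cost the two single-mutation explanations are tied; with type-specific emissions the chosen parent follows whichever mutation type is more probable, so `path_ct` copies from `panel[0]` throughout and `path_gc` from `panel[1]`.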

@szhan
Contributor

szhan commented Dec 5, 2024

We could easily create some tests in https://github.com/astheeggeggs/lshmm before implementing. I think we have talked on and off about implementing this extension before.
