Position of 11288-11296 deletion #256
Comments
It's also a 3-way recomb that's just snipping out the section of genome with the deletion, isn't it? Seems pretty unlikely 👍
So, repeating the same test, I added deletion-mutations at the BA.1 origin node in the pre-BA.2-treeseq, using the same positions as in BA.2 (which are the same as Alpha). I then tried matching sample SRR17712694, which is the nearest BA.2 sample to the BA.2 origination node in the current ARG. As I expected, this is more equivocal than the Alpha / BA.1 case: we only require weighting the deletion as equivalent to 3 or more SNPs to find that there is a single origin. In all settings, the BA.2 sample is treated as a recombinant of B.1 and BA.1.1.
The notebook is in #257
Hmm, that might be true for Omicron vs Alpha, but I'm not sure about BA.1 and BA.2. The question there is whether a 9bp deletion should be weighted as equivalent to 3 SNPs. It seems plausible to me that a specific 9bp deletion is rarer than getting 3 SNPs, I think? I don't know what @IsobelGuthrie thinks?
Amazing. The key thing is that the recombinant origins are stable. We can discuss the uncertainty around the deletions, but it's totally fine to say a full investigation requires future work.
Our comments are crossing each other here, I was talking about the first example. I just commented above on the second. Fundamentally we don't really care about the deletions for this paper once they don't muck up the ARG topology.
Ah, sorry for the miscommunication
Here's a more detailed breakdown. It appears as if we do get a (slightly) different topology from forward & backward passes, even without looking at deletions (B.1+BA.1.1 vs B.1.1+BA.1.1).
That's good to know, thanks. Presumably what's happening here is that there are quite a few close solutions to the HMM and there's a bit of uncertainty about where the left, very deep, parent comes from. This seems quite reasonable, and reassuring in some ways. This may clear up a bit in the next version where we get rid of the top 100 recurrent mutation sites. Such a long branch is going to have a bunch of unseen mutations on these sites, which is perhaps confusing things. Overall, this is extremely encouraging to me!
Opening up a new issue for this, following on from @jeromekelleher's suggestion in #249 (comment), and to avoid bloating that issue thread.
I tried out running the sc2ts matching on a "synthetic" BA.1 haplotype (ERR7602255), which had the deletion shifted to the right by 5bp, to align with the Alpha deletion. To make this match against an Alpha strain in the deletion region, you need to treat the deletion as being 5 times more costly than a single mutation: i.e. you treat it as 5 separate SNPs, and match using the following alignments.
In this case, BA.1 becomes a recombinant with the deleted section coming from an Alpha strain:
If you treat a deletion as 4 separate SNPs, we revert to assuming that BA.1 is not a recombinant, but a direct descendant and the deletion(s) have happened independently.
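To make the 4-vs-5 tipping point concrete, here is a toy sketch of a minimum-cost copying path (a Viterbi-style dynamic programme). It is not the sc2ts HMM: the function name, the 7-site haplotypes, and the switch cost of 2 mismatches are all hypothetical, chosen only so that the toy flips between a deletion weight of 4 and 5 in the same way as described above. The deletion is a single site (`-`) whose mismatch weight stands in for "the number of SNPs it counts as".

```python
def best_copying_path(query, panel, site_weights, switch_cost):
    """Cheapest way to copy `query` from haplotypes in `panel`.

    Cost = sum of site weights at mismatching sites
           + switch_cost for every change of copying parent.
    Ties are resolved in favour of staying on the current parent.
    """
    names = list(panel)
    cost = {h: site_weights[0] * (panel[h][0] != query[0]) for h in names}
    back = []  # back[i - 1][h] = parent copied at site i - 1, given h at site i
    for i in range(1, len(query)):
        best_prev = min(names, key=lambda h: cost[h])
        new_cost, pointers = {}, {}
        for h in names:
            stay = cost[h]
            switch = cost[best_prev] + switch_cost
            if switch < stay:  # strict: only switch when it is really cheaper
                prev, base = best_prev, switch
            else:
                prev, base = h, stay
            new_cost[h] = base + site_weights[i] * (panel[h][i] != query[i])
            pointers[h] = prev
        back.append(pointers)
        cost = new_cost
    # Trace back from the cheapest final state.
    h = min(names, key=lambda x: cost[x])
    total = cost[h]
    path = [h]
    for pointers in reversed(back):
        h = pointers[h]
        path.append(h)
    path.reverse()
    return path, total


# Hypothetical toy data: site 3 is the deletion.
panel = {
    "direct": "ACGGTCA",  # close relative without the deletion
    "alpha":  "TTT-GGG",  # distant strain carrying the deletion
}
query = "ACG-TCA"         # sample: looks like "direct" plus the deletion


def run(deletion_weight, switch_cost=2):
    weights = [1, 1, 1, deletion_weight, 1, 1, 1]
    return best_copying_path(query, panel, weights, switch_cost)
```

With these made-up numbers, `run(5)` copies the deletion site from `alpha` and everything else from `direct` (two switches, total cost 4), whereas `run(4)` stays on `direct` throughout and pays for the deletion as an independent event. The threshold is set by the ratio of the deletion weight to twice the switch cost, so the real threshold depends on the actual matching parameters rather than on anything in this toy.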
I think it's more likely that the deletions in BA.1 and BA.2 have the same origin, so I'll repeat the test on those.
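For anyone wanting to repeat this kind of test, a synthetic haplotype with a shifted deletion could be built roughly as follows. This is only a sketch, not the code used in the notebook: `shift_deletion` is a hypothetical helper, it assumes the aligned sequence contains exactly one contiguous run of `-`, and it fills the vacated positions with reference bases (an assumption, since the sample's true bases there are unknown).

```python
def shift_deletion(aligned, reference, shift):
    """Move the single run of '-' in `aligned` by `shift` bases.

    Positive `shift` moves the deletion to the right. Positions vacated
    by the old deletion are filled from `reference` (an assumption).
    """
    if len(aligned) != len(reference):
        raise ValueError("aligned sequence and reference differ in length")
    start = aligned.index("-")  # raises ValueError if there is no deletion
    end = start
    while end < len(aligned) and aligned[end] == "-":
        end += 1
    if "-" in aligned[end:]:
        raise ValueError("expected a single contiguous deletion")
    length = end - start
    new_start = start + shift
    if new_start < 0 or new_start + length > len(aligned):
        raise ValueError("shifted deletion falls outside the sequence")
    seq = list(aligned)
    seq[start:end] = reference[start:end]                 # undo old deletion
    seq[new_start:new_start + length] = "-" * length      # apply new one
    return "".join(seq)


# Toy example (made-up 20bp sequence, 3bp deletion shifted right by 5bp).
reference = "ACGTACGTACGTACGTACGT"
sample = reference[:3] + "---" + reference[6:]
shifted = shift_deletion(sample, reference, 5)
```

Note that any real SNPs the sample carries inside the new deletion window are overwritten, so the result is only "synthetic" in the sense used above.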