Is Delta (B.1.617.2) a recombinant? #258
Comments
I wonder if we would need to present additional supporting evidence, as Jackson et al. did with the Alpha recombinants. They checked that the proposed parents of a candidate recombinant were co-circulating in the UK around the time the recombinant was sampled. |
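To make the co-circulation idea concrete, here is a minimal sketch of that kind of check. It is not the procedure from Jackson et al.: the function name `co_circulating`, the 30-day window, and all dates below are made-up assumptions for illustration only.

```python
from datetime import date, timedelta


def co_circulating(parent_a_dates, parent_b_dates, recombinant_date, window_days=30):
    """Return True if both parents have at least one sample collected
    within `window_days` of the recombinant's sampling date.

    Hypothetical helper: the window size is an arbitrary assumption.
    """
    window = timedelta(days=window_days)

    def seen_nearby(dates):
        # A parent counts as "circulating" if any sample falls in the window.
        return any(abs(d - recombinant_date) <= window for d in dates)

    return seen_nearby(parent_a_dates) and seen_nearby(parent_b_dates)


# Made-up collection dates, purely illustrative.
parent_a = [date(2021, 1, 5), date(2021, 1, 20)]
parent_b = [date(2021, 1, 12)]
sparse_parent = [date(2020, 9, 1)]
recombinant = date(2021, 1, 25)

print(co_circulating(parent_a, parent_b, recombinant))
print(co_circulating(parent_a, sparse_parent, recombinant))
```

With sparse sampling, a check like this can fail simply because a parent lineage was never sequenced near the relevant date, which is the concern raised for Delta.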
That may not even be possible for Delta, given poor early sampling. |
I don't think that'll be possible here yeah, there's just not enough samples. |
Hmm, that may be why no one has really claimed that Delta is a recombinant, knowing that sampling was simply not enough (it makes me uneasy as well). Likely the same reason that there have only been hypotheses and speculation about the origin of Omicron. |
The Alpha recombinants may be a special case, because COG-UK did intense enough sampling in the UK to afford the analysis of Jackson et al. |
I think it's fine to say that the most parsimonious explanation of our data, given the model and samples we have, is that 617.2 was a recombinant. We can lay out the alternative hypothesis (the match we get with no recombination) and just say that we chose the recombination explanation because we would have had to put a special case in for it to not be a recombinant in the ARG. The key thing here is that we choose our seed samples well, so let's focus on that. |
Thanks @hyanwong, sounds good. I'm running the additional HMM matches currently and will update with some results later on (and a notebook PR). |
[Note: edited this post to delete the content as there were some mistakes. See below for updated details] |
'C5184T' is found in both the solutions. |
I'm tempted to go with the non-recombinant solution to keep the early part of the history more tree-like. Not that I think it is more biologically plausible, but because it is just hard to tell considering that early sampling for Delta is not that great (so err on the safe side?). |
Whoops, scratch the bit above about the forward and reverse paths being consistent. It's a more complex story, I'll update later. |
I disagree, actually. I think the left hand side of Delta is very unlike the highly mutated node 118401 on the RHS, and I don't think better delta sampling would help that. |
Let's wait till we look at the reverse match a bit more closely, I made a mistake above, and it is different to the forward match. |
I've followed through the logic and I'm pretty sure I can hand-craft the delta origin above to make it substantially more parsimonious with the same topology. There is still only one recombinant but I only require 1 recurrent mutation (and no reversions). It would be good to talk this through with someone to check I have my logic right. If so, it's an interesting test case for how you might be able to improve the algorithm, as I think it's an artefact of the way that we can remove multiple mutations after getting the HMM cost. |
I've gone through the details a fair bit in #273 notebook and it's really not obvious to me what the right answer is. I think one thing to point out here is that we are either stating that 617.2 is a recombinant which pulls in some 617 mutations or it's entirely independent of 617 and 617.1. The no-recombination solution entirely bypasses 617. I thought this sounded a bit unlikely, but it seems that we only need to have a handful of recurrent mutations in order to do this, and that they happen to be at sites quite prone to recurring anyway. Here's some details:

forward solution
left_parent = 5299 (B.1.384?), right_parent = 118401 (B.1.617?), breakpoint = 25469 (interval = 25277-25469)
24 mutations

reverse solution
left_parent = 11294 (B.1.1), right_parent = 119685 (B.1.617?), breakpoint = 22023 (note: not coinciding with interval above. Haven't computed interval for reverse match)
24 mutations

no mutation solution
parent = 2910 (B.1)
29 mutations

Mutation overlaps
All of these sets of mutations differ quite a bit. There are 19 mutations shared by all three solutions. The forward and reverse recomb solutions differ by 4 mutations.
In fwd but not reverse:
in reverse but not forward:
There's then 5 mutations that are in the no recombination and not in either of the recomb solutions:
Of these, three are in the characteristic mutations for 617 defined in cov-lineages/pango-designation#38 and two of those come from sites that have well above the average number of mutations (25469 and 28881). So: hmm. 🤔 |
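For anyone wanting to reproduce this kind of overlap comparison, a minimal sketch with plain Python sets is below. The helper name `compare_solutions` and the labels `m1` to `m7` are placeholders, not the actual mutations from the three solutions; the real sets would come from the notebook.

```python
def compare_solutions(solutions):
    """Given a dict mapping solution name -> set of mutation labels,
    return (mutations shared by all solutions,
            dict of mutations unique to each solution)."""
    names = list(solutions)
    shared = set.intersection(*(solutions[n] for n in names))
    unique = {}
    for name in names:
        # Union of every other solution's mutations.
        others = set().union(*(solutions[o] for o in names if o != name))
        unique[name] = solutions[name] - others
    return shared, unique


# Placeholder labels only; these are NOT the real mutation sets.
solutions = {
    "forward": {"m1", "m2", "m3", "m4"},
    "reverse": {"m1", "m2", "m3", "m5"},
    "no_recombination": {"m1", "m2", "m6", "m7"},
}

shared, unique = compare_solutions(solutions)
print("shared by all:", sorted(shared))
for name, muts in unique.items():
    print(f"only in {name}:", sorted(muts))

# Pairwise differences, as in "in fwd but not reverse" above.
print("fwd - rev:", sorted(solutions["forward"] - solutions["reverse"]))
print("rev - fwd:", sorted(solutions["reverse"] - solutions["forward"]))
```

Keeping the mutations as sets of labels makes each question in the comment ("in all three", "in one but not the other") a single set operation.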
As another data point, if we match the 617.2 sample against an ARG that doesn't contain any 617 or 617.1 sequences, we also get the "no mutation solution" above with num_mismatches=4. Given how much uncertainty there is about Delta's origins, I think the simplest thing is to just accept the current recombinant solution, and write up a section in the paper discussing the fact that this is one of a bunch of different potential solutions which we can't really distinguish without better data. |
I roughly agree with this. However, in my comment above (if I'm right) I think I could construct a recombinant solution that reduces the total number of mutations required by 2. The HMM isn't going to spot this, however. It's only found by post-hoc processing. |
Sure - this is potentially worth mentioning in the paper. I think we just have to go with "this was a reasonable best effort to put some verified sequences into the ARG at the right times, and this is what we got under the parameters we're using. A detailed analysis of the origins of Delta using the tools we have provided is an important avenue for future work." |
In most of our sc2ts ARGs, for example the one labelled "DELETEME_testzarr_v4-2021-11-26.ts.il.tsz", Delta (B.1.617.2 / AY.*) is a product of recombination between B.1.617.1 and another, deep-branching lineage, e.g. B.1.384 (a US-only variant). Here's a visual:
If we are claiming that Delta is a recombinant, we should do some due diligence, and also make sure we are seeding the Deltas with a sensible combination of B.1.617, B.1.617.1, B.1.617.2, and perhaps, if we can add one, B.1.617.3: see #226.