-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
possible bug in ace #63
Comments
Hello, (XY)' = Y'X' As a side note, R treats a vector either as a 1-column matrix or as a 1-row matrix depending on the order of the terms in a matrix product: R> (P <- matrix(c(.9, .1, .1, .9), 2))
[,1] [,2]
[1,] 0.9 0.1
[2,] 0.1 0.9
R> x <- c(.1, .9)
R> x %*% P
[,1] [,2]
[1,] 0.18 0.82
R> P %*% x
[,1]
[1,] 0.18
[2,] 0.82 If you observed differences, they could be caused by the order of terms. Here's a small example based on the above equality: R> p <- 9
R> X <- matrix(runif(p^2), p)
R> Y <- matrix(runif(p^2), p)
R> mean((t(Y) %*% t(X)) - t(X %*% Y))
[1] -1.918904e-17 Generally if p is odd there is this kind of deviation, and for p even > 60. But if you observe more substantial differences, please report them. |
Hi, thank you! Yep, I did mean matrix product, not dot product, sorry! I think I was more wondering about the case when P is not symmetrical (which I think can happen when the rate model is "ARD"). lik.anc[des1, ] %*% P If in the above code lik.anc[des1, ] is X and P is Y, then I think I had changed it from XY to (YX)' = X'Y' which would only be the same as XY when Y = Y' or when P is symmetric. Thanks again! |
Hi, lik.anc[des1, ] <- (tmp %*% P) * lik.anc[des1, ] |
Hi, thank you! I don't think the correction needs to be applied to the next line. Since in this line, you are updating the elements of the vector lik.anc[des1,] based on the value of lik.anc[anc,], I think it is correct to multiply by the columns of P. This way the value of lik.anc[des1, i] (the likelihood that des1 is in state i) is based off column i of P which represents the probability of transitioning from different possible states at the anc node to a fixed state i of des1. In the previous line for which I suggested the correction, each element of tmp, tmp[i], corresponds to lik.anc[anc, i] since lik.anc[anc,] is in the numerator. However, it is using the same P matrix in which the states in the row correspond to the state of the anc and the states in the column correspond to the states of the des1. (P[i,j] is the probability of transitioning from state i at the anc to state j at the des1). This is why I think it makes sense to multiply by the rows. This would be like fixing the anc in state i and summing over the different possible states of des1. Sorry for any confusion and please correct me if you have a different understanding! Thank you! |
You are correct (I tried to remove any doubt in my mind replacing I tried your fix (from your first post) with the data in Line 283 in 9ce7240
to this one:
and did the same on line 290 with data(bird.orders)
x <- factor(c(rep(0, 5), rep(1, 18)))
ans <- ace(x, bird.orders, type = "d", model = "ARD")
ans$lik.anc Is that expected? |
Hi, sorry for the late reply. I stepped through the example to see where the NaNs are coming from. |
Hello, I think I may have found a bug in ace. This bug comes up when calculating ancestral likelihoods with type = "discrete" and marginal = FALSE.
I believe the bug occurs in the following line of code from the second pass of the double pass algorithm:
tmp <- lik.anc[anc, ] / (lik.anc[des1, ] %*% P)
I noticed that the dot product in the denominator uses the probability of transitioning from the state at the descendant node to the state at the ancestral node. However, I think it should use the probability of transitioning from the state at the ancestral node to the state at the descendant node.
Please correct me if I am wrong, but my reasoning is that the dot product used above is the opposite direction of the dot product in the forward pass, and though the direction of traversal through the tree changes, I don't think the dot product should be flipped because the direction of evolution is the same. This would only impact the results when the rate model is asymmetrical (e.g. model = "ARD").
I modified the line of code by changing the direction of the dot product and taking the transpose, as follows:
tmp <- t(lik.anc[anc, ] / (P %*% lik.anc[des1, ]))
I tested this change on a very small tree for which I calculated ancestral likelihoods (summing over all possible states at every other node) by brute force to obtain an exact answer. With the small modification I made, ace was able to reproduce the exact answer whereas before it was slightly off.
Additionally, writing out the math that the code is doing, I believe that flipping the direction of the dot product gives an equation that can be rearranged to give the exact answer for the ancestral likelihood from Yang et. al. (1995).
Thank you so much. Please let me know if you think I have made any mistakes or misinterpretations.
The text was updated successfully, but these errors were encountered: