-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of inverse trigamma #415
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportBase: 93.63% // Head: 93.61% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #415 +/- ##
==========================================
- Coverage 93.63% 93.61% -0.03%
==========================================
Files 12 12
Lines 2767 2789 +22
==========================================
+ Hits 2591 2611 +20
- Misses 176 178 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution and apologies for the delayed review! This is looking good overall, just some comments and questions. Also, I believe you'll need this (or something like it) in order for e.g. Float32
input to work:
diff --git a/src/gamma.jl b/src/gamma.jl
index 00fea79..ba7e5d8 100644
--- a/src/gamma.jl
+++ b/src/gamma.jl
@@ -515,7 +515,7 @@ for T in (Float16, Float32)
@eval f64(x::Complex{$T}) = Complex{Float64}(x)
@eval f64(x::$T) = Float64(x)
- for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma)
+ for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma, :_invtrigamma)
@eval $f(z::ComplexOrReal{$T}) = oftype(z, $f(f64(z)))
end
invtrigamma(x) | ||
Compute the inverse [`trigamma`](@ref) function of `x`. | ||
""" | ||
invtrigamma(y::Number) = _invtrigamma(float(y)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invtrigamma(y::Number) = _invtrigamma(float(y)) | |
invtrigamma(x::Number) = _invtrigamma(float(x)) |
It's kind of breaking my brain that what we're calling x
and y
are the reverse of how they're used in the paper you linked.
""" | ||
invtrigamma(y::Number) = _invtrigamma(float(y)) | ||
|
||
function _invtrigamma(y::Float64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function _invtrigamma(y::Float64) | |
function _invtrigamma(x::Float64) |
(See above)
invtrigamma(x) | ||
Compute the inverse [`trigamma`](@ref) function of `x`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invtrigamma(x) | |
Compute the inverse [`trigamma`](@ref) function of `x`. | |
invtrigamma(x) | |
Compute the inverse of the [`trigamma`](@ref) function at the positive real value `x`. | |
This is the solution `y` to the equation `trigamma(y) = x`. |
The line break is for consistency with other docstrings. I realize the text is modified from that of invdigamma
but in this case I think it's worth noting the restriction on the domain of x
. The added line is just for some extra clarity.
if y <= 0 | ||
throw(DomainError(y, "Only positive `y` supported.")) | ||
end | ||
|
||
if y > 1e7 | ||
return inv(sqrt(y)) | ||
elseif y < 1e-6 | ||
return inv(y) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if y <= 0 | |
throw(DomainError(y, "Only positive `y` supported.")) | |
end | |
if y > 1e7 | |
return inv(sqrt(y)) | |
elseif y < 1e-6 | |
return inv(y) | |
end | |
if x <= 0 | |
throw(DomainError(x, "`x` must be positive.")) | |
elseif x > 1e7 | |
return inv(sqrt(x)) | |
elseif x < 1e-6 | |
return inv(x) | |
end |
This brings the error message more in line with the text used in other DomainError
messages in this package, e.g. from the Bessel functions. Condensing the conditional is just for brevity.
x_old = inv(y) + 0.5 | ||
x_new = x_old | ||
|
||
# Newton iteration | ||
δ = Inf | ||
iteration = 0 | ||
while δ > 1e-8 && iteration <= 25 | ||
iteration += 1 | ||
f_x_old = trigamma(x_old) | ||
δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old) | ||
x_new = x_old + δx | ||
δ = - δx / x_new | ||
x_old = x_new | ||
end | ||
|
||
return x_new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x_old = inv(y) + 0.5 | |
x_new = x_old | |
# Newton iteration | |
δ = Inf | |
iteration = 0 | |
while δ > 1e-8 && iteration <= 25 | |
iteration += 1 | |
f_x_old = trigamma(x_old) | |
δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old) | |
x_new = x_old + δx | |
δ = - δx / x_new | |
x_old = x_new | |
end | |
return x_new | |
# Newton iteration | |
invx = inv(x) | |
y_new = y_old = invx + 0.5 | |
for _ in 1:26 | |
ψ′ = trigamma(y_old) | |
δ = ψ′ * (1 - ψ′ * invx) / polygamma(2, y_old) | |
y_new = y_old + δ | |
-δ / y_new < 1e-8 && break | |
y_old = y_new | |
end | |
return y_new |
AFAICT this is equivalent and more directly maps to the paper, as it avoids introducing a second step size variable. You can also avoid dividing by the input at every iteration by inverting once then multiplying by the inverse.
I assume the number of iterations was chosen to match invdigamma
. Do you know whether the algorithm generally converges within the given number of iterations and under what circumstances it may not? When it doesn't, how inaccurate is the result?
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some excess space
This PR implements the function
invtrigamma
, which is the inverse oftrigamma
.The implementation follows the Appendix "Inversion of Trigamma Function"
in "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments"
by Gordon K. Smyth (2004). (The pdf is available, e.g., here.)
In implementing this I also tried to follow the
invdigamma
code.