Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of inverse trigamma #415

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

nignatiadis
Copy link

This PR implements the function invtrigamma, which is the inverse of trigamma.

The implementation follows the Appendix "Inversion of Trigamma Function"
in "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments"
by Gordon K. Smyth (2004). (The pdf is available, e.g., here.)
In implementing this I also tried to follow the invdigamma code.

@codecov
Copy link

codecov bot commented Dec 2, 2022

Codecov Report

Base: 93.63% // Head: 93.61% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (cc445ea) compared to base (36c547b).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #415      +/-   ##
==========================================
- Coverage   93.63%   93.61%   -0.03%     
==========================================
  Files          12       12              
  Lines        2767     2789      +22     
==========================================
+ Hits         2591     2611      +20     
- Misses        176      178       +2     
Flag Coverage Δ
unittests 93.61% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/SpecialFunctions.jl 100.00% <ø> (ø)
src/chainrules.jl 100.00% <100.00%> (ø)
src/gamma.jl 95.40% <100.00%> (+0.25%) ⬆️
src/beta_inc.jl 92.21% <0.00%> (-0.30%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@ararslan ararslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution and apologies for the delayed review! This is looking good overall, just some comments and questions. Also, I believe you'll need this (or something like it) in order for e.g. Float32 input to work:

diff --git a/src/gamma.jl b/src/gamma.jl
index 00fea79..ba7e5d8 100644
--- a/src/gamma.jl
+++ b/src/gamma.jl
@@ -515,7 +515,7 @@ for T in (Float16, Float32)
     @eval f64(x::Complex{$T}) = Complex{Float64}(x)
     @eval f64(x::$T) = Float64(x)

-    for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma)
+    for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma, :_invtrigamma)
         @eval $f(z::ComplexOrReal{$T}) = oftype(z, $f(f64(z)))
     end

invtrigamma(x)
Compute the inverse [`trigamma`](@ref) function of `x`.
"""
invtrigamma(y::Number) = _invtrigamma(float(y))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
invtrigamma(y::Number) = _invtrigamma(float(y))
invtrigamma(x::Number) = _invtrigamma(float(x))

It's kind of breaking my brain that what we're calling x and y are the reverse of how they're used in the paper you linked.

"""
invtrigamma(y::Number) = _invtrigamma(float(y))

function _invtrigamma(y::Float64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function _invtrigamma(y::Float64)
function _invtrigamma(x::Float64)

(See above)

Comment on lines +402 to +403
invtrigamma(x)
Compute the inverse [`trigamma`](@ref) function of `x`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
invtrigamma(x)
Compute the inverse [`trigamma`](@ref) function of `x`.
invtrigamma(x)
Compute the inverse of the [`trigamma`](@ref) function at the positive real value `x`.
This is the solution `y` to the equation `trigamma(y) = x`.

The line break is for consistency with other docstrings. I realize the text is modified from that of invdigamma but in this case I think it's worth noting the restriction on the domain of x. The added line is just for some extra clarity.

Comment on lines +414 to +422
if y <= 0
throw(DomainError(y, "Only positive `y` supported."))
end

if y > 1e7
return inv(sqrt(y))
elseif y < 1e-6
return inv(y)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if y <= 0
throw(DomainError(y, "Only positive `y` supported."))
end
if y > 1e7
return inv(sqrt(y))
elseif y < 1e-6
return inv(y)
end
if x <= 0
throw(DomainError(x, "`x` must be positive."))
elseif x > 1e7
return inv(sqrt(x))
elseif x < 1e-6
return inv(x)
end

This brings the error message more in line with the text used in other DomainError messages in this package, e.g. from the Bessel functions. Condensing the conditional is just for brevity.

Comment on lines +424 to +439
x_old = inv(y) + 0.5
x_new = x_old

# Newton iteration
δ = Inf
iteration = 0
while δ > 1e-8 && iteration <= 25
iteration += 1
f_x_old = trigamma(x_old)
δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old)
x_new = x_old + δx
δ = - δx / x_new
x_old = x_new
end

return x_new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
x_old = inv(y) + 0.5
x_new = x_old
# Newton iteration
δ = Inf
iteration = 0
while δ > 1e-8 && iteration <= 25
iteration += 1
f_x_old = trigamma(x_old)
δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old)
x_new = x_old + δx
δ = - δx / x_new
x_old = x_new
end
return x_new
# Newton iteration
invx = inv(x)
y_new = y_old = invx + 0.5
for _ in 1:26
ψ′ = trigamma(y_old)
δ = ψ′ * (1 - ψ′ * invx) / polygamma(2, y_old)
y_new = y_old + δ
-δ / y_new < 1e-8 && break
y_old = y_new
end
return y_new

AFAICT this is equivalent and more directly maps to the paper, as it avoids introducing a second step size variable. You can also avoid dividing by the input at every iteration by inverting once then multiplying by the inverse.

I assume the number of iterations was chosen to match invdigamma. Do you know whether the algorithm generally converges within the given number of iterations and under what circumstances it may not? When it doesn't, how inaccurate is the result?

Comment on lines +441 to +442


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Just some excess space

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants