Add SkewNormal distribution #1104

azev77 · 2020-04-23T16:11:31Z

Add SkewNormal distribution based on:

Add Skewnormal distribution based on: * [Skew normal distribution on Wikipedia](https://en.wikipedia.org/wiki/Skew_normal_distribution) * [Discourse](https://discourse.julialang.org/t/skew-normal-distribution/21549/7) * [SkewDist.jl](https://github.com/STOR-i/SkewDist.jl)

codecov-io · 2020-05-09T22:50:58Z

Codecov Report

Merging #1104 into master will decrease coverage by 0.32%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #1104      +/-   ##
==========================================
- Coverage   79.54%   79.21%   -0.33%     
==========================================
  Files         113      114       +1     
  Lines        5787     5811      +24     
==========================================
  Hits         4603     4603              
- Misses       1184     1208      +24

Impacted Files	Coverage Δ
src/univariate/continuous/skewnormal.jl	`0.00% <0.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d7f203...3f40d9c. Read the comment docs.

johnczito · 2020-05-10T03:35:41Z

Thanks! This would be a good add. My thoughts:

I'm not sure about the fact that this implementation comes from a discourse post. At the very least, the source code must include much more explicit attribution.
Needs tests. It would be good to include tests where you check against archived output from Azzalini's sn package. Note that the unit tests for the univariates have a fairly standardized format that you can check out here: Distributions.jl/test/ref/ and Distributions.jl/test/univariates.jl. If it's straightforward to ultimately cast the tests in that mold, that might be nice.
Since this is a well-studied univariate, it should have a cdf method as soon as possible. In the (medium-term?) that would require that someone merge an implementation of Owen's T function into StatsFuns.jl.

azev77 · 2020-05-10T05:08:15Z

@johnczito thanks for responding.
The PR is from wikipedia, @maximerischard's discourse post, @fairbrot's Multivariate pkg.
1 I opened issue #1101 weeks before the PR, where I gave a shoutout to @maximerischard & @fairbrot
2 @fairbrot encouraged this.
3 I haven't heard from @maximerischard. I emailed him on discourse.
4 I'm happy to add more explicit attribution
5 happy to add tests
Let me know anything else I can do.

maximerischard · 2020-05-10T09:16:24Z

Sorry I didn't see your discourse message. I'm very happy that you found my code snippet useful and for some of the code to find its way to Distributions.jl.

matbesancon · 2020-05-23T14:40:29Z

src/univariate/continuous/skewnormal.jl

+    return d.ξ + d.ω * sign(u0) * u1
+end
+
+## Fitting


the commented lines should be removed, they seem to come from another file

matbesancon · 2020-05-23T14:41:10Z

this looks like a cool addition to Distributions. +1 to add tests

codecov-commenter · 2020-06-29T03:36:43Z

Codecov Report

Merging #1104 into master will increase coverage by 0.16%.
The diff coverage is 64.51%.

@@            Coverage Diff             @@
##           master    #1104      +/-   ##
==========================================
+ Coverage   79.54%   79.70%   +0.16%     
==========================================
  Files         113      115       +2     
  Lines        5787     5958     +171     
==========================================
+ Hits         4603     4749     +146     
- Misses       1184     1209      +25

Impacted Files	Coverage Δ
src/Distributions.jl	`100.00% <ø> (ø)`
src/univariates.jl	`61.62% <ø> (-3.63%)`	⬇️
src/univariate/continuous/skewnormal.jl	`64.51% <64.51%> (ø)`
src/multivariate/mvnormal.jl	`70.94% <0.00%> (-0.65%)`	⬇️
src/matrix/matrixreshaped.jl	`90.47% <0.00%> (ø)`
src/truncated/normal.jl	`84.61% <0.00%> (+0.26%)`	⬆️
src/mixtures/mixturemodel.jl	`77.19% <0.00%> (+0.54%)`	⬆️
src/multivariates.jl	`79.41% <0.00%> (+1.63%)`	⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d7f203...8b9a254. Read the comment docs.

azev77 · 2020-06-29T04:33:47Z

@johnczito @matbesancon @fairbrot @maximerischard
I fine-tuned the code & added tests. Please have a look.

maximerischard · 2020-06-29T21:14:25Z

Looks great! I noticed there weren't any tests for rand, which also appears to be lacking for most distributions in the package. As a sanity check I compared a histogram of a million samples to the PDF, and it all looks good.

using PyPlot: PyPlot; plt=PyPlot
using Distributions
d1 = SkewNormal(1, 2, 3)
samples = rand(d1, 10^6);
plt.hist(samples; density=true, bins=1000)
let
    xx = range(-3.0, stop=8.0, length=1000)
    plt.plot(xx, pdf.(d1, xx))
end

I then also compared the empirical to the theoretical moments:

using Statistics
@show mean(samples), mean(d1)
@show std(samples), std(d1)
@show skewness(samples), skewness(d1)
@show kurtosis(samples), kurtosis(d1)
xx = range(-3.0, stop=8.0, length=100000)
@show xx[argmax(pdf.(d1, xx))], mode(d1)

(mean(samples), mean(d1)) = (2.5142092281272057, 2.513879513212096)
(std(samples), std(d1)) = (1.3088404926822659, 1.3069693261422424)
(skewness(samples), skewness(d1)) = (0.6675144945558559, 0.6670235701524081)
(kurtosis(samples), kurtosis(d1)) = (0.5017438999403141, 0.5097701294494131)
(xx[argmax(pdf.(d1, xx))], mode(d1)) = (1.946749467494675, 2.1058875860443687)

Again, all looking good except the mode is slightly off since it is an approximation (as is made clear in the comments in the code).

johnczito · 2020-06-29T22:32:07Z

I noticed there weren't any tests for rand, which also appears to be lacking for most distributions in the package.

Yeah, it's a tricky one, as discussed in #143 and #1142. Anything would have some positive failure rate, even if the code is error free. I include some KS testing in the unit tests for the matrix variates, along with crude checks for the sample mean and covariance, but they're really just regression tests. They happen not to fail for this seed. If they start failing, it's just something to look into.

azev77 · 2020-07-03T00:37:39Z

@johnczito @matbesancon
Are we ready to roll?

johnczito · 2020-07-03T01:32:37Z

Looks good to me. Can you add tests checking that it agrees with Normal in the α = 0 case?

azev77 · 2020-07-03T05:50:58Z

@johnczito @matbesancon
Great idea. I added tests checking that Normal is nested in SkewNormal when α = 0.
Are we ready to roll?

johnczito · 2020-07-03T15:03:40Z

Sorry, for the sake of coverage, can you check every method that Normal and SkewNormal share? So skewness, kurtosis, mgf, and cf as well?

azev77 · 2020-07-03T18:23:34Z

@johnczito done!
Ready to roll?

azev77 · 2020-07-09T03:09:19Z

@johnczito please let me know if there is anything else I can do

johnczito · 2020-07-09T03:48:58Z

Looks good to me. I'm not sure what the path forward is on the cdf. Such as it is, this would be one of only two or three univariates that don't provide it, which means you can't really compute probabilities. I'm not sure how central folks think that functionality is.

azev77 · 2020-07-09T04:29:07Z

@johnczito thanks for approving!

there are several other properties I would like to add to SkewNormal in the future & I will think more about the cdf & possibly contributing to Owen's T
in mean time SkewDist.jl does
cdf(dist::SkewNormal, x::Real)=quadgk(t->pdf(dist,t), -Inf, x)[1]
quantile(dist::SkewNormal, β::Float64)=newton(x->cdf(dist,x) - β, dist.ξ)
I'm new to PRs & Github, does this mean the current PR is approved & will be integrated in the package?

johnczito · 2020-07-09T18:53:11Z

@mschauer thoughts on merging this without a cdf method?

azev77 · 2020-07-16T04:53:11Z

@mschauer do you think we can merge this PR?

src/univariate/continuous/skewnormal.jl

mschauer · 2020-07-17T16:58:56Z

src/univariate/continuous/skewnormal.jl

+var(d::SkewNormal) = abs2(d.ω)*(1-mean_z(d)^2)
+std(d::SkewNormal) = √var(d)
+skewness(d::SkewNormal) = (4-π)/2 * mean_z(d)^3 / (1-mean_z(d)^2)^(3/2)
+kurtosis(d::SkewNormal) = 2*(π-3)*( (delta(d)* sqrt(2/π) )^4 / (1-2*(delta(d)^2)/π)^2) 


Random whitespace

mschauer · 2020-07-17T17:03:30Z

test/skewnormal.jl

+    @test pdf(d3, 3.3) == Distributions.pdf(d4, 3.3)
+    @test pdf.(d3, 1:3) == Distributions.pdf.(d4, 1:3)
+    a = mean(d3), var(d3), std(d3)
+    #b = mean(d4), var(d4), std(d4)


Suggested change

#b = mean(d4), var(d4), std(d4)

mschauer · 2020-07-17T17:03:57Z

test/skewnormal.jl

+    a = mean(d3), var(d3), std(d3)
+    #b = mean(d4), var(d4), std(d4)
+    b = Distributions.mean(d4), Distributions.var(d4), Distributions.std(d4)
+    @test  a == b


Suggested change

@test a == b

@test a == b

azev77 · 2020-07-17T21:14:30Z

@mschauer
Thank you. Please have a look at the updated code.

azev77 · 2020-08-05T04:33:55Z

@mschauer are we ready to roll?

matbesancon · 2020-08-05T06:09:43Z

seems travis is still failing

azev77 mentioned this pull request Apr 24, 2020

Porting to Distributions.jl? STOR-i/SkewDist.jl#3

Open

Update skewnormal.jl

3f40d9c

matbesancon reviewed May 23, 2020

View reviewed changes

Update skewnormal.jl

3678a83

azev77 added 4 commits June 28, 2020 23:46

Update Distributions.jl

d0022ba

Update univariates.jl

0145677

Update runtests.jl

dfcaf7b

Create skewnormal.jl

333459f

azev77 mentioned this pull request Jun 29, 2020

Missing distributions list #124

Open

68 tasks

Update skewnormal.jl

56e4d4c

maximerischard mentioned this pull request Jun 29, 2020

Feature request: Owen's t function JuliaStats/StatsFuns.jl#99

Open

Update skewnormal.jl

40a4d67

matbesancon approved these changes Jul 3, 2020

View reviewed changes

Update skewnormal.jl

e6c0c94

johnczito approved these changes Jul 9, 2020

View reviewed changes

mschauer reviewed Jul 17, 2020

View reviewed changes

src/univariate/continuous/skewnormal.jl Outdated Show resolved Hide resolved

mschauer reviewed Jul 17, 2020

View reviewed changes

azev77 added 3 commits July 17, 2020 16:59

Update skewnormal.jl

e08e7f6

Update skewnormal.jl

81cad7c

Update skewnormal.jl

8b9a254

mschauer approved these changes Aug 8, 2020

View reviewed changes

mschauer merged commit 3f4856c into JuliaStats:master Aug 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add SkewNormal distribution #1104

Add SkewNormal distribution #1104

azev77 commented Apr 23, 2020

codecov-io commented May 9, 2020 •

edited

Loading

johnczito commented May 10, 2020

azev77 commented May 10, 2020

maximerischard commented May 10, 2020

matbesancon May 23, 2020

matbesancon commented May 23, 2020

codecov-commenter commented Jun 29, 2020 •

edited

Loading

azev77 commented Jun 29, 2020

maximerischard commented Jun 29, 2020

johnczito commented Jun 29, 2020 •

edited

Loading

azev77 commented Jul 3, 2020

johnczito commented Jul 3, 2020 •

edited

Loading

azev77 commented Jul 3, 2020 •

edited

Loading

johnczito commented Jul 3, 2020

azev77 commented Jul 3, 2020

azev77 commented Jul 9, 2020

johnczito commented Jul 9, 2020

azev77 commented Jul 9, 2020

johnczito commented Jul 9, 2020

azev77 commented Jul 16, 2020

mschauer Jul 17, 2020

mschauer Jul 17, 2020

mschauer Jul 17, 2020

azev77 commented Jul 17, 2020

azev77 commented Aug 5, 2020

matbesancon commented Aug 5, 2020

Add SkewNormal distribution #1104

Add SkewNormal distribution #1104

Conversation

azev77 commented Apr 23, 2020

codecov-io commented May 9, 2020 • edited Loading

Codecov Report

johnczito commented May 10, 2020

azev77 commented May 10, 2020

maximerischard commented May 10, 2020

matbesancon May 23, 2020

Choose a reason for hiding this comment

matbesancon commented May 23, 2020

codecov-commenter commented Jun 29, 2020 • edited Loading

Codecov Report

azev77 commented Jun 29, 2020

maximerischard commented Jun 29, 2020

johnczito commented Jun 29, 2020 • edited Loading

azev77 commented Jul 3, 2020

johnczito commented Jul 3, 2020 • edited Loading

azev77 commented Jul 3, 2020 • edited Loading

johnczito commented Jul 3, 2020

azev77 commented Jul 3, 2020

azev77 commented Jul 9, 2020

johnczito commented Jul 9, 2020

azev77 commented Jul 9, 2020

johnczito commented Jul 9, 2020

azev77 commented Jul 16, 2020

mschauer Jul 17, 2020

Choose a reason for hiding this comment

mschauer Jul 17, 2020

Choose a reason for hiding this comment

mschauer Jul 17, 2020

Choose a reason for hiding this comment

azev77 commented Jul 17, 2020

azev77 commented Aug 5, 2020

matbesancon commented Aug 5, 2020

codecov-io commented May 9, 2020 •

edited

Loading

codecov-commenter commented Jun 29, 2020 •

edited

Loading

johnczito commented Jun 29, 2020 •

edited

Loading

johnczito commented Jul 3, 2020 •

edited

Loading

azev77 commented Jul 3, 2020 •

edited

Loading