Fix total mean calculation in ANOVA #273
base: master
Codecov Report
@@           Coverage Diff           @@
##           master     #273   +/-   ##
=======================================
  Coverage   93.65%   93.65%
=======================================
  Files          28       28
  Lines        1717     1717
=======================================
  Hits         1608     1608
  Misses        109      109
@@ -60,7 +60,7 @@ end
 function anova(scores::AbstractVector{<:Real}...)
     Nᵢ = [length(g) for g in scores]
     Z̄ᵢ = mean.(scores)
-    Z̄ = mean(Z̄ᵢ)
+    Z̄ = sum(Iterators.flatten(scores))/sum(Nᵢ)
Alternatively, one could use
-    Z̄ = sum(Iterators.flatten(scores))/sum(Nᵢ)
+    Z̄ = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)
In a quick benchmark this seemed to be similarly fast, and usually even marginally faster.
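To make the difference between the three expressions concrete, here is a minimal sketch on made-up data (the two groups below are hypothetical and chosen only because their sizes differ). It compares the original `mean(Z̄ᵢ)`, the flattened sum from this PR, and the `dot`-based alternative; the group means are collected into a vector here purely for clarity.

```julia
using Statistics, LinearAlgebra

# Hypothetical groups of unequal size, for illustration only.
groups = ([1.0, 2.0, 3.0], [10.0, 20.0])

Nᵢ = [length(g) for g in groups]   # group sizes: [3, 2]
Z̄ᵢ = collect(mean.(groups))        # group means: [2.0, 15.0]

# Original code: unweighted mean of the group means.
unweighted = mean(Z̄ᵢ)

# This PR: mean over all observations.
flattened = sum(Iterators.flatten(groups)) / sum(Nᵢ)

# Suggested alternative: group means weighted by group size.
weighted = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)

println("unweighted = ", unweighted)   # 8.5
println("flattened  = ", flattened)    # 7.2
println("weighted   = ", weighted)     # 7.2
```

The unweighted version only matches the other two when all groups have the same length, which is presumably why the bug went unnoticed for balanced designs.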
You'd probably need a very large dataset to see the difference.
Actually a tiny dataset is sufficient. Of course, the difference is very small but it seemed to be consistent:
julia> using Statistics, LinearAlgebra, BenchmarkTools

julia> function f(scores::AbstractVector{<:Real}...)
           Nᵢ = [length(g) for g in scores]
           Z̄ᵢ = mean.(scores)
           Z̄ = sum(Iterators.flatten(scores)) / sum(Nᵢ)
           return Nᵢ, Z̄ᵢ, Z̄
       end
f (generic function with 1 method)

julia> function g(scores::AbstractVector{<:Real}...)
           Nᵢ = [length(g) for g in scores]
           Z̄ᵢ = mean.(scores)
           Z̄ = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)
           return Nᵢ, Z̄ᵢ, Z̄
       end
g (generic function with 1 method)

julia> scores = map(n -> rand(n), (3, 9, 12));

julia> @btime f($(scores...));
  33.893 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.687 ns (1 allocation: 64 bytes)

julia> scores = map(n -> rand(n), (3, 9, 12, 134));

julia> @btime f($(scores...));
  33.944 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.681 ns (1 allocation: 64 bytes)

julia> scores = map(n -> rand(n), (3, 9, 12, 134, 12, 4134, 1231, 122, 12, 1, 23, 58));

julia> @btime f($(scores...));
  34.070 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.781 ns (1 allocation: 64 bytes)
It looks like something other than the evaluation of Z̄ dominates the function's runtime. But we are talking about nanoseconds 😏.
Could you also add a test that failed before the PR?
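One possible shape for such a test, as a rough sketch only: the helpers `grand_mean_old` and `grand_mean_new` are hypothetical names that isolate the single changed line, not functions from the package, and the data is made up. The actual test would presumably go through the package's own tests that call `anova`, with groups of unequal size.

```julia
using Statistics, Test

# Hypothetical helpers mirroring the line before and after this PR.
grand_mean_old(scores...) = mean(mean.(scores))
grand_mean_new(scores...) = sum(Iterators.flatten(scores)) / sum(length, scores)

@testset "grand mean with unequal group sizes" begin
    a, b = [1.0, 2.0, 3.0], [10.0, 20.0]
    expected = mean(vcat(a, b))   # mean over all observations

    # Passes with the fix, fails with the old expression.
    @test grand_mean_new(a, b) ≈ expected
    @test !(grand_mean_old(a, b) ≈ expected)

    # With equal group sizes both expressions agree, so balanced
    # data alone would not have caught the bug.
    c = [4.0, 5.0, 6.0]
    @test grand_mean_old(a, c) ≈ grand_mean_new(a, c)
end
```

The key design point is that at least one test case needs unbalanced groups; any test built only on equal-sized groups passes both before and after the change.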
Fix for #242