Subtypes of HypothesisTest can have optional fields :tail and :alpha #100

Open
wants to merge 19 commits into
base: master
Changes from 18 commits
ef60cb7
Subtypes of HypothesisTest have an optional tail field
mirkobunse Jun 2, 2017
5f82c9c
Subtypes of HypothesisTest have an optional alpha field
mirkobunse Jun 2, 2017
14f15d1
Indent all labels of details in Base.show by the same length
mirkobunse Jun 2, 2017
c3363a1
get_tail and get_alpha reuse the same function getting a field value …
mirkobunse Jun 2, 2017
596444f
Fixed test run by providing optional tail parameter to all pvalue fun…
mirkobunse Jun 2, 2017
15282b5
Confidence interval in Base.show respects alpha parameter
mirkobunse Jun 6, 2017
8007b1e
Optional tail parameter of pvalue is initialized with hard-coded defa…
mirkobunse Jun 14, 2017
e63c378
Pretty-printing with fixed detail line length
mirkobunse Jun 14, 2017
c882e25
Fallback-function approach to tail and alpha.
mirkobunse Jun 14, 2017
c1ac852
Redirect deprecated default_tail to tail
mirkobunse Jun 16, 2017
36a9e76
Fixed Base.show for the case of tail being no keyword argument of the…
mirkobunse Jun 16, 2017
e6bc7a6
Added comments describing how to test deprecation forwarding.
mirkobunse Jun 16, 2017
bda80a7
Define IO Buffer once to ease temporary redirection to STDOUT
mirkobunse Jun 16, 2017
480e152
Added unit tests for tail parameter
mirkobunse Jun 16, 2017
96eb8cf
Updated doc of t-tests
mirkobunse Jun 16, 2017
7f66aa8
Merge remote-tracking branch 'origin/master' into tail_alpha
mirkobunse Jun 17, 2017
86626ec
Uncommented julia:nightly because not even specified by REQUIRE file
mirkobunse Jun 17, 2017
1938ffb
Removed named tail parameter from the remaining pvalue functions
mirkobunse Jun 18, 2017
5d38a50
Realised requested changes. Only missing change: tail and alpha as fi…
mirkobunse Jun 30, 2017
2 changes: 1 addition & 1 deletion .travis.yml
@@ -5,7 +5,7 @@ os:
- osx
julia:
- 0.5
- nightly
# - nightly # The REQUIRE file states that only julia 0.5 is supported
Member commented:
Please leave this as it was, it's not an issue if the CI fails for this version (we're not robots).

notifications:
email: false
# uncomment the following lines to override the default test script
7 changes: 4 additions & 3 deletions doc/api/confint.rst
@@ -3,11 +3,12 @@
Confidence Interval
==============================================

.. function:: confint(test::HypothesisTest, alpha=0.05; tail=:both)
.. function:: confint(test::HypothesisTest, alpha=alpha(test); tail=tail(test))
Member commented:
alpha and tail should be exported if they are shown here.


Compute a confidence interval C with coverage 1-``alpha``.
Compute a confidence interval C with coverage 1-``alpha``
(``alpha=0.05`` is the default for all tests).
Member commented:
0.05 is the default for all tests unless overridden when creating the object.


If ``tail`` is ``:both`` (default), then a two-sided confidence
If ``tail`` is ``:both`` (default for most tests), then a two-sided confidence
Member commented:
"most" isn't precise enough: give a list of tests for which it is the case below. Same remark for pvalue.

interval is returned. If ``tail`` is ``:left`` or
``:right``, then a one-sided confidence interval is returned

4 changes: 2 additions & 2 deletions doc/api/pvalue.rst
@@ -3,11 +3,11 @@
p-value
==============================================

.. function:: pvalue(test::HypothesisTest; tail=:both)
.. function:: pvalue(test::HypothesisTest; tail=tail(test))

Compute the p-value for a given significance test.

If ``tail`` is ``:both`` (default), then the p-value for the
Member commented:
Would be nice to mention that the default tail depends on the specific test, and give a list of tests with their default.

If ``tail`` is ``:both`` (default for most hypothesis tests), then the p-value for the
two-sided test is returned. If ``tail`` is ``:left`` or
``:right``, then a one-sided test is performed.
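
The review asks for a list of tests with their default tail. As a rough aid, the sketch below collects the `tail(...)` definitions that are visible in the source files of this diff and groups them by value. It is written in Julia 1.x syntax; `DEFAULT_TAILS` and `group_by_tail` are hypothetical names, and the list is only as complete as the part of the diff shown on this page (the t-tests, which take `tail` as a constructor keyword, are not included).

```julia
# Default tails as they appear in the visible part of this diff.
# The names are plain strings, so this runs without the package.
const DEFAULT_TAILS = [
    "OneSampleADTest" => :right,
    "KSampleADTest" => :right,
    "ADFTest" => :left,
    "BinomialTest" => :both,
    "SignTest" => :both,
    "BoxPierceTest" => :right,
    "LjungBoxTest" => :right,
    "BreuschGodfreyTest" => :right,
    "RayleighTest" => :both,
    "FisherTLinearAssociation" => :both,
    "JammalamadakaCircularCorrelation" => :both,
    "FisherExactTest" => :both,
    "JarqueBeraTest" => :right,
    "KSTest" => :both,
    "KruskalWallisTest" => :right,
    "ExactMannWhitneyUTest" => :both,
    "ApproximateMannWhitneyUTest" => :both,
    "PowerDivergenceTest" => :right,
]

# Group test names by their default tail (hypothetical helper).
function group_by_tail(pairs)
    groups = Dict{Symbol,Vector{String}}()
    for (name, t) in pairs
        push!(get!(groups, t, String[]), name)
    end
    return groups
end

# Print one line per tail value, e.g. for pasting into the docs.
for (t, names) in group_by_tail(DEFAULT_TAILS)
    println(t, ": ", join(names, ", "))
end
```

A generated list like this could feed the documentation of both `pvalue` and `confint`, so that "default for most tests" is replaced by an explicit enumeration.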

27 changes: 21 additions & 6 deletions doc/parametric/test_t.rst
@@ -1,45 +1,57 @@
T-test
=============================================

.. function:: OneSampleTTest(v::AbstractVector{T<:Real}, mu0::Real=0)
.. function:: OneSampleTTest(v::AbstractVector{T<:Real}, mu0::Real=0; tail::Symbol=:both, alpha::Real=0.05)

Perform a one sample t-test of the null hypothesis that the data
in vector ``v`` comes from a distribution with mean ``mu0`` against
the alternative hypothesis that the distribution does not have mean
``mu0``.

``tail`` and ``alpha`` specify the defaults for the application of
Member commented:

"for the application of" is unusual wording. Maybe "when calling"?

:ref:`pvalue<pvalue>` and :ref:`confint<confint>`.

Implements: :ref:`pvalue<pvalue>`, :ref:`confint<confint>`

.. function:: OneSampleTTest(xbar::Real, stdev::Real, n::Int, mu0::Real=0)
.. function:: OneSampleTTest(xbar::Real, stdev::Real, n::Int, mu0::Real=0; tail::Symbol=:both, alpha::Real=0.05)

Perform a one sample t-test of the null hypothesis that ``n``
values with mean ``xbar`` and sample standard deviation
``stdev`` come from a distribution with ``mu0`` against
the alternative hypothesis that the distribution does not have mean
``mu0``.

``tail`` and ``alpha`` specify the defaults for the application of
:ref:`pvalue<pvalue>` and :ref:`confint<confint>`.

Implements: :ref:`pvalue<pvalue>`, :ref:`confint<confint>`

.. function:: OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, mu0::Real=0)
.. function:: OneSampleTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}, mu0::Real=0; tail::Symbol=:both, alpha::Real=0.05)

Perform a paired sample t-test of the null hypothesis that
the differences between pairs of values in vectors ``x`` and
``y`` come from a distribution with ``mu0`` against the
alternative hypothesis that the distribution does not have mean
``mu0``.

``tail`` and ``alpha`` specify the defaults for the application of
:ref:`pvalue<pvalue>` and :ref:`confint<confint>`.

Implements: :ref:`pvalue<pvalue>`, :ref:`confint<confint>`

.. function:: EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
.. function:: EqualVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}; tail::Symbol=:both, alpha::Real=0.05)

Perform a two-sample t-test of the null hypothesis that
``x`` and ``y`` come from a distributions with the same mean
and equal variances against the alternative hypothesis that the
distributions have different means and but equal variances.

``tail`` and ``alpha`` specify the defaults for the application of
:ref:`pvalue<pvalue>` and :ref:`confint<confint>`.

Implements: :ref:`pvalue<pvalue>`, :ref:`confint<confint>`

.. function:: UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real})
.. function:: UnequalVarianceTTest(x::AbstractVector{T<:Real}, y::AbstractVector{T<:Real}; tail::Symbol=:both, alpha::Real=0.05)

Perform an unequal variance two-sample t-test of the null
hypothesis that ``x`` and ``y`` come from a distributions with
@@ -54,5 +66,8 @@ T-test
.. math::
\nu_{\chi'} \approx \frac{\left(\sum_{i=1}^n k_i s_i^2\right)^2}
{\sum_{i=1}^n \frac{(k_i s_i^2)^2}{\nu_i}}


Member commented:
Remove trailing spaces here and elsewhere.

``tail`` and ``alpha`` specify the defaults for the application of
:ref:`pvalue<pvalue>` and :ref:`confint<confint>`.

Implements: :ref:`pvalue<pvalue>`, :ref:`confint<confint>`
38 changes: 21 additions & 17 deletions src/HypothesisTests.jl
@@ -72,32 +72,35 @@ end
function Base.show{T<:HypothesisTest}(io::IO, test::T)
println(io, testname(test))
println(io, repeat("-", length(testname(test))))

# utilities for pretty-printing
conf_string = string(floor((1 - alpha(test)) * 100, 6)) # limit to 6 decimals in %
prettify_detail(label::String, value::Any, len::Int) = # len is max length of label
Member commented:
I'd say "format" rather than "prettify".

" " * label * " "^max(len - length(label), 0) * string(value)

# population details
has_ci = applicable(StatsBase.confint, test)
(param_name, param_under_h0, param_estimate) = population_param_of_interest(test)
println(io, "Population details:")
println(io, " parameter of interest: $param_name")
println(io, " value under h_0: $param_under_h0")
println(io, " point estimate: $param_estimate")
println(io, prettify_detail("parameter of interest:", param_name, 32))
println(io, prettify_detail("value under h_0:", param_under_h0, 32))
println(io, prettify_detail("point estimate:", param_estimate, 32))
if has_ci
println(io, " 95% confidence interval: $(StatsBase.confint(test))")
println(io, prettify_detail(conf_string*"% confidence interval:", StatsBase.confint(test), 32))
Member commented:

No need for the `StatsBase.` prefix, right?

end
println(io)

# test summary
p = pvalue(test)
outcome = if p > 0.05 "fail to reject" else "reject" end
tail = default_tail(test)
tail = HypothesisTests.tail(test)
Member commented:

The `HypothesisTests.` prefix isn't needed. Same below in the comment.

p = pvalue(test) # obeys value of HypothesisTests.tail(test) if applicable
outcome = if p > alpha(test) "fail to reject" else "reject" end
Member commented:

It's more standard to use `p > alpha(test) ? ... : ...`. Below, since the conditions are relatively long, it would be better to repeat `tailvalue =` inside each branch (i.e. closer to what it was before).

tailvalue =
if tail == :both "two-sided p-value:"
elseif tail == :left || tail == :right "one-sided p-value ($(string(tail)) tail):"
else "p-value:" end
println(io, "Test summary:")
println(io, " outcome with 95% confidence: $outcome h_0")
if tail == :both
println(io, " two-sided p-value: $p")
elseif tail == :left || tail == :right
println(io, " one-sided p-value: $p")
else
println(io, " p-value: $p")
end
println(io, prettify_detail("outcome with "*conf_string*"% confidence:", outcome*" h_0", 36))
println(io, prettify_detail(tailvalue, p, 36))
println(io)

# further details
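
The formatting suggestions from the review comments in this hunk can be tried out in isolation. The following is a minimal sketch in Julia 1.x syntax, not the code of the pull request: `format_detail`, `outcome_text` and `pvalue_label` are hypothetical names, and the four-space indent is an assumption, since the scraped diff does not preserve the original whitespace.

```julia
# Pad `label` with spaces up to `len` characters, then append the value.
# Assumption: detail lines are indented by four spaces.
format_detail(label::AbstractString, value, len::Int) =
    " "^4 * label * " "^max(len - length(label), 0) * string(value)

# Ternary form, as suggested in the review, instead of an inline if/else.
outcome_text(p::Real, level::Real) = p > level ? "fail to reject" : "reject"

# Label for the p-value line; each branch returns its own string.
function pvalue_label(side::Symbol)
    if side == :both
        return "two-sided p-value:"
    elseif side == :left || side == :right
        return "one-sided p-value ($(side) tail):"
    else
        return "p-value:"
    end
end

# Example: two aligned detail lines with a label width of 36.
println(format_detail("outcome with 95.0% confidence:", outcome_text(0.2, 0.05) * " h_0", 36))
println(format_detail(pvalue_label(:left), 0.2, 36))
```

Keeping the label logic in a small function makes the three tail cases easy to unit-test without capturing the output of `show`.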
@@ -108,8 +111,9 @@ end
# parameter of interest: name, value under h0, point estimate
population_param_of_interest{T<:HypothesisTest}(test::T) = ("not implemented yet", NaN, NaN)

# is the test one- or two-sided
default_tail(test::HypothesisTest) = :undefined
# is the test one- or two-sided?
tail(test::HypothesisTest) = :undefined # overloaded for defaults or field access
alpha(test::HypothesisTest) = 0.05

function show_params{T<:HypothesisTest}(io::IO, test::T, ident="")
fieldidx = find(Bool[t<:Number for t in T.types])
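
The fallback-function approach introduced in this file can be sketched on its own. The example below uses Julia 1.x syntax (the diff itself targets Julia 0.5) and hypothetical names throughout: `DemoTest`, `RightTailedDemo`, `ConfigurableDemo` and `describe` are stand-ins for illustration, not types or functions of the package.

```julia
# Hypothetical stand-in for the package's abstract test type.
abstract type DemoTest end

# Generic fallbacks, mirroring the two definitions added in the diff.
tail(test::DemoTest) = :undefined
alpha(test::DemoTest) = 0.05

# A test whose tail is fixed by its statistic overrides the fallback.
struct RightTailedDemo <: DemoTest end
tail(test::RightTailedDemo) = :right

# A test that stores user-chosen defaults forwards to its fields.
struct ConfigurableDemo <: DemoTest
    tail::Symbol
    alpha::Float64
end
ConfigurableDemo(; tail::Symbol=:both, alpha::Real=0.05) =
    ConfigurableDemo(tail, alpha)
tail(test::ConfigurableDemo) = test.tail
alpha(test::ConfigurableDemo) = test.alpha

# Callers look the defaults up on the object, in the spirit of the
# documented confint(test, alpha=alpha(test); tail=tail(test)).
describe(test::DemoTest, level::Real=alpha(test); side::Symbol=tail(test)) =
    "alpha=$(level), tail=$(side)"

println(describe(RightTailedDemo()))
println(describe(ConfigurableDemo(tail=:left, alpha=0.01)))
println(describe(ConfigurableDemo(), 0.1; side=:right))
```

With this layout a test type only has to define `tail` or `alpha` when it deviates from the generic fallback, and code such as `show` never needs to know whether the value comes from a field or from a constant method.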
4 changes: 2 additions & 2 deletions src/anderson_darling.jl
@@ -35,7 +35,7 @@ function OneSampleADTest{T<:Real}(x::AbstractVector{T}, d::UnivariateDistributio
end

testname(::OneSampleADTest) = "One sample Anderson-Darling test"
default_tail(test::OneSampleADTest) = :right
tail(test::OneSampleADTest) = :right

function show_params(io::IO, x::OneSampleADTest, ident="")
println(io, ident, "number of observations: $(x.n)")
@@ -78,7 +78,7 @@ function KSampleADTest{T<:Real}(xs::AbstractVector{T}...; modified=true)
end

testname(::KSampleADTest) = "k-sample Anderson-Darling test"
default_tail(test::KSampleADTest) = :right
tail(test::KSampleADTest) = :right

function show_params(io::IO, x::KSampleADTest, ident="")
println(io, ident, "number of samples: $(x.k)")
1 change: 1 addition & 0 deletions src/augmented_dickey_fuller.jl
@@ -207,6 +207,7 @@ end
testname(::ADFTest) = "Augmented Dickey-Fuller unit root test"
population_param_of_interest(x::ADFTest) =
("coefficient on lagged non-differenced variable", 0, x.coef)
tail(test::ADFTest) = :left

function show_params(io::IO, x::ADFTest, ident)
println(io, ident, "sample size in regression: ", x.n)
4 changes: 2 additions & 2 deletions src/binomial.jl
@@ -46,7 +46,7 @@ Returns the string value. E.g. "Binomial test", "Sign Test"
"""
testname(::BinomialTest) = "Binomial test"
population_param_of_interest(x::BinomialTest) = ("Probability of success", x.p, x.x/x.n) # parameter of interest: name, value under h0, point estimate
default_tail(test::BinomialTest) = :both
tail(test::BinomialTest) = :both

function show_params(io::IO, x::BinomialTest, ident="")
println(io, ident, "number of observations: $(x.n)")
@@ -157,7 +157,7 @@ SignTest{T<:Real, S<:Real}(x::AbstractVector{T}, y::AbstractVector{S}) = SignTes

testname(::SignTest) = "Sign Test"
population_param_of_interest(x::SignTest) = ("Median", x.median, median(x.data)) # parameter of interest: name, value under h0, point estimate
default_tail(test::SignTest) = :both
tail(test::SignTest) = :both

function show_params(io::IO, x::SignTest, ident="")
text1 = "number of observations:"
4 changes: 2 additions & 2 deletions src/box_test.jl
@@ -62,7 +62,7 @@ end
testname(::BoxPierceTest) = "Box-Pierce autocorrelation test"
population_param_of_interest(x::BoxPierceTest) = ("autocorrelations up to lag k",
"all zero", NaN)
default_tail(test::BoxPierceTest) = :right
tail(test::BoxPierceTest) = :right

function show_params(io::IO, x::BoxPierceTest, ident)
println(io, ident, "number of observations: ", x.n)
@@ -111,7 +111,7 @@ end
testname(::LjungBoxTest) = "Ljung-Box autocorrelation test"
population_param_of_interest(x::LjungBoxTest) = ("autocorrelations up to lag k",
"all zero", NaN)
default_tail(test::LjungBoxTest) = :right
tail(test::LjungBoxTest) = :right

function show_params(io::IO, x::LjungBoxTest, ident)
println(io, ident, "number of observations: ", x.n)
2 changes: 1 addition & 1 deletion src/breusch_godfrey.jl
@@ -68,7 +68,7 @@ end
testname(::BreuschGodfreyTest) = "Breusch-Godfrey autocorrelation test"
population_param_of_interest(x::BreuschGodfreyTest) =
("coefficients on lagged residuals up to lag p", "all zero", NaN)
default_tail(test::BreuschGodfreyTest) = :right
tail(test::BreuschGodfreyTest) = :right

function show_params(io::IO, x::BreuschGodfreyTest, ident)
println(io, ident, "number of observations: ", x.n)
6 changes: 3 additions & 3 deletions src/circular.jl
@@ -53,7 +53,7 @@ end

testname(::RayleighTest) = "Rayleigh test"
population_param_of_interest(x::RayleighTest) = ("Mean resultant length", 0, x.Rbar) # parameter of interest: name, value under h0, point estimate
default_tail(test::RayleighTest) = :both
tail(test::RayleighTest) = :both

function show_params(io::IO, x::RayleighTest, ident="")
println(io, ident, "number of observations: $(x.n)")
@@ -99,7 +99,7 @@ FisherTLinearAssociation{S <: Real, T <: Real}(theta::Vector{S},
testname(::FisherTLinearAssociation) =
"T-linear test of circular-circular association"
population_param_of_interest(x::FisherTLinearAssociation) = ("Circular correlation coefficient", 0, x.rho_t) # parameter of interest: name, value under h0, point estimate
default_tail(test::FisherTLinearAssociation) = :both
tail(test::FisherTLinearAssociation) = :both

function show_params(io::IO, x::FisherTLinearAssociation, ident="")
println(io, ident, "number of observations: [$(length(x.theta)),$(length(x.phi))]")
@@ -212,7 +212,7 @@ end

testname(::JammalamadakaCircularCorrelation) = "Jammalamadaka circular correlation"
population_param_of_interest(x::JammalamadakaCircularCorrelation) = ("Circular-circular correlation coefficient", 0, x.r) # parameter of interest: name, value under h0, point estimate
default_tail(test::JammalamadakaCircularCorrelation) = :both
tail(test::JammalamadakaCircularCorrelation) = :both

function show_params(io::IO, x::JammalamadakaCircularCorrelation, ident="")
println(io, ident, "test statistic: $(x.Z)")
1 change: 1 addition & 0 deletions src/deprecated.jl
@@ -1,3 +1,4 @@
using Base: @deprecate

@deprecate ci(args...) confint(args...)
@deprecate default_tail(test::HypothesisTest) tail(test)
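
How the deprecation forwarding behaves can be checked with a small stand-alone sketch. It uses Julia 1.x syntax and hypothetical types (`DemoTest`, `RightTailedDemo`) instead of the package's own. The trailing `false` tells `@deprecate` not to export the old name, which differs from the line in the diff and is chosen here only to keep the sketch free of exports.

```julia
# Hypothetical stand-ins for the package's types.
abstract type DemoTest end
struct RightTailedDemo <: DemoTest end

# New accessor with a generic fallback and one override.
tail(test::DemoTest) = :undefined
tail(test::RightTailedDemo) = :right

# The old name forwards to the new one. A deprecation warning is only
# printed when Julia is started with --depwarn=yes.
Base.@deprecate default_tail(test::DemoTest) tail(test) false

# Both names return the same value.
println(default_tail(RightTailedDemo()) == tail(RightTailedDemo()))
```

Because the deprecated method dispatches on the abstract type, every subtype keeps working under the old name without a per-type definition.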
2 changes: 1 addition & 1 deletion src/fisher.jl
@@ -45,7 +45,7 @@ end

testname(::FisherExactTest) = "Fisher's exact test"
population_param_of_interest(x::FisherExactTest) = ("Odds ratio", 1.0, x.ω) # parameter of interest: name, value under h0, point estimate
default_tail(test::FisherExactTest) = :both
tail(test::FisherExactTest) = :both

# The sizing argument to print_matrix was removed during the 0.5 dev period
if VERSION < v"0.5.0-dev+1936"
2 changes: 1 addition & 1 deletion src/jarque_bera.jl
@@ -77,7 +77,7 @@ end
testname(::JarqueBeraTest) = "Jarque-Bera normality test"
population_param_of_interest(x::JarqueBeraTest) =
("skewness and kurtosis", "0 and 3", "$(x.skew) and $(x.kurt)")
default_tail(test::JarqueBeraTest) = :right
tail(test::JarqueBeraTest) = :right

function show_params(io::IO, x::JarqueBeraTest, ident)
println(io, ident, "number of observations: ", x.n)
2 changes: 1 addition & 1 deletion src/kolmogorov_smirnov.jl
@@ -31,7 +31,7 @@ export
@compat abstract type ExactKSTest <: KSTest end

population_param_of_interest(x::KSTest) = ("Supremum of CDF differences", 0.0, x.δ) # parameter of interest: name, value under h0, point estimate
default_tail(test::KSTest) = :both
tail(test::KSTest) = :both

## ONE SAMPLE KS-TEST

2 changes: 1 addition & 1 deletion src/kruskal_wallis.jl
@@ -43,7 +43,7 @@ end

testname(::KruskalWallisTest) = "Kruskal-Wallis rank sum test (chi-square approximation)"
population_param_of_interest(x::KruskalWallisTest) = ("Location parameters", "all equal", NaN) # parameter of interest: name, value under h0, point estimate
default_tail(test::KruskalWallisTest) = :right
tail(test::KruskalWallisTest) = :right

function show_params(io::IO, x::KruskalWallisTest, ident)
println(io, ident, "number of observation in each group: ", x.n_i)
4 changes: 2 additions & 2 deletions src/mann_whitney.jl
@@ -66,7 +66,7 @@ ExactMannWhitneyUTest{S<:Real,T<:Real}(x::AbstractVector{S}, y::AbstractVector{T

testname(::ExactMannWhitneyUTest) = "Exact Mann-Whitney U test"
population_param_of_interest(x::ExactMannWhitneyUTest) = ("Location parameter (pseudomedian)", 0, x.median) # parameter of interest: name, value under h0, point estimate
default_tail(test::ExactMannWhitneyUTest) = :both
tail(test::ExactMannWhitneyUTest) = :both

function show_params(io::IO, x::ExactMannWhitneyUTest, ident)
println(io, ident, "number of observations in each group: ", [x.nx, x.ny])
@@ -153,7 +153,7 @@ ApproximateMannWhitneyUTest{S<:Real,T<:Real}(x::AbstractVector{S}, y::AbstractVe

testname(::ApproximateMannWhitneyUTest) = "Approximate Mann-Whitney U test"
population_param_of_interest(x::ApproximateMannWhitneyUTest) = ("Location parameter (pseudomedian)", 0, x.median) # parameter of interest: name, value under h0, point estimate
default_tail(test::ApproximateMannWhitneyUTest) = :both
tail(test::ApproximateMannWhitneyUTest) = :both

function show_params(io::IO, x::ApproximateMannWhitneyUTest, ident)
println(io, ident, "number of observations in each group: ", [x.nx, x.ny])
2 changes: 1 addition & 1 deletion src/power_divergence.jl
@@ -40,7 +40,7 @@ end

# parameter of interest: name, value under h0, point estimate
population_param_of_interest(x::PowerDivergenceTest) = ("Multinomial Probabilities", x.theta0, x.thetahat)
default_tail(test::PowerDivergenceTest) = :right
tail(test::PowerDivergenceTest) = :right

pvalue(x::PowerDivergenceTest; tail=:right) = pvalue(Chisq(x.df),x.stat; tail=tail)
