Clarify whether chisq_test is supposed to give same results as base::chisq.test() #515
Thank you for the issue and reprex @sda030! What you're seeing is actually an inconsistency of
library(infer)
data <- structure(list(y = structure(c(2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 1L, 2L, NA, 1L, NA, 2L, 2L, 3L, 3L, 2L, NA, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, NA, 2L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, NA, 2L, 2L, 2L, NA, 2L, 2L, NA, 2L, NA,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L,
2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, NA, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
NA, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L,
2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, NA, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, NA, NA, 2L, 2L, 2L, 3L, 2L, 1L,
2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, NA, 2L, 2L, 2L, 1L, NA, 1L, NA,
NA, NA, 2L, 2L, NA, 2L, NA, NA, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L,
2L, 2L, 2L, NA, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L,
NA, 1L, 2L, 1L, NA, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, NA,
2L, 2L, 2L, 2L, 2L, 2L, NA, NA, 2L, NA, NA, 2L, 2L, 2L, 2L, NA,
NA, 2L, NA, NA, 2L, 2L, 2L, 1L, 2L, NA, 2L, 2L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
NA, 2L, 2L, NA, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L,
2L, 2L, NA, 2L, 2L, 2L, 2L, NA, 1L, NA, 2L, 2L, 2L, NA, 2L, NA,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, NA, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, NA, NA, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, NA, 2L, NA, NA, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L,
2L, 1L, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 3L, 2L, 2L, 2L, NA,
2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L,
2L, NA, 2L, 2L, 1L, NA, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 3L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, NA, 2L, 2L, NA, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, NA,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, NA, 2L, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, NA,
NA, NA, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, NA, 2L,
NA, 2L, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, NA, 2L, 2L, 3L, 2L, 2L, NA, 2L, 2L, NA, 1L, NA,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L, NA, 1L,
2L, 2L, 2L, 2L, NA, 2L, 2L, 2L, NA, NA, 2L, 2L, 2L, 2L, NA, NA,
2L, NA, 1L, 2L, 2L, NA, 2L, 2L, 2L, 2L, 2L), levels = c("A",
"B", "C"), class = "factor"),
x = structure(c(3L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 3L, 3L, 1L, 1L, 2L, NA, NA, NA, 2L, 3L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 1L, 2L, NA, NA, 1L, 3L, 2L, 3L, NA, 3L, NA,
1L, 1L, 2L, NA, 1L, NA, 3L, 1L, 2L, 1L, 1L, NA, 2L, 2L, 1L,
2L, 2L, 3L, 3L, 3L, 2L, 2L, 1L, NA, NA, 3L, 3L, NA, 2L, 2L,
3L, 1L, 3L, 3L, 3L, 2L, 3L, NA, NA, NA, 1L, 3L, NA, 2L, NA,
1L, 1L, 3L, 2L, 1L, NA, 2L, 1L, 3L, 2L, 3L, NA, NA, NA, 3L,
3L, 3L, 3L, 3L, 3L, 2L, 1L, 1L, 3L, NA, NA, 3L, 1L, 3L, 3L,
NA, 2L, 2L, 3L, 2L, 1L, 3L, NA, NA, 3L, NA, 3L, 2L, 1L, NA,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 1L, NA, 3L, 1L, 2L, 3L, 2L,
NA, NA, 1L, 3L, NA, 2L, NA, NA, 1L, 3L, NA, 3L, 1L, NA, 2L,
3L, NA, NA, 2L, NA, 1L, 1L, 3L, 2L, 2L, 2L, 3L, NA, 1L, NA,
2L, 3L, 3L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 3L,
2L, 3L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, NA, 1L, NA, 2L,
2L, 2L, 2L, 1L, NA, 2L, 2L, 3L, 3L, 3L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 1L, 1L, NA, 2L, NA, 2L, 1L, 1L, 1L, 1L, NA, 3L, NA,
2L, 1L, 1L, 3L, 3L, NA, 1L, 3L, 3L, 3L, 2L, 1L, 1L, 1L, 2L,
2L, 3L, 2L, 3L, 2L, 1L, NA, 3L, 1L, NA, 1L, 1L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 3L, 2L,
3L, 2L, 2L, 2L, 1L, 2L, NA, 1L, 2L, 3L, 2L, 3L, 2L, NA, 2L,
2L, NA, 3L, 2L, NA, 2L, 3L, NA, 3L, 2L, NA, 3L, 2L, 2L, NA,
2L, 3L, NA, 2L, 1L, 3L, 2L, 2L, 2L, 1L, 2L, 3L, 3L, 1L, 1L,
NA, 2L, 1L, 3L, 3L, 1L, 2L, NA, 1L, 3L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, NA, 2L, 3L, 2L, NA, NA, 3L, 3L, 1L, 1L, 2L, 2L,
NA, NA, 1L, 2L, 3L, 3L, 1L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 2L, NA, 3L, NA, 3L,
3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 2L, 1L, 3L, 3L, 2L, 2L, 2L,
3L, 3L, 3L, 2L, 2L, 1L, 3L, 2L, 2L, NA, NA, 3L, 3L, NA, 2L,
3L, 1L, 1L, NA, 2L, 1L, 2L, NA, 3L, 3L, 2L, 1L, NA, 2L, 3L,
3L, 1L, 2L, 3L, 2L, 1L, NA, 3L, 2L, 3L, NA, 3L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, NA, NA, 3L, 2L, 1L, 3L, NA, 2L,
3L, 2L, 2L, 3L, 2L, 3L, 3L, NA, 3L, NA, NA, 2L, 2L, 2L, 3L,
3L, 3L, NA, NA, 1L, 2L, NA, NA, NA, 2L, 2L, 3L, 3L, 1L, 2L,
2L, NA, 1L, 3L, 2L, 1L, 1L, 1L, 1L, 2L, 3L, NA, 3L, 1L, NA,
1L, 1L, 1L, 3L, 1L, 1L, 3L, 1L, 3L, 2L, 2L, 3L, 2L, 1L, 3L,
2L, 3L, NA, 1L, 3L, 2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 3L, 3L,
NA, NA, 3L, NA, NA, 3L, 2L, 2L, 2L, 2L, 2L, 3L, NA, NA, 3L,
2L, 3L, 1L, 3L, 3L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 1L, NA,
1L, NA, 3L, 3L, 1L, 2L, 3L, 2L, NA, NA, 2L, 2L, 3L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, NA, 3L, 1L, 3L, 2L,
2L, 2L, 1L, 2L, NA, 1L, 3L, 2L, 1L, 1L, 3L, 1L, NA, 2L, NA,
NA, 3L, 2L, 3L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 1L, NA, 3L, NA,
2L, 2L, NA, 1L, 1L, 2L, 3L, 3L, 1L, 2L, 3L, 2L, 2L, NA, NA,
1L, 2L, 2L, 3L, 2L, 1L, 2L, 1L, 3L, 3L, 2L, 1L, 2L, 3L, 3L,
2L, 1L, 2L, NA, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 3L, 1L,
3L, NA, NA, 2L, 2L, NA, 2L, 2L, 1L, 1L, 1L, 2L, 3L, 1L, 2L,
2L, 2L, 3L, 3L, 3L), levels = c("A", "B",
"C"), class = "factor")), row.names = NULL, class = c("data.frame"))
chisq.test(data$y, data$x)
#>
#> Pearson's Chi-squared test
#>
#> data: data$y and data$x
#> X-squared = 33.258, df = 2, p-value = 5.999e-08
# note:
table(data)
#> x
#> y A B C
#> A 31 21 7
#> B 87 184 162
#> C 0 0 0
# so:
chisq.test(table(data))
#> Warning in chisq.test(table(data)): Chi-squared approximation may be incorrect
#>
#> Pearson's Chi-squared test
#>
#> data: table(data)
#> X-squared = NaN, df = 4, p-value = NA
Created on 2023-11-13 with reprex v2.0.2
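The two results above differ because of the unused factor level "C" in `y`. When given two factors, `chisq.test()` re-factors them internally, which drops unobserved levels and leaves a 2 x 3 table (df = 2). `table(data)` keeps the empty row, so its expected counts are 0, the statistic involves 0/0, and the result is NaN with df = 4. A minimal sketch of this on a small made-up data set (the `toy` data frame and its values are hypothetical, not the data from this issue), assuming that dropping unused levels with `droplevels()` is acceptable for the analysis:

```r
# Hypothetical toy data: level "C" of y is declared but never observed.
y <- factor(c("A", "A", "B", "B", "B", "A", "B", "B"), levels = c("A", "B", "C"))
x <- factor(c("A", "B", "A", "B", "C", "C", "A", "B"), levels = c("A", "B", "C"))
toy <- data.frame(y = y, x = x)

# Table interface: the all-zero row for y == "C" is kept, so the
# expected counts in that row are 0 and the statistic becomes NaN.
res_tab <- suppressWarnings(chisq.test(table(toy)))

# Two-factor interface: unused levels are dropped before tabulating.
res_vec <- suppressWarnings(chisq.test(toy$y, toy$x))

# Dropping unused levels first makes the table interface agree.
res_drop <- suppressWarnings(chisq.test(table(droplevels(toy))))

res_tab$statistic   # NaN, with df = 4
res_vec$parameter   # df = 2
all.equal(unname(res_vec$statistic), unname(res_drop$statistic))
```

The warnings are suppressed only because the toy counts are tiny; with real data they should be inspected rather than silenced.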
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
I expected infer::chisq_test() to give the same results as chisq.test(), just with some nice wrapper features. However, this inconsistency has caused some headaches for us. Note that I am not worried about the warning, only about the NaN.
The data are real.
Created on 2023-11-11 with reprex v2.0.2