Description
When a nominal vector has a lot of levels, and we compute a Kruskal test between this vector and a numerical one, the kruskal.test() function works fine, and returns fast. But the call to report() on the resulting object takes a long time, triggers a lot of warnings, and eventually fails with an error.
To Reproduce
The following code should reproduce the issue from scratch:
library("report")
n <- 200
df <- data.frame(a = as.factor(1:n), b = sample(30, n, replace = TRUE))
test <- kruskal.test(df$b, df$a)
report(test)
The last line of this code takes a lot of time to process, while the call to kruskal.test is instantaneous. The number of warnings seems to grow at least quadratically with n.
It eventually ends with:
[1] "All values of t are equal to 1 \n Cannot calculate confidence intervals"
Error in data.frame(CI = ci, CI_low = bCI[1], CI_high = bCI[2]) :
arguments imply differing number of rows: 1, 0
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Expected behaviour
I don't think report() should have a higher time complexity than the call to kruskal.test(), since it is only supposed to describe the results of that test, not perform additional computations. For the same reason, its conditions of application should be the same. When given a valid htest object, report() should be able to describe it, even if that description has to involve some warnings and NAs.
Observed on version 0.5.8
The report() function calls effectsize::rank_epsilon_squared() which uses a stratified bootstrap approach to compute confidence intervals for the effect size - this is why the computation time is longer.
Also, in your example, there is only one observation per group. I'm surprised kruskal.test() does not fail in this case.
In such a case, the effect size will be equal to 1, as will every bootstrapped effect size, which is why you get the error:
All values of t are equal to 1
Cannot calculate confidence intervals
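To illustrate why every bootstrap replicate ends up equal to 1, here is a minimal base-R sketch. It assumes the commonly cited formula for rank epsilon squared, H / (n - 1), and a simple within-group resampling scheme; `eps2()` and `boot_once()` are hypothetical helpers written for this example, not the actual internals of effectsize.

```r
set.seed(1)
n <- 200
df <- data.frame(a = as.factor(1:n), b = sample(30, n, replace = TRUE))

# Hypothetical helper: rank epsilon squared from the Kruskal-Wallis H statistic,
# assuming the usual formula H / (n - 1).
eps2 <- function(x, g) {
  h <- unname(kruskal.test(x, g)$statistic)
  h / (length(x) - 1)
}

# With one observation per group, H equals n - 1, so the effect size is 1.
eps2(df$b, df$a)

# Hypothetical stratified bootstrap: resample row indices within each group.
# sample.int() on the index length avoids the sample(x) pitfall when a group
# holds a single index.
boot_once <- function(d) {
  idx <- unlist(lapply(
    split(seq_len(nrow(d)), d$a),
    function(i) i[sample.int(length(i), replace = TRUE)]
  ))
  eps2(d$b[idx], d$a[idx])
}

# Each group has a single row, so every resample reproduces the original data
# and every replicate has the same effect size.
reps <- replicate(20, boot_once(df))
range(reps)
```

Because the replicates have no spread, a percentile-type interval has nothing to work with, which is consistent with the "All values of t are equal" message above.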