Possible bug while passing resampled data directly to bayesglm #3

Open
bakaburg1 opened this issue Jan 28, 2020 · 0 comments

bakaburg1 commented Jan 28, 2020

Hello,
I'm seeing strange behavior when using bayesglm(). If I create a resampled data.frame directly in the function call, the estimates come out centered around the null, whereas if I first create the resampled data.frame and then pass it, the estimates look correct:

> bayesglm(I(Species == 'versicolor') ~ I(Sepal.Width <= 2.9), binomial(), data = iris %>% sample_frac(1, replace = T)) %>% tidy(exp = T)
# A tibble: 2 x 5
  term                      estimate std.error statistic  p.value
  <chr>                        <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                  0.458     0.215    -3.63  0.000278
2 I(Sepal.Width <= 2.9)TRUE    1.29      0.355     0.716 0.474   
> local({
+     DF <- iris %>% sample_frac(1, replace = T)
+     bayesglm(I(Species == 'versicolor') ~ I(Sepal.Width <= 2.9), binomial(), data = DF) %>% tidy(exp = T)
+ })
# A tibble: 2 x 5
  term                      estimate std.error statistic      p.value
  <chr>                        <dbl>     <dbl>     <dbl>        <dbl>
1 (Intercept)                  0.212     0.278     -5.57 0.0000000256
2 I(Sepal.Width <= 2.9)TRUE    7.79      0.379      5.42 0.0000000593

The non-resampled estimates are:

> bayesglm(I(Species == 'versicolor') ~ I(Sepal.Width <= 2.9), binomial(), data = iris) %>% tidy(exp = T)
# A tibble: 2 x 5
  term                      estimate std.error statistic      p.value
  <chr>                        <dbl>     <dbl>     <dbl>        <dbl>
1 (Intercept)                  0.214     0.270     -5.71 0.0000000114
2 I(Sepal.Width <= 2.9)TRUE    6.72      0.377      5.05 0.000000442 

These are similar to the estimates obtained when the data.frame is created first and then passed in.

The problem doesn't seem to occur with plain glm().
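The cause is not identified in this report. One mechanism that could produce this pattern, offered only as a guess, is a function evaluating its `data` argument expression more than once, so that a resampling expression yields a different sample each time. The sketch below illustrates that mechanism in base R only. `eval_data_twice()` is a hypothetical helper written for this illustration; it is not part of arm and makes no claim about how bayesglm() is implemented.

```r
# Hypothetical stand-in for a modelling function; this is NOT arm::bayesglm().
# It evaluates the unevaluated `data` argument twice, as a function that
# re-evaluates pieces of its own call might, and reports whether the two
# results are identical.
eval_data_twice <- function(formula, data) {
  cl <- match.call()
  first  <- eval(cl$data, parent.frame())
  second <- eval(cl$data, parent.frame())
  identical(first, second)
}

set.seed(1)

# Resampling inside the call: every evaluation draws a new bootstrap sample.
inline_same <- eval_data_twice(
  Species ~ Sepal.Width,
  data = iris[sample(nrow(iris), replace = TRUE), ]
)

# Resampling first: the call only refers to the stored object DF.
DF <- iris[sample(nrow(iris), replace = TRUE), ]
stored_same <- eval_data_twice(Species ~ Sepal.Width, data = DF)

print(c(inline = inline_same, stored = stored_same))
```

Here the inline version should report FALSE and the stored version TRUE. If something similar happens inside bayesglm(), the response and the predictors could end up coming from different resamples, which would pull the estimates toward the null. Until that is checked against the package source, storing the resampled data.frame before the call, as in the local() example above, avoids the issue either way.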
