-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add nested R-hat convergence diagnostic #303
Conversation
Codecov Report
@@ Coverage Diff @@
## master #303 +/- ##
==========================================
- Coverage 95.66% 95.60% -0.07%
==========================================
Files 46 47 +1
Lines 3645 3682 +37
==========================================
+ Hits 3487 3520 +33
- Misses 158 162 +4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if da6fed7 is merged into master:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if e1d22ff is merged into master:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if e4d63a9 is merged into master:
|
One thing to check: Should the Currently summarise_draws(example_draws(), rhat_basic, rhat_nested, .args = list(superchain_ids = c(1,2,3,4))
# A tibble: 10 × 3
variable rhat_basic rhat_nested
<chr> <dbl> <dbl>
1 mu 0.9979105738 1.003389884
2 tau 1.009976393 1.003445833
3 theta[1] 1.014966741 1.007488973
4 theta[2] 0.9981447065 1.002107795
5 theta[3] 1.000405648 1.008267202
6 theta[4] 0.9957624905 1.000607206
7 theta[5] 0.9987923422 1.007260715
8 theta[6] 0.9982158544 1.002237514
9 theta[7] 1.002538583 1.003347448
10 theta[8] 0.9933503132 1.003124265 |
@avehtari do you want this feature to already be in the new posterior released to be release shortly? Just asking so I can set priority accordingly. |
I've now added input checks for the superchain_ids and NA/Inf values in the chains, and corresponding tests. |
Great! Is there anything else you want to add from your side? Or is this ready for me to check and then merge? |
I have check and things look good to me. @n-kall do I have your OK to merge? |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 10bdfc5 is merged into master:
|
@paul-buerkner hold off on merging, as Charles might take a look beforehand. We'll let you know when it's ready for merge |
There's still a discrepancy between rhat_basic (without splitting) and rhat_nested with 1 chain per superchain. summarise_draws(
example_draws(),
rhat_basic_nosplit = ~rhat_basic(.x, split = FALSE),
rhat_nested = ~rhat_nested(.x, superchain_ids = c(1, 2, 3, 4))
)
# # A tibble: 10 × 3
# variable rhat_basic_nosplit rhat_nested
# <chr> <dbl> <dbl>
# 1 mu 0.99839 1.0034
# 2 tau 0.99845 1.0034
# 3 theta[1] 1.0025 1.0075
# 4 theta[2] 0.99711 1.0021
# 5 theta[3] 1.0033 1.0083
# 6 theta[4] 0.99560 1.0006
# 7 theta[5] 1.0023 1.0073
# 8 theta[6] 0.99724 1.0022
# 9 theta[7] 0.99835 1.0033
# 10 theta[8] 0.99813 1.0031 |
Ok, it pays to check the footnotes. Page 7 of Margossian et al. states: "The original R-hat uses a slightly different estimate for the within-chain variance when computing the numerator in R-hat. There W is scaled by 1/N , rather than 1/(N − 1). This explains why occasionally R-hat < 1. This is of little concern when N is large, but we care about the case where N is small, and we therefore adjust the R-hat statistic slightly." So I think this discrepancy is fine |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 9500ca2 is merged into master:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 8fce13a is merged into master:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! Only changes I would put are for the doc:
- indicate that the referenced preprint is version 4
- in the documentation, explain the slight discrepancy between nRhat and Rhat because the former is lower-bounded by 1 (footnote 1 in the preprint).
if (nchains_per_superchain != min(superchain_id_table)) { | ||
warning_no_call("Number of chains per superchain is not the same for ", | ||
"each superchain, returning NA.") | ||
return(NA_real_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically we could define nRhat to work on superchains with different sizes, but we didn't do this in the paper. I don't see a strong motivation for addressing this, but we can think about it in the future.
#' Charles C. Margossian, Matthew D. Hoffman, Pavel Sountsov, Lionel | ||
#' Riou-Durand, Aki Vehtari and Andrew Gelman (2023). Nested R-hat: | ||
#' Assessing the convergence of Markov chain Monte Carlo when running | ||
#' many short chains. arxiv:arXiv:2110.13017 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're listing equation numbers and the paper is still under review, make sure to list which version of the preprint you're referencing (the latest version is v4)
add details section with explanation of superchains and discrepancy between Rhat and nested Rhat calculation, specify version of preprint
I've now updated the docs as @charlesm93 suggested and added some further details. I think it's ready to merge |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if afd6cff is merged into master:
|
Thank you! The failing tests seem to be unrelated to this PR so I will merge it now. |
Summary
This adds the nested R-hat convergence diagnostic which is useful when running many short chains. Chains need to be grouped into superchains (given as an additional argument). This addresses issue #256
Nested R-hat is described in:
Charles C. Margossian, Matthew D. Hoffman, Pavel Sountsov, Lionel Riou-Durand, Aki Vehtari, Andrew Gelman (2022). Nested Rˆ: Assessing the convergence of Markov chain Monte Carlo when running many short chains. https://arxiv.org/abs/2110.13017
Status
Working.
Currently work in progress. Functionality seems to work, but untested. Opening this draft PR early as place for discussion and further development.TODOs:
addDonerhat_nested.rvar
methodadd testsExample usage
Example usage:
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses: