Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure rank overlay plot starts at 0 even if not all bins present #332

Merged
merged 2 commits into from
Dec 14, 2024

Conversation

jgabry
Copy link
Member

@jgabry jgabry commented Dec 13, 2024

fixes #331

Code to test:

x <- posterior::as_draws_array(
  list(
    list(theta = -2 + 0.003 * 1:1000 + arima.sim(list(ar = 0.7), n = 1000, sd = 0.5)),
    list(theta = 1 + -0.01 * 1:1000 + arima.sim(list(ar = 0.7), n = 1000, sd = 0.5))
  )
)
bayesplot::mcmc_rank_overlay(x)

R/mcmc-traces.R Outdated
Comment on lines 314 to 321
# Now ensure that all combinations of parameter, chain, and bin_start exist
# even if no counts are present (https://github.com/stan-dev/bayesplot/issues/331)
all_params_chains <- dplyr::distinct(data, .data$parameter, .data$chain)
all_bins <- dplyr::distinct(histobins, .data$bin_start, .data$cut)
combos <- dplyr::cross_join(all_params_chains, all_bins)
d_bin_counts <- full_join(combos, d_bin_counts, by = c("parameter", "chain", "bin_start")) %>%
mutate(n = dplyr::coalesce(n, 0L))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be a more elegant solution, but this seems to get the job done

Copy link
Contributor

@sims1253 sims1253 Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is more elegant but an expand.grid with one left join and an if_else to fill the zeros should also work. Something like the following. Reads a little easier for me, though that might just be me. (not tested!)

all_combos <- tibble::as_tibble(expand.grid(
  parameter = unique(data$parameter),
  chain = unique(data$chain),
  bin_start = unique(histobins$bin_start),
  stringsAsFactors = FALSE
))

d_bin_counts <- all_combos %>%
  left_join(d_bin_counts, by = c("parameter", "chain", "bin_start")) %>%
  mutate(n = if_else(is.na(n), 0L, n))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Ensure all combinations exist and count values
d_bin_counts <- data %>%
  left_join(histobins, by = "value_rank") %>%
  count(.data$parameter, .data$chain, .data$bin_start) %>%
  right_join(
    dplyr::expand(.data$parameter, .data$chain, unique(histobins$bin_start)),
    by = c("parameter", "chain", "bin_start")
  ) %>%
  mutate(n = dplyr::coalesce(n, 0L))

Or courtesy of claude, telling me my solution was verbouse and not dplyr-ish enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to you and Claude, both!

Copy link
Member Author

@jgabry jgabry Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dplyr::expand

Except Claude didn't realize there is no dplyr::expand . There is tidyr::expand, but we don't already depend on tidyr so probably best not to add another dependency just for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, great help :D

@jgabry
Copy link
Member Author

jgabry commented Dec 14, 2024

The failure on macOS-latest (devel) is due to the vdiffr package failing to compile on GitHub actions

@sims1253
Copy link
Contributor

Maybe add a set.seed to the test to make it deterministic?

@jgabry
Copy link
Member Author

jgabry commented Dec 14, 2024

Maybe add a set.seed to the test to make it deterministic?

There's a set.seed in the file where I added the code to create the data for the test (it just doesn't show up in the diff since I didn't change it):

Co-Authored-By: Maximilian Scholz <[email protected]>
@jgabry
Copy link
Member Author

jgabry commented Dec 14, 2024

@sims1253 Thanks. I just pushed a commit that uses basically your first suggestion of expand_grid plus a left_join.

@jgabry
Copy link
Member Author

jgabry commented Dec 14, 2024

Going to merge since everything is passing except the unrelated failure on macos-latest (devel) that happens when installing the vdiffr dependency. Thanks again @sims1253.

@jgabry jgabry merged commit 20910f5 into master Dec 14, 2024
9 of 10 checks passed
@jgabry jgabry deleted the rank-overlay-always-start-0 branch December 14, 2024 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mcmc_rank_overlay: d_bin_counts doesn't contain rows for cases with n = 0
2 participants