Commit 0f79729: bookdown 1.1.0 html
Solomon Kurz committed Mar 2, 2020
1 parent 686783e commit 0f79729
Showing 46 changed files with 973 additions and 946 deletions.
2 changes: 1 addition & 1 deletion _book/01.md
@@ -1,7 +1,7 @@
---
title: "Chapter 01. The Golem of Prague"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
2 changes: 1 addition & 1 deletion _book/02.md
@@ -1,7 +1,7 @@
---
title: "Chapter 02. Small Worlds and Large Worlds"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
2 changes: 1 addition & 1 deletion _book/03.md
@@ -1,7 +1,7 @@
---
title: "Chapter 03. Sampling the Imaginary"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
4 changes: 2 additions & 2 deletions _book/04.md
@@ -1,7 +1,7 @@
---
title: "Ch. 4 Linear Models"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
@@ -768,7 +768,7 @@ d_grid_samples %>%

That `labeller = label_parsed` bit in the `facet_wrap()` function is what converted our subplot strip labels into Greek. Anyway, $\sigma$ is not so Gaussian with that small $n$.
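
In case the mechanism isn't clear, here's a self-contained toy example of the trick (the data are made up):

```r
# a toy example, not from the text: strip labels "mu" and "sigma" are
# parsed as plotmath, so they render as Greek letters
library(tidyverse)

tibble(draw      = rnorm(200),
       parameter = rep(c("mu", "sigma"), each = 100)) %>% 
  ggplot(aes(x = draw)) +
  geom_density() +
  facet_wrap(~ parameter, labeller = label_parsed)
```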

-This is the point in the project where we hop off the grid-approximation train. On the one hand, I think this is a great idea. Most of y'all reading this will never use grid approximation in a real-world applied data analysis. On the other hand, there is some pedagogical utility in practicing with it. It can help you grasp what it is we're doing when we apply Bayes' theorem. If you'd like more practice, check out the first several chapters in [Kruschke's (2014) textbook](https://sites.google.com/site/doingbayesiandataanalysis/) and the corresponding chapters in [my project translating it into brms and tidyverse](https://bookdown.org/ajkurz/DBDA_recoded/).
+This is the point in the project where we hop off the grid-approximation train. On the one hand, I think this is a great idea. Most of y'all reading this will never use grid approximation in a real-world applied data analysis. On the other hand, there is some pedagogical utility in practicing with it. It can help you grasp what it is we're doing when we apply Bayes' theorem. If you'd like more practice, check out the first several chapters in [Kruschke's (2015) textbook](https://sites.google.com/site/doingbayesiandataanalysis/) and the corresponding chapters in [my project translating it into brms and tidyverse](https://bookdown.org/content/3686).
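
For one last quick refresher before we leave the grid behind, here's a minimal grid-approximation sketch using the familiar 6-waters-in-9-tosses globe data:

```r
library(tidyverse)

# define a grid of candidate probabilities, apply Bayes' theorem, and
# normalize so the posterior sums to 1
tibble(p_grid = seq(from = 0, to = 1, length.out = 1000)) %>% 
  mutate(prior      = 1,
         likelihood = dbinom(6, size = 9, prob = p_grid)) %>% 
  mutate(posterior  = likelihood * prior / sum(likelihood * prior)) %>% 
  ggplot(aes(x = p_grid, y = posterior)) +
  geom_line()
```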

### Fitting the model with ~~`map`~~ `brm()`.

2 changes: 1 addition & 1 deletion _book/05.md
@@ -1,7 +1,7 @@
---
title: "Ch. 5 Multivariate Linear Models"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
2 changes: 1 addition & 1 deletion _book/06.md
@@ -1,7 +1,7 @@
---
title: "Ch. 6 Overfitting, Regularization, and Information Criteria"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
2 changes: 1 addition & 1 deletion _book/07.md
@@ -1,7 +1,7 @@
---
title: "Ch. 7 Interactions"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
Binary file modified _book/07_files/figure-gfm/unnamed-chunk-44-1.png
Binary file modified _book/07_files/figure-gfm/unnamed-chunk-50-1.png
2 changes: 1 addition & 1 deletion _book/08.md
@@ -1,7 +1,7 @@
---
title: "Ch. 8 Markov Chain Monte Carlo"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
10 changes: 5 additions & 5 deletions _book/09.md
@@ -1,7 +1,7 @@
---
title: "Chapter 09. Big Entropy and the Generalized Linear Model"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
@@ -155,15 +155,15 @@ d %>%

Behold the probability density for the generalized normal distribution:

-$$\text{Pr} (y | \mu, \alpha, \beta) = \frac{\beta}{2 \alpha \Gamma \bigg (\frac{1}{\beta} \bigg )} e ^ {- \bigg (\frac{|y - \mu|}{\alpha} \bigg ) ^ {\beta}}$$
+$$\text{Pr} (y | \mu, \alpha, \beta) = \frac{\beta}{2 \alpha \Gamma \bigg (\frac{1}{\beta} \bigg )} e ^ {- \bigg (\frac{|y - \mu|}{\alpha} \bigg ) ^ {\beta}},$$

-In this formulation, $\alpha =$ the scale, $\beta =$ the shape, $\mu =$ the location, and $\Gamma =$ the [gamma function](https://en.wikipedia.org/wiki/Gamma_function). If you read closely in the text, you'll discover that the densities in the right panel of Figure 9.2 were all created with the constraint $\sigma^2 = 1$. But $\sigma^2 \neq \alpha$ and there's no $\sigma$ in the equations in the text. However, it appears the variance for the generalized normal distribution follows the form
+where $\alpha =$ the scale, $\beta =$ the shape, $\mu =$ the location, and $\Gamma =$ the [gamma function](https://en.wikipedia.org/wiki/Gamma_function). If you read closely in the text, you'll discover that the densities in the right panel of Figure 9.2 were all created with the constraint $\sigma^2 = 1$. But $\sigma^2 \neq \alpha$ and there's no $\sigma$ in the equations in the text. However, it appears the variance for the generalized normal distribution follows the form

$$\sigma^2 = \frac{\alpha^2 \Gamma (3/\beta)}{\Gamma (1/\beta)}.$$

-So if you do the algebra, you'll see that you can compute $\alpha$ for a given $\sigma^2$ and $\beta$ like so:
+So if you do the algebra, you'll see that you can compute $\alpha$ for a given $\sigma^2$ and $\beta$ with the equation

-$$\alpha = \sqrt{ \frac{\sigma^2 \Gamma (1/\beta)}{\Gamma (3/\beta)} }$$
+$$\alpha = \sqrt{ \frac{\sigma^2 \Gamma (1/\beta)}{\Gamma (3/\beta)} }.$$

I got the formula from [Wikipedia.com](https://en.wikipedia.org/wiki/Generalized_normal_distribution). Don't judge. We can wrap that formula in a custom function, `alpha_per_beta()`, use it to solve for the desired $\beta$ values, and plot. But one more thing: McElreath didn't tell us exactly which $\beta$ values the left panel of Figure 9.2 was based on. So the plot below is my best guess.
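
Here's a minimal sketch of what such an `alpha_per_beta()` function might look like, taken straight from the formula above:

```r
# compute the alpha that yields a desired variance (sigma^2) for a
# given shape beta, per the variance formula above
alpha_per_beta <- function(variance, beta) {
  sqrt(variance * gamma(1 / beta) / gamma(3 / beta))
}

# sanity check: with beta = 2 (the Gaussian case) and sigma^2 = 1,
# we expect alpha = sqrt(2)
alpha_per_beta(variance = 1, beta = 2)
```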

11 changes: 5 additions & 6 deletions _book/10.md
@@ -1,7 +1,7 @@
---
title: "Ch. 10 Counting and Classification"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
@@ -1743,8 +1743,8 @@ ppa %>%
stat = "identity",
alpha = 1/4, size = 1/2) +
geom_text(data = d,
-aes(y = total_tools, label = total_tools),
-size = 3.5) +
+aes(y = total_tools, label = total_tools),
+size = 3.5) +
labs(subtitle = "Blue is the high contact rate; black is the low.",
x = "log population",
y = "total tools") +
@@ -1867,6 +1867,7 @@ my_upper <- function(data, mapping, ...) {
mapping = aes(),
size = 4,
color = wes_palette("Moonrise2")[4],
+alpha = 4/5,
family = "Times") +
theme_void()
}
@@ -2186,8 +2187,6 @@ library(brms)

Before we fit the model, we might take a quick look at the prior structure with `brms::get_prior()`.

-Here's our multinomial model in brms. Do note the specification `family = categorical(link = logit)`.


```r
get_prior(data = list(career = career),
@@ -2531,7 +2530,7 @@ rbind(f[, , 1],

<img src="10_files/figure-gfm/unnamed-chunk-106-1.png" width="312" />

-For more practice fitting multinomial models with brms, check out my [translation of Kruschke's text, Chapter 22](https://bookdown.org/ajkurz/DBDA_recoded/nominal-predicted-variable.html).
+For more practice fitting multinomial models with brms, check out my [translation of Kruschke's text, Chapter 22](https://bookdown.org/content/3686/nominal-predicted-variable.html).
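
For a preview of where this goes, here's a hedged sketch of the full workflow. The simulation loosely follows the text's career-choice setup, but the specific values and settings are illustrative:

```r
library(brms)

# simulate career choices: three careers whose choice probabilities
# increase with expected income (illustrative values)
n      <- 500
income <- c(1, 2, 5)
score  <- 0.5 * income
p      <- exp(score) / sum(exp(score))

set.seed(10)
career <- sample(1:3, size = n, replace = TRUE, prob = p)

# inspect the default priors, then fit the intercepts-only multinomial
# model; note `family = categorical(link = logit)`
get_prior(data = list(career = career),
          family = categorical(link = logit),
          career ~ 1)

fit <-
  brm(data = list(career = career),
      family = categorical(link = logit),
      career ~ 1,
      iter = 2000, warmup = 1000, chains = 2, cores = 2)
```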

#### Multinomial in disguise as Poisson.

Binary file modified _book/10_files/figure-gfm/unnamed-chunk-17-1.png
Binary file modified _book/10_files/figure-gfm/unnamed-chunk-80-1.png
Binary file modified _book/10_files/figure-gfm/unnamed-chunk-80-2.png
14 changes: 8 additions & 6 deletions _book/11.md
@@ -1,7 +1,7 @@
---
title: "Ch. 11 Monsters and Mixtures"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
@@ -1340,7 +1340,7 @@ ggplot(data = tibble(x = seq(from = 0, to = 1, by = .01)),

<img src="11_files/figure-gfm/unnamed-chunk-44-1.png" width="384" />

-McElreath encouraged us to "explore different values for `pbar` and `theta`" (p. 348). Here's a grid of plots with `pbar = c(.25, .5, .75)` and `theta = c(5, 10, 15)`
+McElreath encouraged us to "explore different values for `pbar` and `theta`" (p. 348). Here's a grid of plots with `pbar = c(.25, .5, .75)` and `theta = c(5, 10, 15)`.


```r
@@ -1367,16 +1367,18 @@ crossing(pbar = c(.25, .5, .75),

<img src="11_files/figure-gfm/unnamed-chunk-45-1.png" width="480" />

-If you'd like to see how to make a similar plot in terms of $\alpha$ and $\beta$, see [Chapter 6](https://bookdown.org/ajkurz/DBDA_recoded/inferring-a-binomial-probability-via-exact-mathematical-analysis.html#a-description-of-credibilities-the-beta-distribution) of my project [recoding Kruschke's text into tidyverse and brms code](https://bookdown.org/content/3686/).
+If you'd like to see how to make a similar plot in terms of $\alpha$ and $\beta$, see [Chapter 6](https://bookdown.org/content/3686/inferring-a-binomial-probability-via-exact-mathematical-analysis.html) of my project [recoding Kruschke's text into tidyverse and brms code](https://bookdown.org/content/3686/).
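
As a quick aside, the two parameterizations connect through $\alpha = \bar p \theta$ and $\beta = (1 - \bar p) \theta$. Here's a hand-rolled sketch of a `dbeta2()`-style density in those terms (the function name follows the rethinking package's convention; treat it as illustrative):

```r
# the mean-and-concentration parameterization maps onto the canonical
# shape parameters as alpha = pbar * theta and beta = (1 - pbar) * theta
dbeta2 <- function(x, pbar, theta) {
  dbeta(x, shape1 = pbar * theta, shape2 = (1 - pbar) * theta)
}

# e.g., the density at x = .5 for the middle panel of the grid above
dbeta2(x = .5, pbar = .5, theta = 5)
```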

But remember, we're not fitting a beta model. We're using the beta-binomial. "We're going to bind our linear model to $\bar p$, so that changes in predictor variables change the central tendency of the distribution" (p. 348). The statistical model we'll be fitting follows the form

$$
\begin{align*}
-\text{admit}_i & \sim \operatorname{BetaBinomial} (n_i, \overline p_i, \theta)\\
-\text{logit} (\overline p_i) & = \alpha \\
+\text{admit}_i & \sim \operatorname{BetaBinomial} (n_i, \bar p_i, \theta)\\
+\text{logit} (\bar p_i) & = \alpha \\
\alpha & \sim \operatorname{Normal} (0, 2) \\
\theta & \sim \operatorname{Exponential} (1).
\end{align*}
$$

Here the size $n = \text{applications}$. In case you're confused, yes, our statistical model is not the one McElreath presented at the top of page 348 in the text. If you look closely, the statistical formula he presented does not match up with the one implied by his R code 11.26. Our statistical formula and the `brm()` model we'll be fitting, below, correspond to his R code 11.26.
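
At this point in brms's development there is no built-in beta-binomial likelihood, so fits like this lean on a custom family. Here's a condensed sketch following Bürkner's custom-families vignette; treat the names and priors as assumptions rather than this project's exact code. `phi` plays the role of $\theta$ above, and the UCB admissions data are assumed to be in `d`:

```r
library(brms)

# declare the custom family: mu on the logit scale, phi on the log scale
beta_binomial2 <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1[n]"
)

# Stan code defining the density (a _rng function would be needed
# later for posterior prediction, but is omitted here)
stan_funs <- "
  real beta_binomial2_lpmf(int y, real mu, real phi, int T) {
    return beta_binomial_lpmf(y | T, mu * phi, (1 - mu) * phi);
  }
"
stanvars <- stanvar(scode = stan_funs, block = "functions")

# the intercepts-only model from the formula above
fit <-
  brm(data = d, family = beta_binomial2,
      admit | vint(applications) ~ 1,
      prior = c(prior(normal(0, 2), class = Intercept),
                prior(exponential(1), class = phi)),
      stanvars = stanvars,
      iter = 2000, warmup = 1000, chains = 2, cores = 2)
```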

@@ -1650,8 +1652,8 @@ ggplot(data = tibble(x = seq(from = 0, to = 12, by = .01)),
color = canva_pal("Green fields")(4)[3]) +
scale_x_continuous(NULL, breaks = c(0, mu, 10)) +
scale_y_continuous(NULL, breaks = NULL) +
-coord_cartesian(xlim = 0:10) +
ggtitle(expression(paste("Our sweet ", gamma, "(3, 1)"))) +
+coord_cartesian(xlim = 0:10) +
theme_hc() +
theme(plot.background = element_rect(fill = "grey92"))
```
52 changes: 26 additions & 26 deletions _book/12.md
@@ -1,7 +1,7 @@
---
title: "Ch. 12 Multilevel Models"
author: "A Solomon Kurz"
-date: "2020-02-29"
+date: "2020-03-01"
output:
github_document
---
@@ -364,12 +364,12 @@ post_mdn %>%
geom_vline(xintercept = c(16.5, 32.5), size = 1/4) +
geom_point(aes(y = propsurv), color = "orange2") +
geom_point(aes(y = post_mdn), shape = 1) +
-coord_cartesian(ylim = c(0, 1)) +
-annotate(geom = "text", x = c(8, 16 + 8, 32 + 8), y = 0,
-label = c("small tanks", "medium tanks", "large tanks")) +
scale_x_continuous(breaks = c(1, 16, 32, 48)) +
labs(title = "Multilevel shrinkage!",
subtitle = "The empirical proportions are in orange while the model-\nimplied proportions are the black circles. The dashed line is\nthe model-implied average survival proportion.") +
+annotate(geom = "text", x = c(8, 16 + 8, 32 + 8), y = 0,
+label = c("small tanks", "medium tanks", "large tanks")) +
+coord_cartesian(ylim = c(0, 1)) +
theme_fivethirtyeight() +
theme(panel.grid = element_blank())
```
@@ -392,9 +392,9 @@ p1 <-
ggplot(aes(x = x, group = iter)) +
geom_line(aes(y = dnorm(x, b_Intercept, sd_tank__Intercept)),
alpha = .2, color = "orange2") +
-scale_y_continuous(NULL, breaks = NULL) +
labs(title = "Population survival distribution",
subtitle = "log-odds scale") +
+scale_y_continuous(NULL, breaks = NULL) +
coord_cartesian(xlim = c(-3, 4))
```

@@ -845,12 +845,12 @@ dsim %>%
y = mean_error, yend = mean_error),
color = rep(c("orange2", "black"), each = 4),
linetype = rep(1:2, each = 4)) +
-scale_x_continuous(breaks = c(1, 10, 20, 30, 40, 50, 60)) +
annotate("text", x = c(15 - 7.5, 30 - 7.5, 45 - 7.5, 60 - 7.5), y = .45,
label = c("tiny (5)", "small (10)", "medium (25)", "large (35)")) +
-labs(title = "Estimate error by model type",
+scale_x_continuous(breaks = c(1, 10, 20, 30, 40, 50, 60)) +
+labs(title = "Estimate error by model type",
subtitle = "The horizontal axis displays pond number. The vertical axis measures\nthe absolute error in the predicted proportion of survivors, compared to\nthe true value used in the simulation. The higher the point, the worse\nthe estimate. No-pooling shown in orange. Partial pooling shown in black.\nThe orange and dashed black lines show the average error for each kind\nof estimate, across each initial density of tadpoles (pond size). Smaller\nponds produce more error, but the partial pooling estimates are better\non average, especially in smaller ponds.",
-y = "absolute error") +
+y = "absolute error") +
theme_fivethirtyeight() +
theme(panel.grid = element_blank(),
plot.subtitle = element_text(size = 10))
@@ -1406,11 +1406,11 @@ post %>%
geom_density(size = 0, fill = "orange1", alpha = 3/4) +
geom_density(aes(x = sd_block__Intercept),
size = 0, fill = "orange4", alpha = 3/4) +
-scale_y_continuous(NULL, breaks = NULL) +
-coord_cartesian(xlim = c(0, 4)) +
-ggtitle(expression(sigma["[x]"])) +
annotate(geom = "text", x = 2/3, y = 2, label = "block", color = "orange4") +
annotate(geom = "text", x = 2, y = 3/4, label = "actor", color = "orange1") +
+scale_y_continuous(NULL, breaks = NULL) +
+ggtitle(expression(sigma["[x]"])) +
+coord_cartesian(xlim = c(0, 4)) +
theme_fivethirtyeight()
```

@@ -1757,7 +1757,7 @@ fix_ef <-
ran_and_fix_ef <-
bind_cols(ran_ef, fix_ef) %>%
mutate(intercept = fixed_effect + random_effect) %>%
-mutate(prob = inv_logit_scaled(intercept))
+mutate(prob = inv_logit_scaled(intercept))

# to simplify things, we'll reduce them to summaries
(
@@ -1809,9 +1809,9 @@ p3 <-
filter(iter %in% c(1:50)) %>%

ggplot(aes(x = condition, y = prob, group = iter)) +
-geom_line(alpha = 1/2, color = "orange3") +
ggtitle("50 simulated actors") +
coord_cartesian(ylim = 0:1) +
+geom_line(alpha = 1/2, color = "orange3") +
theme_fivethirtyeight() +
theme(plot.title = element_text(size = 14, hjust = .5))

@@ -2133,7 +2133,7 @@ and we've been grappling with the relation between the grand mean $\alpha$ and t

For our first step, we'll introduce the models.

-### Intercepts-only models with one or two grouping variables
+### Intercepts-only models with one or two grouping variables.

If you recall, `b12.4` was our first multilevel model with the chimps data. We can retrieve the model formula like so.

@@ -2310,22 +2310,22 @@ print(b12.8)

Now we've fit our two intercepts-only models, let's get to the heart of this section. We are going to practice four methods for working with the posterior samples. Each method will revolve around a different primary function. In order, they are

-* `brms::posterior_samples()`
-* `brms::coef()`
-* `brms::fitted()`
-* `tidybayes::spread_draws()`
+* `brms::posterior_samples()`,
+* `brms::coef()`,
+* `brms::fitted()`, and
+* `tidybayes::spread_draws()`.

We've already had some practice with the first three, but I hope this section will make them even more clear. The `tidybayes::spread_draws()` method will be new to us. I think you'll find it's a handy alternative.

-With each of the four methods, we'll practice three different model summaries.
+With each of the four methods, we'll practice three different model summaries:

-* Getting the posterior draws for the `actor`-level estimates from the `b12.7` model
-* Getting the posterior draws for the `actor`-level estimates from the cross-classified `b12.8` model, averaging over the levels of `block`
-* Getting the posterior draws for the `actor`-level estimates from the cross-classified `b12.8` model, based on `block == 1`
+* getting the posterior draws for the `actor`-level estimates from the `b12.7` model;
+* getting the posterior draws for the `actor`-level estimates from the cross-classified `b12.8` model, averaging over the levels of `block`; and
+* getting the posterior draws for the `actor`-level estimates from the cross-classified `b12.8` model, based on `block == 1`.

So to be clear, our goal is to accomplish those three tasks with four methods, each of which should yield equivalent results.

-### `brms::posterior_samples()`
+### `brms::posterior_samples()`.

To warm up, let's take a look at the structure of the `posterior_samples()` output for the simple `b12.7` model.

@@ -2458,7 +2458,7 @@ str(p3)

Again, I like this method because of how close the wrangling code within `transmute()` is to the statistical model formula. I wrote a lot of code like this in my early days of working with these kinds of models, and I think the pedagogical insights were helpful. But this method has its limitations. It works fine if you're working with some small number of groups. But that's a lot of repetitious code and it would be utterly un-scalable to situations where you have 50 or 500 levels in your grouping variable. We need alternatives.
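
To make the repetition concrete, here's a hedged two-chimp version of the approach (the full workflow runs through all seven):

```r
# add each actor's deviation to the grand mean, then convert the
# log-odds to the probability scale
post <- posterior_samples(b12.7)

post %>% 
  transmute(`chimp 1` = inv_logit_scaled(b_Intercept + `r_actor[1,Intercept]`),
            `chimp 2` = inv_logit_scaled(b_Intercept + `r_actor[2,Intercept]`)) %>% 
  str()
```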

-### `brms::coef()`
+### `brms::coef()`.

First, let's review what the `coef()` function returns.

@@ -2643,7 +2643,7 @@ $$10 + \operatorname{Normal}(0, 1).$$

Conversely, it can be a little abstract. Let's keep expanding our options.
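
In compact form, the core move looks something like this sketch:

```r
# coef() returns the actor-level intercepts with the grand mean already
# added in; summary = FALSE gives the draws rather than a summary table
coef(b12.7, summary = FALSE)$actor[, , "Intercept"] %>% 
  str()
```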

-### `brms::fitted()`
+### `brms::fitted()`.

As is often the case, we're going to want to define our predictor values for `fitted()`.
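
Here's a minimal sketch of that setup, assuming the intercepts-only `b12.7` from above:

```r
# one row per chimp; since b12.7 is intercepts-only, `actor` is the
# only predictor we need to supply
nd <- tibble(actor = 1:7)

# fitted draws on the probability scale
fitted(b12.7,
       newdata = nd,
       summary = FALSE) %>% 
  str()
```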

@@ -2802,7 +2802,7 @@ str(f3)

Let's learn one more option.

-### `tidybayes::spread_draws()`
+### `tidybayes::spread_draws()`.

Up till this point, we've really only used the tidybayes package for plotting (e.g., with `geom_halfeyeh()`) and summarizing (e.g., with `median_qi()`). But tidybayes is more general; it offers a handful of convenience functions for wrangling posterior draws from a tidyverse perspective. One such function is `spread_draws()`, which you can learn all about in Matthew Kay's vignette, [*Extracting and visualizing tidy draws from brms models*](https://mjskay.github.io/tidybayes/articles/tidy-brms.html). Let's take a look at how we'll be using it.
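
As a teaser, the core moves look something like this sketch:

```r
library(tidybayes)

# index the actor-level deviations by actor, add them to the grand
# mean, and summarize on the probability scale
b12.7 %>% 
  spread_draws(b_Intercept, r_actor[actor,]) %>% 
  mutate(p = inv_logit_scaled(b_Intercept + r_actor)) %>% 
  median_qi(p)
```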
