diff --git a/404.html b/404.html
index c3b92ef6..83ddc72c 100644
--- a/404.html
+++ b/404.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
diff --git a/anes-cb.html b/anes-cb.html
index 355610ea..f3f69d72 100644
--- a/anes-cb.html
+++ b/anes-cb.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
diff --git a/bookdown_files/figure-html/model-aug-examp-plot-1.png b/bookdown_files/figure-html/model-aug-examp-plot-1.png
index 4541e15b..35454d99 100644
Binary files a/bookdown_files/figure-html/model-aug-examp-plot-1.png and b/bookdown_files/figure-html/model-aug-examp-plot-1.png differ
diff --git a/bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png b/bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png
index 170f6a7f..e57a3aa6 100644
Binary files a/bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png and b/bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png differ
diff --git a/c01-intro.html b/c01-intro.html
index d08ff0d1..ef87f62d 100644
--- a/c01-intro.html
+++ b/c01-intro.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -519,9 +519,9 @@ <h1>
             <section class="normal" id="section-">
 <div id="c01-intro" class="section level1 hasAnchor" number="1">
 <h1><span class="header-section-number">Chapter 1</span> Introduction<a href="c01-intro.html#c01-intro" class="anchor-section" aria-label="Anchor link to header"></a></h1>
-<p>Surveys are valuable tools for gathering information about a population, and are used by researchers, governments, and businesses alike to better understand public opinion and behaviors. For example, a non-profit group may analyze societal trends to measure their impact, government agencies may study behaviors to inform policy, or companies may seek to learn customer product preferences to refine business strategy. With survey data, we can explore the world around us.</p>
-<p>Surveys are often conducted with a sample of the population. Therefore, in order to use the survey data to understand the population, we use weights to adjust the survey results for unequal probabilities of selection, non-response, and post-stratification. These adjustments ensure the sample accurately represents the population of interest <span class="citation">(<a href="#ref-gard2023weightsdef">Gard et al. 2023</a>)</span>. To account for the intricate nature of the survey design, analysts rely on statistical software such as SAS, Stata, SUDAAN, and R.</p>
-<p>In this book, we focus on R to introduce survey analysis. Our goal is to provide a comprehensive guide for individuals new to survey analysis but with some familiarity with statistics and R programming. We use a combination of the {survey} and {srvyr} packages and present the code following best practices from the tidyverse and assume weights have already been calculated and are available <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>; <a href="#ref-lumley2010complex">Lumley 2010</a>; <a href="#ref-tidyverse2019">Wickham et al. 2019</a>)</span>.</p>
+<p>Surveys are valuable tools for gathering information about a population. Researchers, governments, and businesses use surveys to better understand public opinion and behaviors. For example, a non-profit group may analyze societal trends to measure their impact, government agencies may study behaviors to inform policy, or companies may seek to learn customer product preferences to refine business strategy. With survey data, we can explore the world around us.</p>
+<p>Surveys are often conducted with a sample of the population. Therefore, to use the survey data to understand the population, we use weights to adjust the survey results for unequal probabilities of selection, non-response, and post-stratification. These adjustments ensure the sample accurately represents the population of interest <span class="citation">(<a href="#ref-gard2023weightsdef">Gard et al. 2023</a>)</span>. To account for the intricate nature of the survey design, analysts rely on statistical software such as SAS, Stata, SUDAAN, and R.</p>
+<p>In this book, we focus on R to introduce survey analysis. Our goal is to provide a comprehensive guide for individuals new to survey analysis but with some familiarity with statistics and R programming. We use a combination of the {survey} and {srvyr} packages and present the code following best practices from the tidyverse <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>; <a href="#ref-lumley2010complex">Lumley 2010</a>; <a href="#ref-tidyverse2019">Wickham et al. 2019</a>)</span>.</p>
 <div id="survey-analysis-in-r" class="section level2 hasAnchor" number="1.1">
 <h2><span class="header-section-number">1.1</span> Survey analysis in R<a href="c01-intro.html#survey-analysis-in-r" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>The {survey} package was released on the <a href="https://cran.r-project.org/src/contrib/Archive/survey/">Comprehensive R Archive Network (CRAN)</a> in 2003 and has been continuously developed over time. This package, primarily authored by Thomas Lumley, offers an extensive array of features, including:</p>
@@ -531,13 +531,13 @@ <h2><span class="header-section-number">1.1</span> Survey analysis in R<a href="
 <li>Variances by Taylor linearization or by replicate weights, including balanced repeated replication, jackknife, bootstrap, multistage bootstrap, or user-supplied methods</li>
 <li>Hypothesis testing for means, proportions, and other parameters</li>
 </ul>
-<p>The {srvyr} package builds on the {survey} package by providing wrappers for functions that align with the tidyverse philosophy. This is our motivation for using and recommending this package. We find that the {srvyr} package is user-friendly for those familiar with the tidyverse packages in R.</p>
-<p>For example, while many functions in the {survey} package use variables as formulas, the {srvyr} package uses tidy selection to pass variable names, a common feature in the tidyverse <span class="citation">(<a href="#ref-R-tidyselect">Henry and Wickham 2022</a>)</span>. Users of the tidyverse are likely familiar with the magrittr pipe operator (<code>%&gt;%</code>), which seamlessly works with functions from the {srvyr} package. Moreover, several common functions from {dplyr}, such as <code>filter()</code>, <code>mutate()</code>, and <code>summarize()</code>, can be applied to survey objects <span class="citation">(<a href="#ref-R-dplyr">Wickham et al. 2023</a>)</span>. This enables users to streamline their analysis workflow and leverage the benefits of both the {srvyr} and {tidyverse} packages.</p>
-<p>While the {srvyr} package offers many advantages, there is one notable limitation: it doesn’t fully incorporate the modeling capabilities of the {survey} package into tidy wrappers. When discussing modeling and hypothesis testing, we primarily rely on the {survey} package. However, we guide you on how to apply the pipe operator to these functions to maintain clarity and consistency in your analyses.</p>
+<p>The {srvyr} package builds on the {survey} package by providing wrappers for functions that align with the tidyverse philosophy. This is our motivation for using and recommending the {srvyr} package. We find that it is user-friendly for those familiar with the tidyverse packages in R.</p>
+<p>For example, while many functions in the {survey} package access variables through formulas, the {srvyr} package uses tidy selection to pass variable names, a common feature in the tidyverse <span class="citation">(<a href="#ref-R-tidyselect">Henry and Wickham 2022</a>)</span>. Users of the tidyverse are also likely familiar with the magrittr pipe operator (<code>%&gt;%</code>), which seamlessly works with functions from the {srvyr} package. Moreover, several common functions from {dplyr}, such as <code>filter()</code>, <code>mutate()</code>, and <code>summarize()</code>, can be applied to survey objects <span class="citation">(<a href="#ref-R-dplyr">Wickham et al. 2023</a>)</span>. This enables users to streamline their analysis workflow and leverage the benefits of both the {srvyr} and {tidyverse} packages.</p>
+<p>While the {srvyr} package offers many advantages, there is one notable limitation: it doesn’t fully incorporate the modeling capabilities of the {survey} package into tidy wrappers. When discussing modeling and hypothesis testing, we primarily rely on the {survey} package. However, we provide information on how to apply the pipe operator to these functions to maintain clarity and consistency in analyses.</p>
 </div>
 <div id="what-to-expect" class="section level2 hasAnchor" number="1.2">
 <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-intro.html#what-to-expect" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>This book covers many aspects of survey design and analysis, from understanding how to create design objects to conducting descriptive analysis, statistical tests, and models. We emphasize coding best practices and effective presentation techniques while using real-world data and practical examples to help you gain proficiency in survey analysis.</p>
+<p>This book covers many aspects of survey design and analysis, from understanding how to create design objects to conducting descriptive analysis, statistical tests, and models. We emphasize coding best practices and effective presentation techniques while using real-world data and practical examples to help readers gain proficiency in survey analysis.</p>
 <p>Below is a summary of each chapter:</p>
 <ul>
 <li><strong>Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a> - Overview of Surveys</strong>:
@@ -547,7 +547,8 @@ <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-in
 </ul></li>
 <li><strong>Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> - Survey data documentation</strong>:
 <ul>
-<li>Guide to survey documentation</li>
+<li>Guide to survey documentation types</li>
+<li>How to read survey documentation</li>
 </ul></li>
 <li><strong>Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> - Getting started</strong>:
 <ul>
@@ -558,7 +559,9 @@ <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-in
 </ul></li>
 <li><strong>Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a> - Descriptive analyses</strong>:
 <ul>
-<li>Calculation of point estimates, standard errors, confidence intervals, and design effects</li>
+<li>Calculation of point estimates</li>
+<li>Estimation of standard errors and confidence intervals</li>
+<li>Calculation of design effects</li>
 </ul></li>
 <li><strong>Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a> - Statistical testing</strong>:
 <ul>
@@ -568,6 +571,7 @@ <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-in
 </ul></li>
 <li><strong>Chapter <a href="c07-modeling.html#c07-modeling">7</a> - Modeling</strong>:
 <ul>
+<li>Overview of model formula specifications</li>
 <li>Linear regression, ANOVA, and logistic regression modeling</li>
 </ul></li>
 <li><strong>Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a> - Communication of results</strong>:
@@ -577,12 +581,14 @@ <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-in
 </ul></li>
 <li><strong>Chapter <a href="c09-reprex-data.html#c09-reprex-data">9</a> - Reproducible research</strong>:
 <ul>
-<li>Various tools and methods for achieving reproducibility</li>
+<li>Tools and methods for achieving reproducibility</li>
+<li>Resources for reproducible research</li>
 </ul></li>
 <li><strong>Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> - Sample designs and replicate weights</strong>:
 <ul>
-<li>Description of common sampling designs and how to specify in R</li>
-<li>Description of replicate weight methods and how to specify in R</li>
+<li>Overview of common sampling designs</li>
+<li>Replicate weight methods</li>
+<li>How to specify survey designs in R</li>
 </ul></li>
 <li><strong>Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> - Missing data</strong>:
 <ul>
@@ -592,27 +598,27 @@ <h2><span class="header-section-number">1.2</span> What to expect<a href="c01-in
 <li><strong>Chapter <a href="c12-recommendations.html#c12-recommendations">12</a> - Successful survey analysis recommendations</strong>:
 <ul>
 <li>Tips for successful analysis</li>
-<li>Debugging skills</li>
+<li>Recommendations for debugging</li>
 </ul></li>
 <li><strong>Chapter <a href="c13-ncvs-vignette.html#c13-ncvs-vignette">13</a> - National Crime Victimization Survey Vignette</strong>:
 <ul>
 <li>Vignette on analyzing National Crime Victimization Survey (NCVS) data</li>
-<li>Illustrates analysis requiring multiple files for victimization rates</li>
+<li>Illustration of analysis requiring multiple files for victimization rates</li>
 </ul></li>
 <li><strong>Chapter <a href="c14-ambarom-vignette.html#c14-ambarom-vignette">14</a> - AmericasBarometer Vignette</strong>:
 <ul>
 <li>Vignette on analyzing AmericasBarometer survey data</li>
-<li>Includes making choropleth maps with survey estimates</li>
+<li>Creation of choropleth maps with survey estimates</li>
 </ul></li>
 </ul>
-<p>The majority of chapters contain code that you can follow. Each of these chapters starts with a “set-up” section, which includes the code needed to load the packages and datasets. We then provide the main idea of the chapter and examples of how to use the functions. Most chapters conclude with exercises to work through. We provide the solutions to the exercises in the online version of the book, available at <a href="https://tidy-survey-r.github.io/">tidy-survey-r.github.io</a>.</p>
-<p>While we provide a brief overview of survey methodology and statistical theory, this book is not intended to be the sole resource for these topics. We reference other materials throughout the book and encourage readers to seek those out for more information.</p>
+<p>The majority of chapters contain code that readers can follow. Each of these chapters starts with a “Prerequisites” section, which includes the code needed to load the packages and datasets used in the chapter. We then provide the main idea of the chapter and examples of how to use the functions. Most chapters conclude with exercises to work through. We provide the solutions to the exercises in the <a href="https://tidy-survey-r.github.io/tidy-survey-book/">online version of the book</a>.</p>
+<p>While we provide a brief overview of survey methodology and statistical theory, this book is not intended to be the sole resource for these topics. We reference other materials and encourage readers to seek them out for more information.</p>
 </div>
 <div id="prerequisites" class="section level2 hasAnchor" number="1.3">
 <h2><span class="header-section-number">1.3</span> Prerequisites<a href="c01-intro.html#prerequisites" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>To get the most of our this book, we assume that you have already conducted a survey and have the data or obtained a microdata file. Microdata, also known as respondent-level or row-level data, differs from summarized data typically found in tables. It contains individual survey responses, along with analysis weights and design variables such as strata or clusters.</p>
-<p>Additionally, the survey data should already include weights and design variables. These are required to accurately calculate unbiased estimates. The concepts and techniques discussed in this book will help you to extract meaningful insights from your survey data, but will not cover how to create weights in the first place as this is a separate complex topic. If you do not already have weights created for the survey data you are using, we recommend reviewing other resources focused on weight creation such as <span class="citation">Valliant and Dever (<a href="#ref-Valliant2018weights">2018</a>)</span>.</p>
-<p>This book is tailored for analysts already familiar with R and the tidyverse but who may be new to complex survey analysis in R. We anticipate that readers of this book can:</p>
+<p>To get the most out of this book, we assume a survey has already been conducted and readers have obtained a microdata file. Microdata, also known as respondent-level or row-level data, differ from summarized data typically found in tables. They contain individual survey responses, along with analysis weights and design variables such as strata or clusters.</p>
+<p>Additionally, the survey data should already include weights and design variables. These are required to accurately calculate unbiased estimates. The concepts and techniques discussed in this book help readers to extract meaningful insights from survey data, but do not cover how to create weights as this is a separate complex topic. If weights are not already created for the survey data, we recommend reviewing other resources focused on weight creation such as <span class="citation">Valliant and Dever (<a href="#ref-Valliant2018weights">2018</a>)</span>.</p>
+<p>This book is tailored for analysts already familiar with R and the tidyverse, but who may be new to complex survey analysis in R. We anticipate that readers of this book can:</p>
 <ul>
 <li>Install R and their Integrated Development Environment (IDE) of choice, such as RStudio</li>
 <li>Install and load packages from CRAN and GitHub repositories</li>
@@ -621,58 +627,58 @@ <h2><span class="header-section-number">1.3</span> Prerequisites<a href="c01-int
 <li>Understand fundamental tidyverse concepts such as tidy/long/wide data, tibbles, the magrittr pipe (<code>%&gt;%</code>), and tidy selection</li>
 <li>Use the tidyverse packages to wrangle, tidy, and visualize data</li>
 </ul>
-<p>If these concepts or skills are new to you, we recommend starting with introductory resources to cover these topics before reading this book. R for Data Science <span class="citation">(<a href="#ref-wickham2023r4ds">Wickham, Çetinkaya-Rundel, and Grolemund 2023</a>)</span> is a beginner-friendly guide for getting started in data science using R. It offers guidance on preliminary installation steps and basic R syntax, and it introduces tidyverse concepts and packages.</p>
+<p>If these concepts or skills are new, we recommend starting with introductory resources to cover these topics before reading this book. R for Data Science <span class="citation">(<a href="#ref-wickham2023r4ds">Wickham, Çetinkaya-Rundel, and Grolemund 2023</a>)</span> is a beginner-friendly guide for getting started in data science using R. It offers guidance on preliminary installation steps, basic R syntax, and tidyverse concepts and packages.</p>
 </div>
 <div id="datasets-used-in-this-book" class="section level2 hasAnchor" number="1.4">
 <h2><span class="header-section-number">1.4</span> Datasets used in this book<a href="c01-intro.html#datasets-used-in-this-book" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>We work with two key datasets throughout the book: the Residential Energy Consumption Survey <span class="citation">(RECS – <a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span> and the American National Election Studies <span class="citation">(ANES – <a href="#ref-debell">DeBell 2010</a>)</span>. We introduce and demonstrate the loading and preparation of these datasets in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>.</p>
+<p>We work with two key datasets throughout the book: the Residential Energy Consumption Survey <span class="citation">(RECS – <a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span> and the American National Election Studies <span class="citation">(ANES – <a href="#ref-debell">DeBell 2010</a>)</span>. We introduce the loading and preparation of these datasets in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>.</p>
 </div>
 <div id="conventions" class="section level2 hasAnchor" number="1.5">
 <h2><span class="header-section-number">1.5</span> Conventions<a href="c01-intro.html#conventions" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Throughout the book, we use the following typographical conventions:</p>
 <ul>
 <li>Package names are surrounded by curly brackets: {srvyr}</li>
-<li>Function names are in constant width text format and include parentheses: <code>survey_mean()</code></li>
-<li>Object and variable names are in constant width text format: <code>anes_des</code></li>
+<li>Function names are in constant-width text format and include parentheses: <code>survey_mean()</code></li>
+<li>Object and variable names are in constant-width text format: <code>anes_des</code></li>
 </ul>
 </div>
 <div id="getting-help" class="section level2 hasAnchor" number="1.6">
 <h2><span class="header-section-number">1.6</span> Getting help<a href="c01-intro.html#getting-help" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>We recommend first trying to resolve errors and issues independently using the tips provided in <strong>Chapter <a href="c12-recommendations.html#c12-recommendations">12</a></strong>.</p>
-<p>If you have questions or face issues while working through the book, please report them to its <a href="https://github.com/tidy-survey-r/tidy-survey-book">GitHub repository</a>.</p>
 <p>There are several community forums for asking questions, including:</p>
 <ul>
-<li>Posit Community: <a href="https://community.rstudio.com/" class="uri">https://community.rstudio.com/</a></li>
-<li>R for Data Science Slack Community: <a href="https://rfordatasci.com/" class="uri">https://rfordatasci.com/</a></li>
-<li>Stack Overflow: <a href="https://stackoverflow.com/" class="uri">https://stackoverflow.com/</a></li>
+<li><a href="https://forum.posit.co/">Posit Community</a></li>
+<li><a href="https://rfordatasci.com/">R for Data Science Slack Community</a></li>
+<li><a href="https://stackoverflow.com/">Stack Overflow</a></li>
 </ul>
+<p>Please report any bugs and issues to the book’s <a href="https://github.com/tidy-survey-r/tidy-survey-book/issues">GitHub repository</a>.</p>
 </div>
 <div id="acknowledgements" class="section level2 hasAnchor" number="1.7">
 <h2><span class="header-section-number">1.7</span> Acknowledgements<a href="c01-intro.html#acknowledgements" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>We would like to thank Holly Cast, Greg Freedman Ellis, Joe Murphy, and Sheila Saia for their reviews of the initial draft. Their detailed and honest feedback helped to make this book considerably better, and we are grateful for their input. Additionally, this book started from two short courses. The first at the Annual Conference for the American Association for Public Opinion Research (AAPOR) and the second as a series of webinars for the Midwest Association of Public Opinion Research (MAPOR). We would like to also thank those that assisted us by moderating breakout rooms and answering questions from attendees: Greg Freedman Ellis, Raphael Nishimura, and Benjamin Schneider.</p>
+<p>We would like to thank Holly Cast, Greg Freedman Ellis, Joe Murphy, and Sheila Saia for their reviews of the initial draft. Their detailed and honest feedback helped improve this book, and we are grateful for their input. Additionally, this book started with two short courses. The first was at the Annual Conference for the American Association for Public Opinion Research (AAPOR) and the second was a series of webinars for the Midwest Association of Public Opinion Research (MAPOR). We would also like to thank those who assisted us by moderating breakout rooms and answering questions from attendees: Greg Freedman Ellis, Raphael Nishimura, and Benjamin Schneider.</p>
 </div>
 <div id="colophon" class="section level2 hasAnchor" number="1.8">
 <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.html#colophon" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>This book was written in <a href="http://bookdown.org/">bookdown</a> using <a href="http://www.rstudio.com/ide/">RStudio</a>. The complete source is available on GitHub: <a href="https://github.com/tidy-survey-r/tidy-survey-book" class="uri">https://github.com/tidy-survey-r/tidy-survey-book</a>.</p>
+<p>This book was written in <a href="http://bookdown.org/">bookdown</a> using <a href="http://www.rstudio.com/ide/">RStudio</a>. The complete source is available on <a href="https://github.com/tidy-survey-r/tidy-survey-book">GitHub</a>.</p>
 <p>This version of the book was built with R version 4.3.1 (2023-06-16) and with the packages listed in Table <a href="c01-intro.html#tab:intro-packages-tab">1.1</a>.</p>
 
-<div id="htbijaoair" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#htbijaoair table {
+<div id="iuwtudrxst" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#iuwtudrxst table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#htbijaoair thead, #htbijaoair tbody, #htbijaoair tfoot, #htbijaoair tr, #htbijaoair td, #htbijaoair th {
+#iuwtudrxst thead, #iuwtudrxst tbody, #iuwtudrxst tfoot, #iuwtudrxst tr, #iuwtudrxst td, #iuwtudrxst th {
   border-style: none;
 }
 
-#htbijaoair p {
+#iuwtudrxst p {
   margin: 0;
   padding: 0;
 }
 
-#htbijaoair .gt_table {
+#iuwtudrxst .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -698,12 +704,12 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-left-color: #D3D3D3;
 }
 
-#htbijaoair .gt_caption {
+#iuwtudrxst .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#htbijaoair .gt_title {
+#iuwtudrxst .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -715,7 +721,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-bottom-width: 0;
 }
 
-#htbijaoair .gt_subtitle {
+#iuwtudrxst .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -727,7 +733,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-top-width: 0;
 }
 
-#htbijaoair .gt_heading {
+#iuwtudrxst .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -739,13 +745,13 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-right-color: #D3D3D3;
 }
 
-#htbijaoair .gt_bottom_border {
+#iuwtudrxst .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#htbijaoair .gt_col_headings {
+#iuwtudrxst .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -760,7 +766,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-right-color: #D3D3D3;
 }
 
-#htbijaoair .gt_col_heading {
+#iuwtudrxst .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -780,7 +786,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   overflow-x: hidden;
 }
 
-#htbijaoair .gt_column_spanner_outer {
+#iuwtudrxst .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -792,15 +798,15 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 4px;
 }
 
-#htbijaoair .gt_column_spanner_outer:first-child {
+#iuwtudrxst .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#htbijaoair .gt_column_spanner_outer:last-child {
+#iuwtudrxst .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#htbijaoair .gt_column_spanner {
+#iuwtudrxst .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -812,11 +818,11 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   width: 100%;
 }
 
-#htbijaoair .gt_spanner_row {
+#iuwtudrxst .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#htbijaoair .gt_group_heading {
+#iuwtudrxst .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -842,7 +848,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   text-align: left;
 }
 
-#htbijaoair .gt_empty_group_heading {
+#iuwtudrxst .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -857,15 +863,15 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   vertical-align: middle;
 }
 
-#htbijaoair .gt_from_md > :first-child {
+#iuwtudrxst .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#htbijaoair .gt_from_md > :last-child {
+#iuwtudrxst .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#htbijaoair .gt_row {
+#iuwtudrxst .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -884,7 +890,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   overflow-x: hidden;
 }
 
-#htbijaoair .gt_stub {
+#iuwtudrxst .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -897,7 +903,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 5px;
 }
 
-#htbijaoair .gt_stub_row_group {
+#iuwtudrxst .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -911,15 +917,15 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   vertical-align: top;
 }
 
-#htbijaoair .gt_row_group_first td {
+#iuwtudrxst .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#htbijaoair .gt_row_group_first th {
+#iuwtudrxst .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#htbijaoair .gt_summary_row {
+#iuwtudrxst .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -929,16 +935,16 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 5px;
 }
 
-#htbijaoair .gt_first_summary_row {
+#iuwtudrxst .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#htbijaoair .gt_first_summary_row.thick {
+#iuwtudrxst .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#htbijaoair .gt_last_summary_row {
+#iuwtudrxst .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -948,7 +954,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-bottom-color: #D3D3D3;
 }
 
-#htbijaoair .gt_grand_summary_row {
+#iuwtudrxst .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -958,7 +964,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 5px;
 }
 
-#htbijaoair .gt_first_grand_summary_row {
+#iuwtudrxst .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -968,7 +974,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-top-color: #D3D3D3;
 }
 
-#htbijaoair .gt_last_grand_summary_row_top {
+#iuwtudrxst .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -978,11 +984,11 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-bottom-color: #D3D3D3;
 }
 
-#htbijaoair .gt_striped {
+#iuwtudrxst .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#htbijaoair .gt_table_body {
+#iuwtudrxst .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -991,7 +997,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-bottom-color: #D3D3D3;
 }
 
-#htbijaoair .gt_footnotes {
+#iuwtudrxst .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1005,7 +1011,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-right-color: #D3D3D3;
 }
 
-#htbijaoair .gt_footnote {
+#iuwtudrxst .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1014,7 +1020,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 5px;
 }
 
-#htbijaoair .gt_sourcenotes {
+#iuwtudrxst .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1028,7 +1034,7 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   border-right-color: #D3D3D3;
 }
 
-#htbijaoair .gt_sourcenote {
+#iuwtudrxst .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1036,63 +1042,63 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
   padding-right: 5px;
 }
 
-#htbijaoair .gt_left {
+#iuwtudrxst .gt_left {
   text-align: left;
 }
 
-#htbijaoair .gt_center {
+#iuwtudrxst .gt_center {
   text-align: center;
 }
 
-#htbijaoair .gt_right {
+#iuwtudrxst .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#htbijaoair .gt_font_normal {
+#iuwtudrxst .gt_font_normal {
   font-weight: normal;
 }
 
-#htbijaoair .gt_font_bold {
+#iuwtudrxst .gt_font_bold {
   font-weight: bold;
 }
 
-#htbijaoair .gt_font_italic {
+#iuwtudrxst .gt_font_italic {
   font-style: italic;
 }
 
-#htbijaoair .gt_super {
+#iuwtudrxst .gt_super {
   font-size: 65%;
 }
 
-#htbijaoair .gt_footnote_marks {
+#iuwtudrxst .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#htbijaoair .gt_asterisk {
+#iuwtudrxst .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#htbijaoair .gt_indent_1 {
+#iuwtudrxst .gt_indent_1 {
   text-indent: 5px;
 }
 
-#htbijaoair .gt_indent_2 {
+#iuwtudrxst .gt_indent_2 {
   text-indent: 10px;
 }
 
-#htbijaoair .gt_indent_3 {
+#iuwtudrxst .gt_indent_3 {
   text-indent: 15px;
 }
 
-#htbijaoair .gt_indent_4 {
+#iuwtudrxst .gt_indent_4 {
   text-indent: 20px;
 }
 
-#htbijaoair .gt_indent_5 {
+#iuwtudrxst .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1192,8 +1198,8 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
 <td headers="Version" class="gt_row gt_left">1.2.0</td>
 <td headers="Source" class="gt_row gt_left">GitHub (gergness/srvyr@1917f75)</td></tr>
     <tr><td headers="Package" class="gt_row gt_left">srvyrexploR</td>
-<td headers="Version" class="gt_row gt_left">0.0.0.9000</td>
-<td headers="Source" class="gt_row gt_left">GitHub (tidy-survey-r/srvyrexploR@914fc0f)</td></tr>
+<td headers="Version" class="gt_row gt_left">1.0.0</td>
+<td headers="Source" class="gt_row gt_left">GitHub (tidy-survey-r/srvyrexploR@e03f36c)</td></tr>
     <tr><td headers="Package" class="gt_row gt_left">stringr</td>
 <td headers="Version" class="gt_row gt_left">1.5.1</td>
 <td headers="Source" class="gt_row gt_left">CRAN</td></tr>
@@ -1205,6 +1211,9 @@ <h2><span class="header-section-number">1.8</span> Colophon<a href="c01-intro.ht
 <td headers="Source" class="gt_row gt_left">CRAN</td></tr>
     <tr><td headers="Package" class="gt_row gt_left">tibble</td>
 <td headers="Version" class="gt_row gt_left">3.2.1</td>
+<td headers="Source" class="gt_row gt_left">CRAN</td></tr>
+    <tr><td headers="Package" class="gt_row gt_left">tidycensus</td>
+<td headers="Version" class="gt_row gt_left">1.6.2</td>
 <td headers="Source" class="gt_row gt_left">CRAN</td></tr>
     <tr><td headers="Package" class="gt_row gt_left">tidyr</td>
 <td headers="Version" class="gt_row gt_left">1.3.0</td>
@@ -1229,7 +1238,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 DeBell, Matthew. 2010. <span>“How to Analyze ANES Survey Data.”</span> ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; <a href="https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf</a>.
 </div>
 <div id="ref-R-srvyr" class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div id="ref-gard2023weightsdef" class="csl-entry">
 Gard, Arianna M., Luke W. Hyde, Steven G. Heeringa, Brady T. West, and Colter Mitchell. 2023. <span>“Why Weight? Analytic Approaches for Large-Scale Population Neuroscience Data.”</span> <em>Dev Cogn Neurosci</em>. <a href="https://doi.org/10.1016/j.dcn.2023.101196">https://doi.org/10.1016/j.dcn.2023.101196</a>.
diff --git a/c02-overview-surveys.html b/c02-overview-surveys.html
index 1c1b7468..13f62c34 100644
--- a/c02-overview-surveys.html
+++ b/c02-overview-surveys.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -521,23 +521,23 @@ <h1>
 <h1><span class="header-section-number">Chapter 2</span> Overview of Surveys<a href="c02-overview-surveys.html#c02-overview-surveys" class="anchor-section" aria-label="Anchor link to header"></a></h1>
 <div id="introduction" class="section level2 hasAnchor" number="2.1">
 <h2><span class="header-section-number">2.1</span> Introduction<a href="c02-overview-surveys.html#introduction" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Developing surveys to gather accurate information about populations often involves a intricate and time-intensive process. Researchers can spend months, or even years, developing the study design, questions, and other methods for a single survey to ensure high-quality data is collected.</p>
-<p>Prior to analyzing survey data, we recommend understanding the entire survey life cycle. This understanding can provide a better insight into what types of analyses should be conducted on the data. The <em>survey life cycle</em> consists of the necessary stages to execute a survey project successfully. Each stage influences the survey’s timing, costs, and feasibility, consequently impacting the data collected and how we should analyze it. Figure <a href="c02-overview-surveys.html#fig:overview-diag">2.1</a> shows a high level view of the survey process and this chapter gives an overview of each step.</p>
+<p>Developing surveys to gather accurate information about populations involves an intricate and time-intensive process. Researchers can spend months, or even years, developing the study design, questions, and other methods for a single survey to ensure high-quality data is collected.</p>
+<p>Before analyzing survey data, we recommend understanding the entire survey life cycle. This understanding can provide better insight into what types of analyses should be conducted on the data. The <em>survey life cycle</em> consists of the necessary stages to execute a survey project successfully. Each stage influences the survey’s timing, costs, and feasibility, consequently impacting the data collected and how we should analyze it. Figure <a href="c02-overview-surveys.html#fig:overview-diag">2.1</a> shows a high-level overview of the survey process.</p>
 <div class="figure"><span style="display:block;" id="fig:overview-diag"></span>
-<div class="DiagrammeR html-widget html-fill-item-overflow-hidden html-fill-item" id="htmlwidget-743206c16cf4f215c03f" style="width:672px;height:480px;"></div>
-<script type="application/json" data-for="htmlwidget-743206c16cf4f215c03f">{"x":{"diagram":"\ngraph TD\n  A[Survey Concept]-->B[Sampling Design]\n  A-->C[Questionnaire Design]\n  A-->D[Data Collection Planning]\n  B-->E[Data Collection]\n  C-->E\n  D-->E\n  E-->F[Post-Survey Processing]\n  F-->G[Analysis]\n  G-->H[Reporting]\n  \n  style A fill: #bfd7ea, stroke: #0b3954\n  style B fill: #bfd7ea, stroke: #0b3954\n  style C fill: #bfd7ea, stroke: #0b3954\n  style D fill: #bfd7ea, stroke: #0b3954\n  style E fill: #bfd7ea, stroke: #0b3954\n  style F fill: #bfd7ea, stroke: #0b3954\n  style G fill: #bfd7ea, stroke: #0b3954\n  style H fill: #bfd7ea, stroke: #0b3954\n"},"evals":[],"jsHooks":[]}</script>
+<div class="DiagrammeR html-widget html-fill-item-overflow-hidden html-fill-item" id="htmlwidget-fb43a5e89902c70fe67b" style="width:672px;height:480px;"></div>
+<script type="application/json" data-for="htmlwidget-fb43a5e89902c70fe67b">{"x":{"diagram":"\ngraph TD\n  A[Survey Concept]-->B[Sampling Design]\n  A-->C[Questionnaire Design]\n  A-->D[Data Collection Planning]\n  B-->E[Data Collection]\n  C-->E\n  D-->E\n  E-->F[Post-Survey Processing]\n  F-->G[Analysis]\n  G-->H[Reporting]\n  \n  style A fill: #bfd7ea, stroke: #0b3954\n  style B fill: #bfd7ea, stroke: #0b3954\n  style C fill: #bfd7ea, stroke: #0b3954\n  style D fill: #bfd7ea, stroke: #0b3954\n  style E fill: #bfd7ea, stroke: #0b3954\n  style F fill: #bfd7ea, stroke: #0b3954\n  style G fill: #bfd7ea, stroke: #0b3954\n  style H fill: #bfd7ea, stroke: #0b3954\n"},"evals":[],"jsHooks":[]}</script>
 <p class="caption">
 FIGURE 2.1: Overview of the survey process
 </p>
 </div>
-<p>The survey life cycle starts with a <em>research topic or question of interest</em> (e.g., what impact does childhood trauma have on health outcomes later in life). Researchers typically review existing data sources to determine if data are already available that can address this question, as drawing from available resources can result in a reduced burden on respondents, cheaper research costs, and faster research outcomes. However, if existing data cannot answer the nuances of the research question, a survey can be used to capture the exact data that the researcher needs through a questionnaire, or a set of questions.</p>
+<p>The survey life cycle starts with a <em>research topic or question of interest</em> (e.g., what impact does childhood trauma have on health outcomes later in life.) Drawing from available resources can result in a reduced burden on respondents, cheaper research costs, and faster research outcomes. Therefore, we recommend reviewing existing data sources to determine if data that can address this question are already available. However, if existing data cannot answer the nuances of the research question, we can capture the exact data we need through a questionnaire, or a set of questions.</p>
 <p>To gain a deeper understanding of survey design and implementation, we recommend reviewing several pieces of existing literature in detail <span class="citation">(e.g., <a href="#ref-biemer2003survqual">Biemer and Lyberg 2003</a>; <a href="#ref-Bradburn2004">Bradburn, Sudman, and Wansink 2004</a>; <a href="#ref-dillman2014mode">Dillman, Smyth, and Christian 2014</a>; <a href="#ref-groves2009survey">Groves et al. 2009</a>; <a href="#ref-Tourangeau2000psych">Tourangeau, Rips, and Rasinski 2000</a>; <a href="#ref-valliant2013practical">Valliant, Dever, and Kreuter 2013</a>)</span>.</p>
 </div>
 <div id="searching-for-public-use-survey-data" class="section level2 hasAnchor" number="2.2">
 <h2><span class="header-section-number">2.2</span> Searching for public-use survey data<a href="c02-overview-surveys.html#searching-for-public-use-survey-data" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Throughout this book, we use public-use datasets from different surveys, including the American National Election Survey (ANES), the Residential Energy Consumption Survey (RECS), the National Crime Victimization Survey (NCVS), and the AmericasBarometer surveys.</p>
-<p>As mentioned above, researchers should look for existing data that can provide insights into their research questions before embarking on a new survey. One of the greatest sources of data is the government. For example, in the U.S., we can get data directly from the various statistical agencies like with RECS and NCVS. Other countries often have data available through official statistics offices, such as the Office for National Statistics in the United Kingdom.</p>
-<p>In addition to government data, many researchers will make their data publicly available through repositories such as the <a href="https://www.icpsr.umich.edu/web/pages/ICPSR/ssvd/">Inter-university Consortium for Political and Social Research (ICPSR) variable search</a> or the <a href="https://odum.unc.edu/archive/">Odum Institute Data Archive</a>. Searching these repositories or other compiled lists (e.g., <a href="https://asdfree.com">Analyze Survey Data for Free</a>) can be an efficient way to identify surveys with questions related to the researcher’s topic of interest.</p>
+<p>As mentioned above, we should look for existing data that can provide insights into our research questions before embarking on a new survey. One of the greatest sources of data is the government. For example, in the U.S., we can get data directly from the various statistical agencies such as the U.S. Energy Information Administration or Bureau of Justice Statistics. Other countries often have data available through official statistics offices, such as the Office for National Statistics in the United Kingdom.</p>
+<p>In addition to government data, many researchers make their data publicly available through repositories such as the <a href="https://www.icpsr.umich.edu/web/pages/ICPSR/ssvd/">Inter-university Consortium for Political and Social Research (ICPSR)</a> or the <a href="https://odum.unc.edu/archive/">Odum Institute Data Archive</a>. Searching these repositories or other compiled lists (e.g., <a href="https://asdfree.com">Analyze Survey Data for Free</a>) can be an efficient way to identify surveys with questions related to our research topic.</p>
 </div>
 <div id="pre-survey-planning" class="section level2 hasAnchor" number="2.3">
 <h2><span class="header-section-number">2.3</span> Pre-survey planning<a href="c02-overview-surveys.html#pre-survey-planning" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -546,16 +546,16 @@ <h2><span class="header-section-number">2.3</span> Pre-survey planning<a href="c
 <ul>
 <li><strong>Representation</strong>
 <ul>
-<li><strong>Coverage Error</strong>: A mismatch between the <em>population of interest</em> (also known as the target population or study population) and the <em>sampling frame</em>, the list from which the sample is drawn.</li>
+<li><strong>Coverage Error</strong>: A mismatch between the <em>population of interest</em> and the <em>sampling frame</em>, the list from which the sample is drawn.</li>
 <li><strong>Sampling Error</strong>: Error produced when selecting a <em>sample</em>, the subset of the population, from the sampling frame. This error is due to randomization, and we discuss how to quantify this error in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>. There is no sampling error in a census as there is no randomization. The sampling error measures the difference between all potential samples under the same sampling method.</li>
-<li><strong>Nonresponse Error</strong>: Differences between those who responded and did not respond to the survey (unit nonresponse) or a given question (item nonresponse).</li>
+<li><strong>Nonresponse Error</strong>: Differences between those who responded and did not respond to the survey (unit nonresponse) or a given question (item nonresponse.)</li>
 <li><strong>Adjustment Error</strong>: Error introduced during post-survey statistical adjustments.</li>
 </ul></li>
 <li><strong>Measurement</strong>
 <ul>
-<li><strong>Validity</strong>: A mismatch between the topic of interest and the question(s) used to collect that information.</li>
+<li><strong>Validity</strong>: A mismatch between the research topic and the question(s) used to collect that information.</li>
 <li><strong>Measurement Error</strong>: A mismatch between what the researcher asked and how the respondent answered.</li>
-<li><strong>Processing Error</strong>: Edits by the researcher to responses provided by the respondent (e.g., adjustments to data based on illogical responses).</li>
+<li><strong>Processing Error</strong>: Edits by the researcher to responses provided by the respondent (e.g., adjustments to data based on illogical responses.)</li>
 </ul></li>
 </ul>
 <p>Almost every survey has errors. Researchers attempt to conduct a survey that reduces the <em>total survey error</em>, or the accumulation of all errors that may arise throughout the survey life cycle. By assessing these different types of errors together, researchers can seek strategies to maximize the overall survey quality and improve the reliability and validity of results <span class="citation">(<a href="#ref-tse-doc">Biemer 2010</a>)</span>. However, attempts to reduce individual sources errors (and therefore total survey error) come at the price of time and money. For example:</p>
@@ -563,13 +563,13 @@ <h2><span class="header-section-number">2.3</span> Pre-survey planning<a href="c
 <li><strong>Coverage Error Tradeoff</strong>: Researchers can search for or create more accurate and updated sampling frames, but they can be difficult to construct or obtain.</li>
 <li><strong>Sampling Error Tradeoff</strong>: Researchers can increase the sample size to reduce sampling error; however, larger samples can be expensive and time-consuming to field.</li>
 <li><strong>Nonresponse Error Tradeoff</strong>: Researchers can increase or diversify efforts to improve survey participation but this may be resource-intensive while not entirely removing nonresponse bias.</li>
-<li><strong>Adjustment Error Tradeoff</strong>: <em>Weighting</em> is a statistical technique used to adjust the contribution of individual survey responses to the final survey estimates. It is typically done to make the sample more representative of the target population. However, if researchers do not carefully execute the adjustments or base them on inaccurate information, they can introduce new biases, leading to less accurate estimates.</li>
+<li><strong>Adjustment Error Tradeoff</strong>: <em>Weighting</em> is a statistical technique used to adjust the contribution of individual survey responses to the final survey estimates. It is typically done to make the sample more representative of the population of interest. However, if researchers do not carefully execute the adjustments or base them on inaccurate information, they can introduce new biases, leading to less accurate estimates.</li>
 <li><strong>Validity Error Tradeoff</strong>: Researchers can increase validity through a variety of ways, such as using established scales or collaborating with a psychometrician during survey design to pilot and evaluate questions. However, doing so lengthens the amount of time and resources needed to complete survey design.</li>
-<li><strong>Measurement Error Tradeoff</strong>: Reseachers can use techniques such as questionnaire testing and cognitive interviewing to ensure respondents are answering questions as expected. However, these activities also require time and resources to complete.</li>
+<li><strong>Measurement Error Tradeoff</strong>: Researchers can use techniques such as questionnaire testing and cognitive interviewing to ensure respondents are answering questions as expected. However, these activities require time and resources to complete.</li>
 <li><strong>Processing Error Tradeoff</strong>: Researchers can impose rigorous data cleaning and validation processes. However, this requires supervision, training, and time.</li>
 </ul>
 <p>The challenge for survey researchers is to find the optimal tradeoffs among these errors. They must carefully consider ways to reduce each error source and total survey error while balancing their study’s objectives and resources.</p>
-<p>For survey analysts, understanding the decisions that researchers took to minimize these error sources can impact how results are interpreted. The remainder of this chapter dives into critical considerations for survey development. We explore how to consider each of these sources of error and how these error sources can inform the interpretations of the data.</p>
+<p>For survey analysts, understanding the decisions that researchers took to minimize these error sources can impact how results are interpreted. The remainder of this chapter explores critical considerations for survey development. We explore how to consider each of these sources of error and how these error sources can inform the interpretations of the data.</p>
 </div>
 <div id="overview-design" class="section level2 hasAnchor" number="2.4">
 <h2><span class="header-section-number">2.4</span> Study design<a href="c02-overview-surveys.html#overview-design" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -577,28 +577,28 @@ <h2><span class="header-section-number">2.4</span> Study design<a href="c02-over
 <div id="overview-design-sampdesign" class="section level3 hasAnchor" number="2.4.1">
 <h3><span class="header-section-number">2.4.1</span> Sampling design<a href="c02-overview-surveys.html#overview-design-sampdesign" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The set or group we want to survey is known as the <em>population of interest</em> or the <em>target population</em>. The population of interest could be broad, such as “all adults age 18+ living in the U.S.” or a specific population based on a particular characteristic or location. For example, we may want to know about “adults aged 18-24 who live in North Carolina” or “eligible voters living in Illinois.”</p>
-<p>However, a <em>sampling frame</em> with contact information is needed to survey individuals in these populations of interest. If researchers are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If researchers are looking at more board target populations like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, researchers may choose to use a sampling frame of mailing addresses and send the survey to households, or they may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working).</p>
-<p>These imperfect sampling frames can result in <em>coverage error</em> where there is a mismatch between the target population and the list of individuals researchers can select. For example, if a researcher is looking to obtain estimates for “all adults aged 18+ living in the U.S.”, a sampling frame of mailing addresses will miss specific types of individuals, such as the homeless, transient populations, and incarcerated individuals. Additionally, many households have more than one adult resident, so researchers would need to consider how to get a specific individual to fill out the survey (called <em>within household selection</em>) or adjust the target population to report on “U.S. households” instead of “individuals.”</p>
-<p>Once the researchers have selected the sampling frame, the next step is determining how to select individuals for the survey. In rare cases, researchers may conduct a <em>census</em> and survey everyone on the sampling frame. However, the ability to implement a questionnaire at that scale is something only some can do (e.g., government censuses). Instead, researchers typically choose to sample individuals and use weights to estimate numbers in the target population. They can use a variety of different sampling methods, and more information on these can be found in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>. This decision of which sampling method to use impacts <em>sampling error</em> and can be accounted for in weighting.</p>
+<p>However, a <em>sampling frame</em> with contact information is needed to survey individuals in these populations of interest. If we are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If we are looking at broader populations of interest, like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, we may choose to use a sampling frame of mailing addresses and send the survey to households, or we may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working.)</p>
+<p>These imperfect sampling frames can result in <em>coverage error</em> where there is a mismatch between the population of interest and the list of individuals we can select. For example, if we are looking to obtain estimates for “all adults aged 18+ living in the U.S.”, a sampling frame of mailing addresses will miss specific types of individuals, such as the homeless, transient populations, and incarcerated individuals. Additionally, many households have more than one adult resident, so we would need to consider how to get a specific individual to fill out the survey (called <em>within household selection</em>) or adjust the population of interest to report on “U.S. households” instead of “individuals.”</p>
+<p>Once we have selected the sampling frame, the next step is determining how to select individuals for the survey. In rare cases, we may conduct a <em>census</em> and survey everyone on the sampling frame. However, the ability to implement a questionnaire at that scale is something only a few can do (e.g., government censuses.) Instead, we typically choose to sample individuals and use weights to estimate numbers in the population of interest. We can use a variety of sampling methods, and more information on these can be found in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>. This decision of which sampling method to use impacts <em>sampling error</em> and can be accounted for in weighting.</p>
 <div id="overview-design-sampdesign-ex" class="section level4 unnumbered hasAnchor">
 <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#overview-design-sampdesign-ex" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let’s use a simple example where a researcher is interested in the average number of pets in a household. Our researcher needs to consider the target population for this study. Specifically, are they interested in all households in a given country or households in a more local area (e.g., city or state)? Let’s assume our researcher is interested in the number of pets in a U.S. household with at least one adult (18 years old or older). In this case, a sampling frame of mailing addresses would introduce only a small amount of coverage error as the frame would closely match our target population. Specifically, our researcher would likely want to use the Computerized Delivery Sequence File (CDSF), which is a file of mailing addresses that the United States Postal Service (USPS) creates and covers nearly 100% of U.S. households <span class="citation">(<a href="#ref-harter2016address">Harter et al. 2016</a>)</span>. To sample these households, for simplicity, we use a stratified simple random sample design (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information on sample designs), where we randomly sample households within each state (i.e., we stratify by state).</p>
+<p>Let’s use a simple example where we are interested in the average number of pets in a household. We need to consider the population of interest for this study. Specifically, are we interested in all households in a given country or households in a more local area (e.g., city or state)? Let’s assume we are interested in the number of pets in a U.S. household with at least one adult (18 years old or older.) In this case, a sampling frame of mailing addresses would introduce only a small amount of coverage error as the frame would closely match our population of interest. Specifically, we would likely want to use the Computerized Delivery Sequence File (CDSF), which is a file of mailing addresses that the United States Postal Service (USPS) creates and covers nearly 100% of U.S. households <span class="citation">(<a href="#ref-harter2016address">Harter et al. 2016</a>)</span>. To sample these households, for simplicity, we use a stratified simple random sample design (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information on sample designs), where we randomly sample households within each state (i.e., we stratify by state.)</p>
 <p>Throughout this chapter, we build on this example research question to plan a survey.</p>
 </div>
 </div>
 <div id="overview-design-dcplanning" class="section level3 hasAnchor" number="2.4.2">
 <h3><span class="header-section-number">2.4.2</span> Data collection planning<a href="c02-overview-surveys.html#overview-design-dcplanning" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>With the sampling design decided, researchers can then decide how to survey these individuals. Specifically, the <em>modes</em> used for contacting and surveying the sample, how frequently to send reminders and follow-ups, and the overall timeline of the study are four of the major data collection determinations. Traditionally, researchers have considered four main modes<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>:</p>
+<p>With the sampling design decided, researchers can then decide how to survey these individuals. Specifically, the <em>modes</em> used for contacting and surveying the sample, how frequently to send reminders and follow-ups, and the overall timeline of the study are four of the major data collection determinations. Traditionally, survey researchers have considered there to be four main modes<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>:</p>
 <ul>
 <li>Computer Assisted Personal Interview (CAPI; also known as face-to-face or in-person interviewing)</li>
 <li>Computer Assisted Telephone Interview (CATI; also known as phone or telephone interviewing)</li>
 <li>Computer Assisted Web Interview (CAWI; also known as web or online interviewing)</li>
 <li>Paper and Pencil Interview (PAPI)</li>
 </ul>
-<p>Researchers can use a single mode to collect data or multiple modes (also called <em>mixed-modes</em>). Using mixed-modes can allow for broader reach and increase response rates depending on the target population <span class="citation">(<a href="#ref-biemer_choiceplus">Biemer et al. 2017</a>; <a href="#ref-deLeeuw2005">DeLeeuw 2005</a>, <a href="#ref-DeLeeuw_2018">2018</a>)</span>. For example, researchers could both call households to conduct a CATI survey and send mail with a PAPI survey to the household. Using both modes, researchers could gain participation through the mail from individuals who do not pick up the phone to unknown numbers or through the phone from individuals who do not open all of their mail. However, mode effects (where responses differ based on the mode of response) can be present in the data and may need to be considered during analysis.</p>
-<p>When selecting which mode, or modes, to use, understanding the unique aspects of the chosen target population and sampling frame provides insight into how they can best be reached and engaged. For example, if we plan to survey adults aged 18-24 who live in North Carolina, asking them to complete a survey using CATI (i.e., over the phone) would likely not be as successful as other modes like the web. This age group does not talk on the phone as much as other generations and often does not answer their phones for unknown numbers. Additionally, the mode for contacting respondents relies on what information is available in the sampling frame. For example, if our sampling frame includes an email address, we could email our selected sample members to convince them to complete a survey. Alternatively, if the sampling frame is a list of mailing addresses, we could contact sample members with a letter.</p>
+<p>We can use a single mode to collect data or multiple modes (also called <em>mixed-modes</em>.) Using mixed-modes can allow for broader reach and increase response rates depending on the population of interest <span class="citation">(<a href="#ref-biemer_choiceplus">Biemer et al. 2017</a>; <a href="#ref-deLeeuw2005">DeLeeuw 2005</a>, <a href="#ref-DeLeeuw_2018">2018</a>)</span>. For example, we could both call households to conduct a CATI survey and send mail with a PAPI survey to the household. By using both modes, we could gain participation through the mail from individuals who do not pick up the phone to unknown numbers or through the phone from individuals who do not open all of their mail. However, mode effects (where responses differ based on the mode of response) can be present in the data and may need to be considered during analysis.</p>
+<p>When selecting which mode, or modes, to use, understanding the unique aspects of the chosen population of interest and sampling frame provides insight into how they can best be reached and engaged. For example, if we plan to survey adults aged 18-24 who live in North Carolina, asking them to complete a survey using CATI (i.e., over the phone) would likely not be as successful as other modes like the web. This age group does not talk on the phone as much as other generations and often does not answer their phones for unknown numbers. Additionally, the mode for contacting respondents relies on what information is available in the sampling frame. For example, if our sampling frame includes an email address, we could email our selected sample members to convince them to complete a survey. Alternatively, if the sampling frame is a list of mailing addresses, we could contact sample members with a letter.</p>
 <p>It is important to note that there can be a difference between the contact and survey modes. For example, if we have a sampling frame with addresses, we can send a letter to our sample members and provide information on completing a web survey. Another option is using mixed-mode surveys by mailing sample members a paper and pencil survey but also including instructions to complete the survey online. Combining different contact modes and different survey modes can be helpful in reducing <em>unit nonresponse error</em>–where the entire unit (e.g., a household) does not respond to the survey at all–as different sample members may respond better to different contact and survey modes. However, when considering which modes to use, it is important to make access to the survey as easy as possible for sample members to reduce burden and unit nonresponse.</p>
-<p>Another way to reduce unit nonresponse error is by varying the language of the contact materials <span class="citation">(<a href="#ref-dillman2014mode">Dillman, Smyth, and Christian 2014</a>)</span>. People are motivated by different things, so constantly repeating the same message may not be helpful. Instead, mixing up the messaging and the type of contact material the sample member receives can increase response rates and reduce the unit nonresponse error. For example, instead of only sending standard letters, researchers could consider sending mailings that invoke “urgent” or “important” thoughts by sending priority letters or using other delivery services like FedEx, UPS, or DHL.</p>
+<p>Another way to reduce unit nonresponse error is by varying the language of the contact materials <span class="citation">(<a href="#ref-dillman2014mode">Dillman, Smyth, and Christian 2014</a>)</span>. People are motivated by different things, so constantly repeating the same message may not be helpful. Instead, mixing up the messaging and the type of contact material the sample member receives can increase response rates and reduce the unit nonresponse error. For example, instead of only sending standard letters, we could consider sending mailings that invoke “urgent” or “important” thoughts by sending priority letters or using other delivery services like FedEx, UPS, or DHL.</p>
 <p>A study timeline may also determine the number and types of contacts. If the timeline is long, there is plentiful time for follow-ups and diversified messages in contact materials. If the timeline is short, then fewer follow-ups can be implemented. Many studies start with the tailored design method put forth by <span class="citation">Dillman, Smyth, and Christian (<a href="#ref-dillman2014mode">2014</a>)</span> and implement five contacts:</p>
 <ul>
 <li>Prenotification (Prenotice) letting sample members know the survey is coming</li>
@@ -610,7 +610,7 @@ <h3><span class="header-section-number">2.4.2</span> Data collection planning<a
 <p>This method is easily adaptable based on the study timeline and needs but provides a starting point for most studies.</p>
 <div id="overview-design-dcplanning-ex" class="section level4 unnumbered hasAnchor">
 <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#overview-design-dcplanning-ex" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let’s return to our example of a researcher who wants to know the average number of pets in a household. We are using a sampling frame of mailing addresses, so we recommend starting our data collection with letters mailed to households, but later in data collection, we want to send interviewers to the house to conduct an in-person (or CAPI) interview to decrease unit nonresponse error. This means we have two contact modes (paper and in-person). As mentioned above, the survey mode does not have to be the same as the contact mode, so we recommend a mixed-mode study with both Web and CAPI modes. Let’s assume we have six months for data collection, so we may want to recommend the following protocol:</p>
+<p>Let’s return to our example of the average number of pets in a household. We are using a sampling frame of mailing addresses, so we recommend starting our data collection with letters mailed to households, but later in data collection, we want to send interviewers to the house to conduct an in-person (or CAPI) interview to decrease unit nonresponse error. This means we have two contact modes (paper and in-person.) As mentioned above, the survey mode does not have to be the same as the contact mode, so we recommend a mixed-mode study with both Web and CAPI modes. Let’s assume we have six months for data collection, so we could recommend the following protocol:</p>
 <table>
 <caption>Protocol Example for 6-month Web and CAPI Data Collection</caption>
 <colgroup>
@@ -684,83 +684,83 @@ <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#ove
 </tr>
 </tbody>
 </table>
-<p>This is just one possible protocol that we can use that starts respondents with the web (typically done to reduce costs). However, researchers may want to begin in-person data collection earlier during the data collection period or ask their interviewers to attempt more than two visits with a household.</p>
+<p>This is just one possible protocol, one that starts respondents with the web (typically done to reduce costs). However, we could begin in-person data collection earlier during the data collection period or ask our interviewers to attempt more than two visits with a household.</p>
 </div>
 </div>
 <div id="overview-design-questionnaire" class="section level3 hasAnchor" number="2.4.3">
 <h3><span class="header-section-number">2.4.3</span> Questionnaire design<a href="c02-overview-surveys.html#overview-design-questionnaire" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>When developing the questionnaire, it can be helpful to first outline the topics to be asked and include the “why” each question or topic is important to the research question(s). This can help researchers better tailor the questionnaire and reduce the number of questions (and thus the burden on the respondent) if topics are deemed irrelevant to the research question. When making these decisions, researchers should also consider questions needed for weighting. While we would love to have everyone in our population of interest answer our survey, this rarely happens. Thus, including questions about demographics in the survey can assist with weighting for <em>nonresponse errors</em> (both unit and item nonresponse). Knowing the details of the sampling plan and what may impact <em>coverage error</em> and <em>sampling error</em> can help researchers determine what types of demographics to include. Thus questionnaire design is done in conjunction with sampling design.</p>
-<p>Researchers can benefit from the work of others by using questions from other surveys. Demographic sections such as race, ethnicity, or education borrow questions from a government census or other official surveys. Question banks such as the <a href="https://www.icpsr.umich.edu/web/pages/ICPSR/ssvd/">Inter-university Consortium for Political and Social Research (ICPSR) variable search</a> can provide additional potential questions.</p>
-<p>If a question does not exist in a question bank, researchers can craft their own. When developing survey questions, researchers should start with the research topic and attempt to write questions that match the concept. The closer the question asked is to the overall concept, the better <em>validity</em> there is. For example, if the researcher wants to know how people consume T.V. series and movies but only asks a question about how many T.V.s are in the house, then they would be missing other ways that people watch T.V. series and movies, such as on other devices or at places outside of the home. As mentioned above, researchers can employ techniques to increase the validity of their questionnaires. For example, <em>questionnaire testing</em> involves piloting the survey instrument to identify and fix potential issues before conducting the main survey. Additionally, researchers could conduct <em>cognitive interviews</em> – a technique where researchers walk through the survey with participants, encouraging them to speak their thoughts out loud to uncover how they interpret and understand survey questions.</p>
-<p>Additionally, when designing questions, researchers should consider the mode for the survey and adjust the language appropriately. In self-administered surveys (e.g., web or mail), respondents can see all the questions and response options, but that is not the case in interviewer-administered surveys (e.g., CATI or CAPI). With interviewer-administered surveys, the response options must be read aloud to the respondents, so the question may need to be adjusted to create a better flow to the interview. Additionally, with self-administered surveys, because the respondents are viewing the questionnaire, the formatting of the questions is even more critical to ensure accurate measurement. Incorrect formatting or wording can result in <em>measurement error</em>, so following best practices or using existing validated questions can reduce error. There are multiple resources to help researchers draft questions for different modes <span class="citation">(e.g., <a href="#ref-Bradburn2004">Bradburn, Sudman, and Wansink 2004</a>; <a href="#ref-dillman2014mode">Dillman, Smyth, and Christian 2014</a>; <a href="#ref-Fowler1989">Fowler and Mangione 1989</a>; <a href="#ref-Tourangeau2004spacing">Tourangeau, Couper, and Conrad 2004</a>)</span>.</p>
+<p>When developing the questionnaire, it can be helpful to first outline the topics to be asked and note why each question or topic is important to the research question(s). This can help us better tailor the questionnaire and reduce the number of questions (and thus the burden on the respondent) if topics are deemed irrelevant to the research question. When making these decisions, we should also consider questions needed for weighting. While we would love to have everyone in our population of interest answer our survey, this rarely happens. Thus, including questions about demographics in the survey can assist with weighting for <em>nonresponse errors</em> (both unit and item nonresponse). Knowing the details of the sampling plan and what may impact <em>coverage error</em> and <em>sampling error</em> can help us determine what types of demographics to include. Thus, questionnaire design is typically done in conjunction with sampling design.</p>
+<p>We can benefit from the work of others by using questions from other surveys. Demographic sections in surveys, such as race, ethnicity, or education, often borrow questions from a government census or other official surveys. Question banks such as the <a href="https://www.icpsr.umich.edu/web/pages/ICPSR/ssvd/">Inter-university Consortium for Political and Social Research (ICPSR) variable search</a> can provide additional potential questions.</p>
+<p>If a question does not exist in a question bank, we can craft our own. When developing survey questions, we should start with the research topic and attempt to write questions that match the concept. The closer the question asked is to the overall concept, the better the <em>validity</em>. For example, if we want to know how people consume T.V. series and movies but only ask a question about how many T.V.s are in the house, then we would be missing other ways that people watch T.V. series and movies, such as on other devices or at places outside of the home. As mentioned above, we can employ techniques to increase the validity of our questionnaires. For example, <em>questionnaire testing</em> involves piloting the survey instrument to identify and fix potential issues before conducting the main survey. Additionally, we could conduct <em>cognitive interviews</em> – a technique where we walk through the survey with participants, encouraging them to speak their thoughts out loud to uncover how they interpret and understand survey questions.</p>
+<p>Additionally, when designing questions, we should consider the mode for the survey and adjust the language appropriately. In self-administered surveys (e.g., web or mail), respondents can see all the questions and response options, but that is not the case in interviewer-administered surveys (e.g., CATI or CAPI). With interviewer-administered surveys, the response options must be read aloud to the respondents, so the question may need to be adjusted to create a better flow to the interview. Additionally, with self-administered surveys, because the respondents are viewing the questionnaire, the formatting of the questions is even more critical to ensure accurate measurement. Incorrect formatting or wording can result in <em>measurement error</em>, so following best practices or using existing validated questions can reduce error. There are multiple resources to help researchers draft questions for different modes <span class="citation">(e.g., <a href="#ref-Bradburn2004">Bradburn, Sudman, and Wansink 2004</a>; <a href="#ref-dillman2014mode">Dillman, Smyth, and Christian 2014</a>; <a href="#ref-Fowler1989">Fowler and Mangione 1989</a>; <a href="#ref-Tourangeau2004spacing">Tourangeau, Couper, and Conrad 2004</a>)</span>.</p>
 <div id="overview-design-questionnaire-ex" class="section level4 unnumbered hasAnchor">
 <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#overview-design-questionnaire-ex" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>As part of our survey on the average number of pets in a household, researchers may want to know what animal most people prefer to have as a pet. Let’s say we have the following question in our survey:</p>
+<p>As part of our survey on the average number of pets in a household, we may want to know what animal most people prefer to have as a pet. Let’s say we have a question in our survey displayed in Figure <a href="c02-overview-surveys.html#fig:overview-pet-examp1">2.2</a>.</p>
 <div class="figure" style="text-align: center"><span style="display:block;" id="fig:overview-pet-examp1"></span>
 <img src="images/PetExample1.png" alt="Example question asking &quot;What animal do you prefer to have as a pet?&quot; with response options of Dogs and Cats." width="70%" />
 <p class="caption">
 FIGURE 2.2: Example Question Asking Pet Preference Type
 </p>
 </div>
-<p>This question may have validity issues as it only provides the options of “dogs” and “cats” to respondents, and the interpretation of the data could be incorrect. For example, if we had 100 respondents who answered the question and 50 selected dogs, then the results of this question cannot be “50% of the population prefers to have a dog as a pet,” as only two response options were provided. If a respondent taking our survey prefers turtles, they could either be forced to choose a response between these two (i.e., interpret the question as “between dogs and cats, which do you prefer?” and result in <em>measurement error</em>), or they may not answer the question (which results in <em>item nonresponse error</em>). Based on this, the interpretation of this question should be, “When given a choice between dogs and cats, 50% of respondents preferred to have a dog as a pet.”</p>
-<p>To avoid this issue, researchers should consider these possibilities and adjust the question accordingly. One simple way could be to add an “other” response option to give respondents a chance to provide a different response. The “other” response option could then include a way for respondents to write their other preference. For example, we could rewrite this question as:</p>
+<p>This question may have validity issues as it only provides the options of “dogs” and “cats” to respondents, and the interpretation of the data could be incorrect. For example, if we had 100 respondents who answered the question and 50 selected dogs, then the results of this question cannot be “50% of the population prefers to have a dog as a pet,” as only two response options were provided. If a respondent taking our survey prefers turtles, they could either be forced to choose a response between these two (i.e., interpret the question as “between dogs and cats, which do you prefer?” and result in <em>measurement error</em>), or they may not answer the question (which results in <em>item nonresponse error</em>). Based on this, the interpretation of this question should be, “When given a choice between dogs and cats, 50% of respondents preferred to have a dog as a pet.”</p>
+<p>To avoid this issue, we should consider these possibilities and adjust the question accordingly. One simple way could be to add an “other” response option to give respondents a chance to provide a different response. The “other” response option could then include a way for respondents to write their other preference. For example, we could rewrite this question as displayed in Figure <a href="c02-overview-surveys.html#fig:overview-pet-examp2">2.3</a>.</p>
 <div class="figure" style="text-align: center"><span style="display:block;" id="fig:overview-pet-examp2"></span>
 <img src="images/PetExample2.png" alt="Example question asking &quot;What animal do you prefer to have as a pet?&quot; with response options of Dogs, Cats, and Other.  The other option includes an open-ended box after for write in responses." width="70%" />
 <p class="caption">
 FIGURE 2.3: Example Question Asking Pet Preference Type with Other Specify Option
 </p>
 </div>
-<p>Researchers can then code the responses from the open-ended box and get a better understanding of the respondent’s choice of preferred pet. Interpreting this question becomes easier as researchers no longer need to qualify the results with the choices provided.</p>
-<p>This is a simple example of how the presentation of the question and options can impact the findings. For more complex topics and questions, researchers must thoroughly consider how to mitigate any impacts from the presentation, formatting, wording, and other aspects. As survey analysts, reviewing not only the data but also the wording of the questions is crucial to ensure the results are presented in a manner consistent with the question asked. Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> provides further details on how to review existing survey documentation to inform our analyses.</p>
+<p>We can then code the responses from the open-ended box and get a better understanding of the respondent’s choice of preferred pet. Interpreting this question becomes easier as we no longer need to qualify the results with the choices provided.</p>
+<p>This is a simple example of how the presentation of the question and options can impact the findings. For more complex topics and questions, we must thoroughly consider how to mitigate any impacts from the presentation, formatting, wording, and other aspects. For survey analysts, reviewing not only the data but also the wording of the questions is crucial to ensure the results are presented in a manner consistent with the question asked. Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> provides further details on how to review existing survey documentation to inform our analyses, and Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a> goes into more detail on communicating results.</p>
 </div>
 </div>
 </div>
 <div id="overview-datacollection" class="section level2 hasAnchor" number="2.5">
 <h2><span class="header-section-number">2.5</span> Data collection<a href="c02-overview-surveys.html#overview-datacollection" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Once the data collection starts, researchers try to stick to the data collection protocol designed during pre-survey planning. However, effective researchers also prepare to adjust their plans and adapt as needed to the current progress of data collection <span class="citation">(<a href="#ref-Schouten2018">Schouten, Peytchev, and Wagner 2018</a>)</span>. Some extreme examples could be natural disasters that could prevent mailings or interviewers getting to the sample members. This could cause an in-person survey needing to quickly pivot to a self-administered survey, or the field period could be delayed, for example. Others could be smaller in that something newsworthy occurs connected to the survey, so researchers could choose to play this up in communication materials. In addition to these external factors, there could be factors unique to the survey, such as lower response rates for a specific sub-group, so the data collection protocol may need to find ways to improve response rates for that specific group.</p>
+<p>Once the data collection starts, we try to stick to the data collection protocol designed during pre-survey planning. However, effective researchers also prepare to adjust their plans and adapt as needed to the current progress of data collection <span class="citation">(<a href="#ref-Schouten2018">Schouten, Peytchev, and Wagner 2018</a>)</span>. Some extreme examples could be natural disasters that could prevent mailings or interviewers from getting to the sample members. This could force an in-person survey to pivot quickly to a self-administered survey, or the field period could be delayed. Other examples could be smaller: if something newsworthy occurs that is connected to the survey, we could choose to play this up in communication materials. In addition to these external factors, there could be factors unique to the survey, such as lower response rates for a specific sub-group, so the data collection protocol may need to find ways to improve response rates for that specific group.</p>
 </div>
 <div id="overview-post" class="section level2 hasAnchor" number="2.6">
 <h2><span class="header-section-number">2.6</span> Post-survey processing<a href="c02-overview-surveys.html#overview-post" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>After data collection, various activities need to be completed before we can analyze the survey. Multiple decisions made during this post-survey phase can assist researchers in reducing different error sources, such as weighting to account for the sample selection. Knowing the decisions researchers made in creating the final analytic data can impact how analysts use the data and interpret the results.</p>
+<p>After data collection, various activities need to be completed before we can analyze the survey. Multiple decisions made during this post-survey phase can assist us in reducing different error sources, such as weighting to account for the sample selection. Knowing the decisions made in creating the final analytic data can impact how we use the data and interpret the results.</p>
 <div id="overview-post-cleaning" class="section level3 hasAnchor" number="2.6.1">
 <h3><span class="header-section-number">2.6.1</span> Data cleaning and imputation<a href="c02-overview-surveys.html#overview-post-cleaning" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Post-survey cleaning is one of the first steps researchers do to get the survey responses into a dataset for use by analysts. Data cleaning can consist of correcting inconsistent data (e.g., with skip pattern errors or multiple questions throughout the survey being consistent with each other), editing numeric entries or open-ended responses for grammar and consistency, or recoding open-ended questions into categories for analysis. There is no universal set of fixed rules that every project must adhere to. Instead, each project or research study should establish its own guidelines and procedures for handling various cleaning scenarios based on its specific objectives.</p>
-<p>Researchers should use their best judgment to ensure data integrity, and all decisions should be documented and available to those using the data in the analysis. Each decision a researcher makes impacts <em>processing error</em>, so often, researchers have multiple people review these rules or recode open-ended data and adjudicate any differences in an attempt to reduce this error.</p>
-<p>Another crucial step in post-survey processing is <em>imputation</em>. Often, there is item nonresponse where respondents do not answer specific questions. If the questions are crucial to analysis efforts or the research question, researchers may implement imputation to reduce <em>item nonresponse error</em>. Imputation is a technique for replacing missing or incomplete data values with estimated values. However, as imputation is a way of assigning a value to missing data based on an algorithm or model, it can also introduce <em>processing error</em>, so researchers should consider the overall implications of imputing data compared to having item nonresponse. There are multiple ways to impute data. We recommend reviewing other resources like <span class="citation">Kim and Shao (<a href="#ref-Kim2021">2021</a>)</span> for more information.</p>
+<p>Post-survey cleaning is one of the first steps to get the survey responses into an analytic dataset. Data cleaning can consist of correcting inconsistent data (e.g., with skip pattern errors or multiple questions throughout the survey being inconsistent with each other), editing numeric entries or open-ended responses for grammar and consistency, or recoding open-ended questions into categories for analysis. There is no universal set of fixed rules that every survey must adhere to. Instead, each survey or research study should establish its own guidelines and procedures for handling various cleaning scenarios based on its specific objectives.</p>
+<p>We should use our best judgment to ensure data integrity, and all decisions should be documented and available to those using the data in the analysis. Each decision we make impacts <em>processing error</em>, so often, multiple people review these rules or recode open-ended data and adjudicate any differences in an attempt to reduce this error.</p>
+<p>Another crucial step in post-survey processing is <em>imputation</em>. Often, there is item nonresponse where respondents do not answer specific questions. If the questions are crucial to analysis efforts or the research question, we may implement imputation to reduce <em>item nonresponse error</em>. Imputation is a technique for replacing missing or incomplete data values with estimated values. However, as imputation is a way of assigning values to missing data based on an algorithm or model, it can also introduce <em>processing error</em>, so we should consider the overall implications of imputing data compared to having item nonresponse. There are multiple ways to impute data. We recommend reviewing other resources like <span class="citation">Kim and Shao (<a href="#ref-Kim2021">2021</a>)</span> for more information.</p>
 <div id="overview-post-cleaning-ex" class="section level4 unnumbered hasAnchor">
 <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#overview-post-cleaning-ex" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let’s return to the question we created to ask about <a href="c02-overview-surveys.html#overview-design-questionnaire-ex">animal preference</a>. The “other specify” invites respondents to specify the type of animal they prefer to have as a pet. If respondents entered answers such as “puppy,” “turtle,” “rabit,” “rabbit,” “bunny,” “ant farm,” “snake,” “Mr. Purr,” then researchers may wish to categorize these write-in responses to help with analysis. In this example, “puppy” could be assumed to be a reference to a “Dog”, and could be recoded there. The misspelling of “rabit” could be coded along with “rabbit” and “bunny” into a single category of “Bunny or Rabbit”. These are relatively standard decisions that a researcher could make. The remaining write-in responses could be categorized in a few different ways. “Mr. Purr,” which may be someone’s reference to their own cat, could be recoded as “Cat”, or it could remain as “Other” or some category that is “Unknown”. Depending on the number of responses related to each of the others, they could all be combined into a single “Other” category, or maybe categories such as “Reptiles” or “Insects” could be created. Each of these decisions may impact the interpretation of the data, so our researchers should document the types of responses that fall into each of the new categories and any decisions made.</p>
+<p>Let’s return to the question we created to ask about <a href="c02-overview-surveys.html#overview-design-questionnaire-ex">animal preference</a>. The “other specify” invites respondents to specify the type of animal they prefer to have as a pet. If respondents entered answers such as “puppy,” “turtle,” “rabit,” “rabbit,” “bunny,” “ant farm,” “snake,” “Mr. Purr,” then we may wish to categorize these write-in responses to help with analysis. In this example, “puppy” could be assumed to be a reference to a “Dog”, and could be recoded there. The misspelling of “rabit” could be coded along with “rabbit” and “bunny” into a single category of “Bunny or Rabbit”. These are relatively standard decisions that we can make. The remaining write-in responses could be categorized in a few different ways. “Mr. Purr,” which may be someone’s reference to their own cat, could be recoded as “Cat”, or it could remain as “Other” or some category that is “Unknown”. Depending on the number of responses related to each of the others, they could all be combined into a single “Other” category, or maybe categories such as “Reptiles” or “Insects” could be created. Each of these decisions may impact the interpretation of the data, so we should document the types of responses that fall into each of the new categories and any decisions made.</p>
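<p>The recoding decisions described above can be sketched in a few lines of Python. This is only an illustration: the <code>RECODE</code> map and the <code>recode_other</code> helper are hypothetical names, and sending “Mr. Purr” to “Other” is just one of the options discussed, not a recommended rule.</p>

```python
from collections import Counter

# Hypothetical recode map; the category labels follow the example in the text.
RECODE = {
    "puppy": "Dog",
    "rabit": "Bunny or Rabbit",   # misspelling coded with the correct spelling
    "rabbit": "Bunny or Rabbit",
    "bunny": "Bunny or Rabbit",
}


def recode_other(response, recode=RECODE, default="Other"):
    """Map one write-in response to an analysis category.

    Anything not listed in the recode map falls into the default category.
    """
    key = response.strip().lower()
    return recode.get(key, default)


write_ins = ["puppy", "turtle", "rabit", "rabbit", "bunny",
             "ant farm", "snake", "Mr. Purr"]
recoded = [recode_other(r) for r in write_ins]

# Tabulate the recoded categories so the decisions can be documented.
counts = Counter(recoded)
print(counts)
```

<p>Keeping the map in one place makes it straightforward to document which write-in responses fall into each category.</p>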
 </div>
 </div>
 <div id="overview-post-weighting" class="section level3 hasAnchor" number="2.6.2">
 <h3><span class="header-section-number">2.6.2</span> Weighting<a href="c02-overview-surveys.html#overview-post-weighting" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>We can address some of the error sources identified in the previous sections using <em>weighting</em>. During the weighting process, weights are created for each respondent record. These weights allow the survey responses to generalize to the population. A weight, generally, reflects how many units in the population each respondent represents, and, often the weight is constructed such that the sum of the weights is the size of the population.</p>
-<p>Weights can address coverage, sampling, and nonresponse errors. Many published surveys include an “analysis weight” variable that combines these adjustments. However, weighting itself can also introduce <em>adjustment error</em>, so researchers need to balance which types of errors should be corrected with weighting. The construction of weights is outside the scope of this book, and researchers should reference other materials if interested in constructing their own <span class="citation">(<a href="#ref-Valliant2018weights">Valliant and Dever 2018</a>)</span>. Instead, this book assumes the survey has been completed, weights are constructed, and data is available to users.</p>
+<p>We can address some error sources identified in the previous sections using <em>weighting</em>. During the weighting process, weights are created for each respondent record. These weights allow the survey responses to generalize to the population. A weight, generally, reflects how many units in the population each respondent represents. Often, the weight is constructed such that the sum of the weights is the size of the population.</p>
+<p>Weights can address coverage, sampling, and nonresponse errors. Many published surveys include an “analysis weight” variable that combines these adjustments. However, weighting itself can also introduce <em>adjustment error</em>, so we need to balance which types of errors should be corrected with weighting. The construction of weights is outside the scope of this book, so we recommend referencing other materials if interested in weight construction <span class="citation">(<a href="#ref-Valliant2018weights">Valliant and Dever 2018</a>)</span>. Instead, this book assumes the survey has been completed, weights are constructed, and data are available to users.</p>
 <div id="overview-post-weighting-ex" class="section level4 unnumbered hasAnchor">
 <h4>Example: Number of pets in a household<a href="c02-overview-surveys.html#overview-post-weighting-ex" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>In the simple example of our survey, we decided to obtain a random sample from each state to select our sample members. Knowing this sampling design, our researcher can include selection weights for analysis that account for how the sample members were selected for the survey. Additionally, the sampling frame may have the type of building associated with each address, so we could include the building type as a potential nonresponse weighting variable, along with some interviewer observations that may be related to our research topic of the average number of pets in a household. Combining these weights, we can create an analytic weight that researchers need to use when analyzing the data.</p>
+<p>In the simple example of our survey, we decided to obtain a random sample from each state to select our sample members. Knowing this sampling design, we can include selection weights for analysis that account for how the sample members were selected for the survey. Additionally, the sampling frame may have the type of building associated with each address, so we could include the building type as a potential nonresponse weighting variable, along with some interviewer observations that may be related to our research topic of the average number of pets in a household. Combining these weights, we can create an analytic weight that analysts need to use when analyzing the data.</p>
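<p>As a rough sketch of how such weights combine, the Python below builds a selection weight and a simple nonresponse adjustment for two strata. All counts are made up, the stratum names are hypothetical, and adjusting for nonresponse within each stratum is a simplifying assumption; a real survey would likely use additional variables such as building type.</p>

```python
# Hypothetical strata: population size, number sampled, number responding.
strata = {
    "State A": {"pop": 1000, "sampled": 50, "responded": 40},
    "State B": {"pop": 3000, "sampled": 100, "responded": 60},
}


def analytic_weight(pop, sampled, responded):
    """Selection weight times a within-stratum nonresponse adjustment."""
    selection_weight = pop / sampled          # units represented per sampled unit
    nonresponse_adj = sampled / responded     # spread nonrespondents over respondents
    return selection_weight * nonresponse_adj


weights = {name: analytic_weight(**s) for name, s in strata.items()}

# Each respondent in a stratum carries that stratum's weight, so the
# weights summed over all respondents recover the population size.
total_weight = sum(weights[name] * s["responded"] for name, s in strata.items())
total_pop = sum(s["pop"] for s in strata.values())
print(weights, total_weight, total_pop)
```

<p>Under these assumptions, the sum of the respondent weights equals the population size, matching the general description of weights given earlier.</p>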
 </div>
 </div>
 <div id="overview-post-disclosure" class="section level3 hasAnchor" number="2.6.3">
 <h3><span class="header-section-number">2.6.3</span> Disclosure<a href="c02-overview-surveys.html#overview-post-disclosure" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Before data is released publicly, researchers need to ensure that individual respondents can not be identified by the data when confidentiality is required. There are a variety of different methods that can be used. Here we describe a few of the most commonly used:</p>
+<p>Before data are released publicly, we need to ensure that individual respondents cannot be identified by the data when confidentiality is required. There are a variety of different methods that can be used. Here we describe a few of the most commonly used:</p>
 <ul>
-<li><strong>Data swapping</strong>: Researchers may swap specific data values across different respondents so that it does not impact insights from the data but ensures that specific individuals cannot be identified.</li>
-<li><strong>Top/bottom coding</strong>: Researchers may choose top or bottom coding to mask extreme values. For example, researchers may top-code income values such that households with income greater than $500,000 are coded as “$500,000 or more” with other incomes being presented as integers between $0 and $499,999. This can impact analyses at the tails of the distribution.</li>
-<li><strong>Coarsening</strong>: Researchers may use coarsening to mask unique values. For example, a survey question may ask for a precise income but the public data may include data as a categorical variable. Another example commonly used in survey practice is to coarsen geographic variables. Data collectors likely know the precise address of sample members but the public data may only include the state or even region of respondents.</li>
-<li><strong>Perturbation</strong>: Researchers may add random noise to outcomes. As with swapping, this is done so that it does not impact insights from the data but ensures that specific individuals cannot be identified.</li>
+<li><strong>Data swapping</strong>: We may swap specific data values across different respondents so that it does not impact insights from the data but ensures that specific individuals cannot be identified.</li>
+<li><strong>Top/bottom coding</strong>: We may choose top or bottom coding to mask extreme values. For example, we may top-code income values such that households with income greater than $500,000 are coded as “$500,000 or more,” with other incomes being presented as integers between $0 and $499,999. This can impact analyses at the tails of the distribution.</li>
+<li><strong>Coarsening</strong>: We may use coarsening to mask unique values. For example, a survey question may ask for a precise income, but the public data may include income as a categorical variable. Another example commonly used in survey practice is to coarsen geographic variables. Data collectors likely know the precise address of sample members, but the public data may only include the state or even region of respondents.</li>
+<li><strong>Perturbation</strong>: We may add random noise to outcomes. As with swapping, this is done so that it does not impact insights from the data but ensures that specific individuals cannot be identified.</li>
 </ul>
-<p>There is as much art as there is science to the methods used for disclosure. In the survey documentation, researchers will only provide high-level comments about the disclosure and not specific details. This ensures nobody can reverse the disclosure and thus identify individuals. For more information on different disclosure methods, please see <span class="citation">Skinner (<a href="#ref-Skinner2009">2009</a>)</span> and the <a href="https://aapor.org/standards-and-ethics/disclosure-standards/">AAPOR Standards</a>.</p>
+<p>There is as much art as there is science to the methods used for disclosure. Only high-level comments about the disclosure are provided in the survey documentation, not specific details. This ensures nobody can reverse the disclosure and thus identify individuals. For more information on different disclosure methods, please see <span class="citation">Skinner (<a href="#ref-Skinner2009">2009</a>)</span> and the <a href="https://aapor.org/standards-and-ethics/disclosure-standards/">AAPOR Standards</a>.</p>
 </div>
 <div id="overview-post-documentation" class="section level3 hasAnchor" number="2.6.4">
 <h3><span class="header-section-number">2.6.4</span> Documentation<a href="c02-overview-surveys.html#overview-post-documentation" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Documentation is a critical step of the survey life cycle. Researchers systematically record all the details, decisions, procedures, and methodologies to ensure transparency, reproducibility, and the overall quality of survey research.</p>
+<p>Documentation is a critical step of the survey life cycle. We should systematically record all the details, decisions, procedures, and methodologies to ensure transparency, reproducibility, and the overall quality of survey research.</p>
 <p>Proper documentation allows analysts to understand, reproduce, and evaluate the study’s methods and findings. Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> dives into how analysts should use survey data documentation.</p>
 </div>
 </div>
 <div id="post-survey-data-analysis-and-reporting" class="section level2 hasAnchor" number="2.7">
 <h2><span class="header-section-number">2.7</span> Post-survey data analysis and reporting<a href="c02-overview-surveys.html#post-survey-data-analysis-and-reporting" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>After completing the survey life cycle, the data is ready for analysts to use. The rest of this book continues from this point. For more information on the survey life cycle, please explore the references cited throughout this chapter.</p>
+<p>After completing the survey life cycle, the data are ready for analysts to use. Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> continues from this point. For more information on the survey life cycle, please explore the references cited throughout this chapter.</p>
 
 </div>
 </div>
@@ -821,7 +821,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div class="footnotes">
 <hr />
 <ol start="1">
-<li id="fn1"><p>Other modes such as using mobile apps or text messaging can also be considered, but at the time of publication, have smaller reach or are better for longitudinal studies (i.e., surveying the same individuals over many time periods of a single study).<a href="c02-overview-surveys.html#fnref1" class="footnote-back">↩︎</a></p></li>
+<li id="fn1"><p>Other modes such as using mobile apps or text messaging can also be considered, but at the time of publication, have smaller reach or are better for longitudinal studies (i.e., surveying the same individuals over many time periods of a single study).<a href="c02-overview-surveys.html#fnref1" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
             </section>
diff --git a/c03-survey-data-documentation.html b/c03-survey-data-documentation.html
index 3ca5a5ea..1a249abb 100644
--- a/c03-survey-data-documentation.html
+++ b/c03-survey-data-documentation.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -532,16 +532,16 @@ <h3><span class="header-section-number">3.2.1</span> Technical documentation<a h
 <ul>
 <li><strong>Introduction:</strong> The introduction orients us to the survey. This section provides the project’s background, the study’s purpose, and the main research questions.</li>
 <li><strong>Study design:</strong> The study design section describes how researchers prepared and administered the survey.</li>
-<li><strong>Sample:</strong> The sample section describes the sample frame, any known sampling errors, and the limitations of the sample. This section can contain recommendations on how to use sampling weights. Look for weight information, whether the survey design contains strata, clusters/PSUs, or replicate weights. Also look for population sizes, finite population correction, or replicate weight scaling information. Additional detail on sample designs is available in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</li>
+<li><strong>Sample:</strong> The sample section describes the sample frame, any known sampling errors, and the sample’s limitations. This section can contain recommendations on how to use sampling weights. Look for weight information, whether the survey design contains strata, clusters/PSUs, or replicate weights. Also, look for population sizes, finite population correction, or replicate weight scaling information. Additional detail on sample designs is available in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</li>
 <li><strong>Notes on fielding:</strong> Any additional notes on fielding, such as response rates, may be found in the technical documentation.</li>
 </ul>
-<p>The technical documentation may include other helpful resources. Some technical documentation includes syntax for SAS, SUDAAN, Stata, and/or R, so we do not have to create this code from scratch.</p>
+<p>The technical documentation may include other helpful resources. For example, some technical documentation includes syntax for SAS, SUDAAN, Stata, and/or R, so we do not have to create this code from scratch.</p>
 </div>
 <div id="questionnaires" class="section level3 hasAnchor" number="3.2.2">
 <h3><span class="header-section-number">3.2.2</span> Questionnaires<a href="c03-survey-data-documentation.html#questionnaires" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>A questionnaire is a series of questions used to collect information from people in a survey. It can ask about opinions, behaviors, demographics, or even just numbers like the count of lightbulbs, square footage, or farm size. Questionnaires can employ different types of questions, such as closed-ended (e.g., select one or check all that apply), open-ended (e.g., numeric or text), Likert scales (e.g., a 5- or 7-point scale specifying a respondent’s level of agreement to a statement), or ranking questions (e.g., a list of options that a respondent ranks by preference). It may randomize the display order of responses or include instructions that help respondents understand the questions. A survey may have one questionnaire or multiple, depending on its scale and scope.</p>
-<p>The questionnaire is another important resource for understanding and interpreting the survey data (see Section <a href="c02-overview-surveys.html#overview-design-questionnaire">2.4.3</a>), and we should use it alongside any analysis. It provides details about each of the questions asked in the survey, such as question name, question wording, response options, skip logic, randomizations, display specification, mode differences, and the universe (the subset of respondents that were asked a question).</p>
-<p>Below, in Figure <a href="c03-survey-data-documentation.html#fig:understand-que-examp">3.1</a>, we show an example from the ANES 2020 questionnaire <span class="citation">(<a href="#ref-anes-svy">American National Election Studies 2021</a>)</span>. The figure shows a question’s question name (<code>POSTVOTE_RVOTE</code>), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (this question was only asked if <code>vote_pre</code> = 0), and other specifications. The section also includes the variable name, which we can link to the codebook.</p>
+<p>A questionnaire is a series of questions used to collect information from people in a survey. It can ask about opinions, behaviors, demographics, or even just numbers like the count of lightbulbs, square footage, or farm size. Questionnaires can employ different types of questions, such as closed-ended (e.g., select one or check all that apply), open-ended (e.g., numeric or text), Likert scales (e.g., a 5- or 7-point scale specifying a respondent’s level of agreement to a statement), or ranking questions (e.g., a list of options that a respondent ranks by preference). It may randomize the display order of responses or include instructions that help respondents understand the questions. A survey may have one questionnaire or multiple, depending on its scale and scope.</p>
+<p>The questionnaire is another important resource for understanding and interpreting the survey data (see Section <a href="c02-overview-surveys.html#overview-design-questionnaire">2.4.3</a>), and we should use it alongside any analysis. It provides details about each of the questions asked in the survey, such as question name, question wording, response options, skip logic, randomizations, display specifications, mode differences, and the universe (the subset of respondents who were asked a question).</p>
+<p>In Figure <a href="c03-survey-data-documentation.html#fig:understand-que-examp">3.1</a>, we show an example from the ANES 2020 questionnaire <span class="citation">(<a href="#ref-anes-svy">American National Election Studies 2021</a>)</span>. The figure shows the question name (<code>POSTVOTE_RVOTE</code>), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (this question was only asked if <code>vote_pre</code> = 0), and other specifications. The section also includes the variable name, which we can link to the codebook.</p>
 <div class="figure"><span style="display:block;" id="fig:understand-que-examp"></span>
 <img src="images/questionnaire-example.jpg" alt="Question information about the variable postvote_rvote from ANES 2020 questionnaire Survey question, Universe, Logic, Web Spec, Response Order, and Released Variable are included."  />
 <p class="caption">
@@ -555,12 +555,12 @@ <h3><span class="header-section-number">3.2.2</span> Questionnaires<a href="c03-
 FIGURE 3.2: BRFSS 2021 Questionnaire Example
 </p>
 </div>
-<p>We should factor in the details of a survey when conducting our analyses. For example, surveys that use various modes (e.g., web and mail) may have differences in question wording or skip logic, as web surveys can include fills or automate skip logic. These variations could warrant separate analyses for each mode.</p>
+<p>We should factor in the details of a survey when conducting our analyses. For example, surveys that use various modes (e.g., web and mail) may have differences in question wording or skip logic, as web surveys can include fills or automate skip logic. If large enough, these variations could warrant separate analyses for each mode.</p>
 </div>
 <div id="codebooks" class="section level3 hasAnchor" number="3.2.3">
 <h3><span class="header-section-number">3.2.3</span> Codebooks<a href="c03-survey-data-documentation.html#codebooks" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>While a questionnaire provides information about the questions posed to respondents, the codebook explains how the survey data was coded and recorded. It lists details such as variable names, variable labels, variable meanings, codes for missing data, value labels, and value types (whether categorical or continuous, etc.). The codebook helps us understand and use the variables appropriately in our analysis. In particular, the codebook (as opposed to the questionnaire) often includes information on missing data. Note that the term <em>data dictionary</em> is sometimes used interchangeably with codebook, but a data dictionary may include more details on the structure and elements of the data.</p>
-<p>Figure <a href="c03-survey-data-documentation.html#fig:understand-codebook-examp">3.3</a> is a question from the ANES 2020 codebook <span class="citation">(<a href="#ref-anes-cb">American National Election Studies 2022</a>)</span>. This section indicates a particular variable’s name (<code>V202066</code>), question wording, value labels, universe, and associated survey question (<code>POSTVOTE_RVOTE</code>).</p>
+<p>While a questionnaire provides information about the questions posed to respondents, the codebook explains how the survey data were coded and recorded. It lists details such as variable names, variable labels, variable meanings, codes for missing data, value labels, and value types (whether categorical, continuous, etc.). The codebook helps us understand and use the variables appropriately in our analysis. In particular, the codebook (as opposed to the questionnaire) often includes information on missing data. Note that the term <em>data dictionary</em> is sometimes used interchangeably with codebook, but a data dictionary may include more details on the structure and elements of the data.</p>
+<p>Figure <a href="c03-survey-data-documentation.html#fig:understand-codebook-examp">3.3</a> is a question from the ANES 2020 codebook <span class="citation">(<a href="#ref-anes-cb">American National Election Studies 2022</a>)</span>. This section indicates a variable’s name (<code>V202066</code>), question wording, value labels, universe, and associated survey question (<code>POSTVOTE_RVOTE</code>).</p>
 <div class="figure"><span style="display:block;" id="fig:understand-codebook-examp"></span>
 <img src="images/codebook-example.jpg" alt="Variable information about the variable V202066 from ANES 2020 questionnaire Variable meaning, Value labels, Universe, and Survey Question(s) are included."  />
 <p class="caption">
@@ -576,7 +576,7 @@ <h3><span class="header-section-number">3.2.4</span> Errata<a href="c03-survey-d
 <li>Issuing a corrected data table after realizing a typo or mistake in a table cell</li>
 <li>Reporting incorrectly programmed skips in an electronic survey where questions are skipped by the respondent when they should not have been</li>
 </ul>
-<p>The 2004 ANES dataset released an erratum, notifying analysts to remove a specific row from the data file due to the inclusion of a respondent who should not have been part of the sample. Adhering to an issued erratum helps us increase the accuracy and reliability of analysis.</p>
+<p>For example, the 2004 ANES dataset released an erratum, notifying analysts to remove a specific row from the data file due to the inclusion of a respondent who should not have been part of the sample. Adhering to an issued erratum helps us increase the accuracy and reliability of analysis.</p>
 </div>
 <div id="additional-resources" class="section level3 hasAnchor" number="3.2.5">
 <h3><span class="header-section-number">3.2.5</span> Additional resources<a href="c03-survey-data-documentation.html#additional-resources" class="anchor-section" aria-label="Anchor link to header"></a></h3>
@@ -585,16 +585,16 @@ <h3><span class="header-section-number">3.2.5</span> Additional resources<a href
 </div>
 <div id="missing-data-coding" class="section level2 hasAnchor" number="3.3">
 <h2><span class="header-section-number">3.3</span> Missing data coding<a href="c03-survey-data-documentation.html#missing-data-coding" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>For some observations in a dataset, there may be missing data. This can be by design or from nonresponse, and these concepts are detailed in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>. In that chapter, we also discuss how to analyze data with missing data. In this section, we discuss how to understand documentation related to missing data.</p>
-<p>The survey documentation, often the codebook, represents the missing data with a code. The codebook may list different codes depending on why certain data is missing. In the example of variable <code>V202066</code> from the ANES (Figure <a href="c03-survey-data-documentation.html#fig:understand-codebook-examp">3.3</a>), <code>-9</code> represents “Refused,” <code>-7</code> means that the response was deleted due to an incomplete interview, <code>-6</code> means that there is no response because there was no follow-up interview, and <code>-1</code> means “Inapplicable” (due to the designed skip pattern).</p>
-<p>As another example, there may be a summary variable that describes the missingness of a set of variables - particularly with “select all that apply” or “multiple response” questions. In the National Crime Victimization Survey (NCVS), respondents who are victims of a crime and saw the offender are asked if the offender have a weapon and then asked what the type of weapon was. This part of the questionnaire from 2021 is shown in Figure <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-q">3.4</a>.</p>
+<p>Some observations in a dataset may have missing data. This can be due to design or nonresponse, and these concepts are detailed in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>. In that chapter, we also discuss how to analyze data with missing values. This chapter walks through how to understand documentation related to missing data.</p>
+<p>The survey documentation, often the codebook, represents the missing data with a code. The codebook may list different codes depending on why certain data points are missing. In the example of variable <code>V202066</code> from the ANES (Figure <a href="c03-survey-data-documentation.html#fig:understand-codebook-examp">3.3</a>), <code>-9</code> represents “Refused,” <code>-7</code> means that the response was deleted due to an incomplete interview, <code>-6</code> means that there is no response because there was no follow-up interview, and <code>-1</code> means “Inapplicable” (due to a designed skip pattern).</p>
+<p>As another example, there may be a summary variable that describes the missingness of a set of variables, particularly with “select all that apply” or “multiple response” questions. In the National Crime Victimization Survey (NCVS), respondents who are victims of a crime and saw the offender are asked if the offender had a weapon and then asked what type of weapon it was. This part of the questionnaire from 2021 is shown in Figure <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-q">3.4</a>.</p>
 <div class="figure"><span style="display:block;" id="fig:understand-ncvs-weapon-q"></span>
 <img src="images/questionnaire-ncvs-weapon.jpg" alt="Questions 22 and 23a from the NCVS 2020-2021 Crime Incident Report, see https://bjs.ojp.gov/content/pub/pdf/ncvs20_cir.pdf"  />
 <p class="caption">
 FIGURE 3.4: Excerpt from the NCVS 2020-2021 Crime Incident Report - Weapon Type
 </p>
 </div>
-<p>The NCVS codebook includes coding for all multiple response variables of a “lead in” variable that summarizes the individual options. For question 23a on the weapon type, the lead in variable is V4050 which is shown in <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-cb">3.5</a>. This variable is then followed by a set of variables for each weapon type. An example of one of the individual variables from the codebook, the handgun, is shown in <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-cb-hg">3.6</a>. We will dive in more to this example in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> of how to analyze this variable.</p>
+<p>For every set of multiple response variables, the NCVS codebook includes a “lead-in” variable that summarizes the individual options. For question 23a on the weapon type, the lead-in variable is V4050, which is shown in Figure <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-cb">3.5</a>. This variable is then followed by a set of variables for each weapon type. An example of one of the individual variables from the codebook, the handgun, is shown in Figure <a href="c03-survey-data-documentation.html#fig:understand-ncvs-weapon-cb-hg">3.6</a>. We will dive into how to analyze this variable in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.</p>
 <div class="figure"><span style="display:block;" id="fig:understand-ncvs-weapon-cb"></span>
 <img src="images/codebook-ncvs-weapon-li.jpg" alt="Codebook includes location of variable (files and columns), variable type (numeric), question (What was the weapon? Anything else?), and the coding of this lead in variable"  />
 <p class="caption">
@@ -607,19 +607,18 @@ <h2><span class="header-section-number">3.3</span> Missing data coding<a href="c
 FIGURE 3.6: Excerpt from the NCVS 2021 Codebook for V4051 - C WEAPON: HAND GUN
 </p>
 </div>
-<p>When data is read into R, some values may be system missing, that is they are coded as <code>NA</code> even if that is not evident in a codebook. We will discuss in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> how to analyze data with <code>NA</code> values and review how R handles missing data in calculations.</p>
+<p>When data are read into R, some values may be system missing; that is, they are coded as <code>NA</code> even if that is not evident in a codebook. We discuss in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> how to analyze data with <code>NA</code> values and review how R handles missing data in calculations.</p>
 </div>
 <div id="example-american-national-election-studies-anes-2020-survey-documentation" class="section level2 hasAnchor" number="3.4">
 <h2><span class="header-section-number">3.4</span> Example: American National Election Studies (ANES) 2020 survey documentation<a href="c03-survey-data-documentation.html#example-american-national-election-studies-anes-2020-survey-documentation" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Let’s look at the survey documentation for the American National Election Studies (ANES) 2020. The survey website is located at <a href="https://electionstudies.org/data-center/2020-time-series-study/">https://electionstudies.org/data-center/2020-time-series-study/</a>.</p>
-<p>Navigating to “User Guide and Codebook” <span class="citation">(<a href="#ref-anes-cb">American National Election Studies 2022</a>)</span>, we can download the PDF that contains the survey documentation, titled “ANES 2020 Time Series Study Full Release: User Guide and Codebook”. Do not be daunted by the 796-page PDF. We will focus on the most critical information.</p>
+<p>Let’s look at the survey documentation for the American National Election Studies (ANES) 2020, which is available on their <a href="https://electionstudies.org/data-center/2020-time-series-study/">website</a>. Navigating to “User Guide and Codebook” <span class="citation">(<a href="#ref-anes-cb">American National Election Studies 2022</a>)</span>, we can download the PDF that contains the survey documentation, titled “ANES 2020 Time Series Study Full Release: User Guide and Codebook”. Do not be daunted by the 796-page PDF. Below, we focus on the most critical information.</p>
 <div id="introduction-2" class="section level4 unnumbered hasAnchor">
 <h4>Introduction<a href="c03-survey-data-documentation.html#introduction-2" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The first section in the User Guide explains that the ANES 2020 Times Series Study continues a series of election surveys conducted since 1948. These surveys contain data on public opinion and voting behavior in the U.S. presidential elections. The introduction also includes information about the modes used for data collection (web, live video interviewing, or CATI). Additionally, there is a summary of the number of pre-election interviews (8,280) and post-election re-interviews (7,449).</p>
+<p>The first section in the User Guide explains that the ANES 2020 Time Series Study continues a series of election surveys conducted since 1948. These surveys contain data on public opinion and voting behavior in the U.S. presidential elections. The introduction also includes information about the modes used for data collection (web, live video interviewing, or CATI). Additionally, there is a summary of the number of pre-election interviews (8,280) and post-election re-interviews (7,449).</p>
 </div>
 <div id="sample-design-and-respondent-recruitment" class="section level4 unnumbered hasAnchor">
 <h4>Sample design and respondent recruitment<a href="c03-survey-data-documentation.html#sample-design-and-respondent-recruitment" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The section “Sample Design and Respondent Recruitment” provides more detail about the survey’s sequential mixed-mode design. All three modes were conducted one after another and not at the same time. Additionally, it indicates that for the 2020 survey, they resampled all respondents who participated in 2016 ANES, along with a newly-drawn cross-section:</p>
+<p>The section “Sample Design and Respondent Recruitment” provides more detail about the survey’s sequential mixed-mode design. All three modes were conducted one after another and not at the same time. Additionally, it indicates that for the 2020 survey, they resampled all respondents who participated in the 2016 ANES, along with a newly drawn cross-section:</p>
 <blockquote>
 <p>The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia.</p>
 </blockquote>
@@ -631,7 +630,7 @@ <h4>Data analysis, weights, and variance estimation<a href="c03-survey-data-docu
 <blockquote>
 <p>For analysis of the complete set of cases using pre-election data only, including all cases and representative of the 2020 electorate, use the full sample pre-election weight, <strong>V200010a</strong>. For analysis including post-election data for the complete set of participants (i.e., analysis of post-election data only or a combination of pre- and post-election data), use the full sample post-election weight, <strong>V200010b</strong>. Additional weights are provided for analysis of subsets of the data…</p>
 </blockquote>
-<p>The document provides more information about the variables, summarized in Table <a href="c03-survey-data-documentation.html#tab:aneswgts">3.1</a>.</p>
+<p>The document provides more information about the design variables, summarized in Table <a href="c03-survey-data-documentation.html#tab:aneswgts">3.1</a>.</p>
 <table>
 <caption><span id="tab:aneswgts">TABLE 3.1: </span> Weight and variance information for ANES</caption>
 <thead>
@@ -661,7 +660,7 @@ <h3>Methodology<a href="c03-survey-data-documentation.html#methodology" class="a
 <blockquote>
 <p>The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia.</p>
 </blockquote>
-<p>The documentation suggests that the population should equal around 231 million, but this is a very imprecise count. Upon further investigation in the available resources, we can find the methodology file titled “Methodology Report for the ANES 2020 Time Series Study” <span class="citation">(<a href="#ref-anes-2020-tech">DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah 2022</a>)</span>. This file states that we can use the population total from the Current Population Survey (CPS), a monthly survey sponsored by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics. The CPS provides a more accurate population estimate for a specific month. Therefore, we can use the CPS to get the total population number for March 2020, the time in which the ANES was conducted. Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> goes into detailed instructions on how to calculate and adjust this value in the data.</p>
+<p>The documentation suggests that the population should equal around 231 million, but this is a very imprecise count. Upon further investigation of the available resources, we can find the methodology file titled “Methodology Report for the ANES 2020 Time Series Study” <span class="citation">(<a href="#ref-anes-2020-tech">DeBell et al. 2022</a>)</span>. This file states that we can use the population total from the Current Population Survey (CPS), a monthly survey sponsored by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics. The CPS provides a more accurate population estimate for a specific month. Therefore, we can use the CPS to get the total population number for March 2020, when the ANES was conducted. Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> gives detailed instructions on how to calculate and adjust this value in the data.</p>
 
 </div>
 </div>
@@ -684,7 +683,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 DeBell, Matthew. 2010. <span>“How to Analyze ANES Survey Data.”</span> ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; <a href="https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf</a>.
 </div>
 <div id="ref-anes-2020-tech" class="csl-entry">
-DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah. 2022. <span>“<span class="nocase">Methodology Report for the ANES 2020 Time Series Study</span>.”</span> <a href="https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf</a>.
+DeBell, Matthew, Michelle Amsbary, Ted Brader, Shelley Brock, Cindy Good, Justin Kamens, Natalya Maisel, and Sarah Pinto. 2022. <span>“<span class="nocase">Methodology Report for the ANES 2020 Time Series Study</span>.”</span> <a href="https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf</a>.
 </div>
 </div>
             </section>
diff --git a/c04-getting-started.html b/c04-getting-started.html
index 0d2d7e78..d0938e50 100644
--- a/c04-getting-started.html
+++ b/c04-getting-started.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -521,13 +521,13 @@ <h1>
 <h1><span class="header-section-number">Chapter 4</span> Getting started<a href="c04-getting-started.html#c04-getting-started" class="anchor-section" aria-label="Anchor link to header"></a></h1>
 <div id="introduction-3" class="section level2 hasAnchor" number="4.1">
 <h2><span class="header-section-number">4.1</span> Introduction<a href="c04-getting-started.html#introduction-3" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>This chapter provides an overview of the packages, data, and design objects we use frequently throughout this book. As mentioned in Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a>, understanding how a survey was conducted helps us make sense of the results and interpret findings. Therefore, we provide background on the datasets used in examples and exercises. Next, we walk through how to create the survey design objects necessary to begin analysis. Finally, we provide an overview of the {srvyr} package and the steps needed for analysis. If you have questions or face issues while going through the book, please report them in the book’s <a href="https://github.com/tidy-survey-r/tidy-survey-book">GitHub repository</a>.</p>
+<p>This chapter provides an overview of the packages, data, and design objects we use frequently throughout this book. As mentioned in Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a>, understanding how a survey was conducted helps us make sense of the results and interpret findings. Therefore, we provide background on the datasets used in examples and exercises. Next, we walk through how to create the survey design objects necessary to begin an analysis. Finally, we provide an overview of the {srvyr} package and the steps needed for analysis. Please report any bugs or issues encountered while going through the book in the book’s <a href="https://github.com/tidy-survey-r/tidy-survey-book">GitHub repository</a>.</p>
 </div>
 <div id="setup" class="section level2 hasAnchor" number="4.2">
 <h2><span class="header-section-number">4.2</span> Setup<a href="c04-getting-started.html#setup" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The Setup section provides details on the required packages and data, as well as the steps for preparing survey design objects. For a streamlined learning experience, we recommend taking the time to walk through the code provided and making sure everything is properly set up.</p>
-<div id="packages" class="section level3 hasAnchor" number="4.2.1">
-<h3><span class="header-section-number">4.2.1</span> Packages<a href="c04-getting-started.html#packages" class="anchor-section" aria-label="Anchor link to header"></a></h3>
+<p>This section provides details on the required packages and data, as well as the steps for preparing survey design objects. For a streamlined learning experience, we recommend taking the time to walk through the code provided here and making sure everything is properly set up.</p>
+<div id="setup-load-pkgs" class="section level3 hasAnchor" number="4.2.1">
+<h3><span class="header-section-number">4.2.1</span> Packages<a href="c04-getting-started.html#setup-load-pkgs" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>We use several packages throughout the book, but let’s install and load specific ones for this chapter. Many functions in the examples and exercises are from three packages: {tidyverse}, {survey}, and {srvyr}. If they are not already installed, use the code below. The {tidyverse} and {survey} packages can both be installed from the Comprehensive R Archive Network (CRAN) <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>; <a href="#ref-tidyverse2019">Wickham et al. 2019</a>)</span>. We use the GitHub development version of {srvyr} because of its additional functionality compared to the one on CRAN <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>)</span>. Install the package directly from GitHub using the {remotes} package:</p>
 <div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb1-1"><a href="c04-getting-started.html#cb1-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="fu">c</span>(<span class="st">&quot;tidyverse&quot;</span>, <span class="st">&quot;survey&quot;</span>, <span class="st">&quot;remotes&quot;</span>))</span>
 <span id="cb1-2"><a href="c04-getting-started.html#cb1-2" tabindex="-1"></a>remotes<span class="sc">::</span><span class="fu">install_github</span>(<span class="st">&quot;gergness/srvyr&quot;</span>)</span></code></pre></div>
@@ -548,38 +548,38 @@ <h3><span class="header-section-number">4.2.1</span> Packages<a href="c04-gettin
 <div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="c04-getting-started.html#cb6-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="st">&quot;censusapi&quot;</span>)</span></code></pre></div>
 <p>After installing this package, load it using the <code>library()</code> function:</p>
 <div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb7-1"><a href="c04-getting-started.html#cb7-1" tabindex="-1"></a><span class="fu">library</span>(censusapi)</span></code></pre></div>
-<p>Note that the {censusapi} package requires a Census API key, available for free from the <a href="https://api.census.gov/data/key_signup.html">U.S. Census Bureau website</a> (refer to the package documentation for more information). We recommend storing the Census API key in our R environment instead of directly in the code. After obtaining the API key, save it in your R environment by running <code>Sys.setenv()</code>:</p>
+<p>Note that the {censusapi} package requires a Census API key, available for free from the <a href="https://api.census.gov/data/key_signup.html">U.S. Census Bureau website</a> (refer to the package documentation for more information). We recommend storing the Census API key in the R environment instead of directly in the code. To do this, run <code>Sys.setenv()</code> after obtaining the API key:</p>
 <div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb8-1"><a href="c04-getting-started.html#cb8-1" tabindex="-1"></a><span class="fu">Sys.setenv</span>(<span class="at">CENSUS_KEY=</span><span class="st">&quot;YOUR_API_KEY_HERE&quot;</span>)</span></code></pre></div>
 <p>Then, restart the R session. Once the Census API key is stored, we can retrieve it in our R code with <code>Sys.getenv("CENSUS_KEY")</code>.</p>
 <p>There are a few other packages used in the book in limited frequency. We list them in the Prerequisite boxes at the beginning of each chapter. As we work through the book, make sure to check the Prerequisite box and install any missing packages before proceeding.</p>
 </div>
 <div id="data" class="section level3 hasAnchor" number="4.2.2">
 <h3><span class="header-section-number">4.2.2</span> Data<a href="c04-getting-started.html#data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>As mentioned above, the {srvyrexploR} package contains the datasets used in the book. Once installed and loaded, explore the documentation using the <code>help()</code> function. Read the descriptions of the datasets to understand what they contain:</p>
+<p>The {srvyrexploR} package contains the datasets used in the book. Once installed and loaded, explore the documentation using the <code>help()</code> function. Read the descriptions of the datasets to understand what they contain:</p>
 <div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb9-1"><a href="c04-getting-started.html#cb9-1" tabindex="-1"></a><span class="fu">help</span>(<span class="at">package =</span> <span class="st">&quot;srvyrexploR&quot;</span>)</span></code></pre></div>
-<p>This book uses two main datasets: the American National Election Studies <span class="citation">(ANES – <a href="#ref-debell">DeBell 2010</a>)</span> and the Residential Energy Consumption Survey <span class="citation">(RECS – <a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span> which are included as <code>anes_2020</code> and <code>recs_2020</code>, respectively, in the {srvyrexploR} package.</p>
+<p>This book uses two main datasets: the American National Election Studies <span class="citation">(ANES – <a href="#ref-debell">DeBell 2010</a>)</span> and the Residential Energy Consumption Survey <span class="citation">(RECS – <a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span>, which are included as <code>anes_2020</code> and <code>recs_2020</code> in the {srvyrexploR} package, respectively.</p>
 <div id="american-national-election-studies-anes-data" class="section level4 unnumbered hasAnchor">
 <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.html#american-national-election-studies-anes-data" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The ANES is a study that collects data from election surveys dating back to 1948. These surveys contain information on public opinion and voting behavior in U.S. presidential elections and some midterm elections<a href="#fn3" class="footnote-ref" id="fnref3"><sup>3</sup></a>. They cover topics such as party affiliation, voting choice, and level of trust in the government. The 2020 survey, the data we use in the book, was fielded online, through live video interviews, or via computer-assisted telephone interviews (CATI).</p>
-<p>When working with new survey data, analysts should review the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) to understand the data collection methods. The original ANES data contains variables starting with <code>V20</code> <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span>, so to assist with our analysis throughout the book, we created descriptive variable names. For example, the respondent’s age is now in a variable called <code>Age</code>, and gender is in a variable called <code>Gender</code>. These descriptive variables are included in the {srvyrexploR} package, and Table <a href="c04-getting-started.html#tab:anes-view-tab">4.1</a> displays the list of these renamed variables. A complete overview of all variables can be found in Appendix <a href="anes-cb.html#anes-cb">B</a>.</p>
+<p>ANES is a study that collects data from election surveys dating back to 1948. These surveys contain information on public opinion and voting behavior in U.S. presidential elections and some midterm elections<a href="#fn3" class="footnote-ref" id="fnref3"><sup>3</sup></a>. They cover topics such as party affiliation, voting choice, and level of trust in the government. The 2020 survey (the data used in this book) was fielded online, through live video interviews, or via computer-assisted telephone interviews (CATI).</p>
+<p>When working with new survey data, we should review the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) to understand the data collection methods. The original ANES data contains variables starting with <code>V20</code> <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span>, so to assist with our analysis throughout the book, we created descriptive variable names. For example, the respondent’s age is now in a variable called <code>Age</code>, and gender is in a variable called <code>Gender</code>. These descriptive variables are included in the {srvyrexploR} package, and Table <a href="c04-getting-started.html#tab:anes-view-tab">4.1</a> displays the list of these renamed variables. A complete overview of all variables can be found in Appendix <a href="anes-cb.html#anes-cb">B</a>.</p>
 
-<div id="usgbmusvau" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#usgbmusvau table {
+<div id="dfizchpgkm" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#dfizchpgkm table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#usgbmusvau thead, #usgbmusvau tbody, #usgbmusvau tfoot, #usgbmusvau tr, #usgbmusvau td, #usgbmusvau th {
+#dfizchpgkm thead, #dfizchpgkm tbody, #dfizchpgkm tfoot, #dfizchpgkm tr, #dfizchpgkm td, #dfizchpgkm th {
   border-style: none;
 }
 
-#usgbmusvau p {
+#dfizchpgkm p {
   margin: 0;
   padding: 0;
 }
 
-#usgbmusvau .gt_table {
+#dfizchpgkm .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -605,12 +605,12 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-left-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_caption {
+#dfizchpgkm .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#usgbmusvau .gt_title {
+#dfizchpgkm .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -622,7 +622,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-bottom-width: 0;
 }
 
-#usgbmusvau .gt_subtitle {
+#dfizchpgkm .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -634,7 +634,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-top-width: 0;
 }
 
-#usgbmusvau .gt_heading {
+#dfizchpgkm .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -646,13 +646,13 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-right-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_bottom_border {
+#dfizchpgkm .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_col_headings {
+#dfizchpgkm .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -667,7 +667,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-right-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_col_heading {
+#dfizchpgkm .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -687,7 +687,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   overflow-x: hidden;
 }
 
-#usgbmusvau .gt_column_spanner_outer {
+#dfizchpgkm .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -699,15 +699,15 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 4px;
 }
 
-#usgbmusvau .gt_column_spanner_outer:first-child {
+#dfizchpgkm .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#usgbmusvau .gt_column_spanner_outer:last-child {
+#dfizchpgkm .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#usgbmusvau .gt_column_spanner {
+#dfizchpgkm .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -719,11 +719,11 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   width: 100%;
 }
 
-#usgbmusvau .gt_spanner_row {
+#dfizchpgkm .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#usgbmusvau .gt_group_heading {
+#dfizchpgkm .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -749,7 +749,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   text-align: left;
 }
 
-#usgbmusvau .gt_empty_group_heading {
+#dfizchpgkm .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -764,15 +764,15 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   vertical-align: middle;
 }
 
-#usgbmusvau .gt_from_md > :first-child {
+#dfizchpgkm .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#usgbmusvau .gt_from_md > :last-child {
+#dfizchpgkm .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#usgbmusvau .gt_row {
+#dfizchpgkm .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -791,7 +791,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   overflow-x: hidden;
 }
 
-#usgbmusvau .gt_stub {
+#dfizchpgkm .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -804,7 +804,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 5px;
 }
 
-#usgbmusvau .gt_stub_row_group {
+#dfizchpgkm .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -818,15 +818,15 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   vertical-align: top;
 }
 
-#usgbmusvau .gt_row_group_first td {
+#dfizchpgkm .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#usgbmusvau .gt_row_group_first th {
+#dfizchpgkm .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#usgbmusvau .gt_summary_row {
+#dfizchpgkm .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -836,16 +836,16 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 5px;
 }
 
-#usgbmusvau .gt_first_summary_row {
+#dfizchpgkm .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_first_summary_row.thick {
+#dfizchpgkm .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#usgbmusvau .gt_last_summary_row {
+#dfizchpgkm .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -855,7 +855,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-bottom-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_grand_summary_row {
+#dfizchpgkm .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -865,7 +865,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 5px;
 }
 
-#usgbmusvau .gt_first_grand_summary_row {
+#dfizchpgkm .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -875,7 +875,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-top-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_last_grand_summary_row_top {
+#dfizchpgkm .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -885,11 +885,11 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-bottom-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_striped {
+#dfizchpgkm .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#usgbmusvau .gt_table_body {
+#dfizchpgkm .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -898,7 +898,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-bottom-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_footnotes {
+#dfizchpgkm .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -912,7 +912,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-right-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_footnote {
+#dfizchpgkm .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -921,7 +921,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 5px;
 }
 
-#usgbmusvau .gt_sourcenotes {
+#dfizchpgkm .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -935,7 +935,7 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   border-right-color: #D3D3D3;
 }
 
-#usgbmusvau .gt_sourcenote {
+#dfizchpgkm .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -943,63 +943,63 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
   padding-right: 5px;
 }
 
-#usgbmusvau .gt_left {
+#dfizchpgkm .gt_left {
   text-align: left;
 }
 
-#usgbmusvau .gt_center {
+#dfizchpgkm .gt_center {
   text-align: center;
 }
 
-#usgbmusvau .gt_right {
+#dfizchpgkm .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#usgbmusvau .gt_font_normal {
+#dfizchpgkm .gt_font_normal {
   font-weight: normal;
 }
 
-#usgbmusvau .gt_font_bold {
+#dfizchpgkm .gt_font_bold {
   font-weight: bold;
 }
 
-#usgbmusvau .gt_font_italic {
+#dfizchpgkm .gt_font_italic {
   font-style: italic;
 }
 
-#usgbmusvau .gt_super {
+#dfizchpgkm .gt_super {
   font-size: 65%;
 }
 
-#usgbmusvau .gt_footnote_marks {
+#dfizchpgkm .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#usgbmusvau .gt_asterisk {
+#dfizchpgkm .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#usgbmusvau .gt_indent_1 {
+#dfizchpgkm .gt_indent_1 {
   text-indent: 5px;
 }
 
-#usgbmusvau .gt_indent_2 {
+#dfizchpgkm .gt_indent_2 {
   text-indent: 10px;
 }
 
-#usgbmusvau .gt_indent_3 {
+#dfizchpgkm .gt_indent_3 {
   text-indent: 15px;
 }
 
-#usgbmusvau .gt_indent_4 {
+#dfizchpgkm .gt_indent_4 {
   text-indent: 20px;
 }
 
-#usgbmusvau .gt_indent_5 {
+#dfizchpgkm .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1072,26 +1072,26 @@ <h4>American National Election Studies (ANES) Data<a href="c04-getting-started.h
 </div>
 <div id="residential-energy-consumption-survey-recs-data" class="section level4 unnumbered hasAnchor">
 <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-started.html#residential-energy-consumption-survey-recs-data" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>RECS is a study that measures energy consumption and expenditure in American households. Funded by the Energy Information Administration, the RECS data are collected through interviews with household members and energy suppliers. These interviews take place in person, over the phone, via mail, and on the web with modes changing over time. The survey has been fielded 14 times between 1950 and 2020. It includes questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, energy bills, respondent demographics, and energy assistance.</p>
-<p>As mentioned above, analysts should read the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) to understand how the data was collected and implemented. Table <a href="c04-getting-started.html#tab:recs-view-tab">4.2</a> displays the list of variables in the RECS data (not including the weights, which start with <code>NWEIGHT</code> and will be described in more detail in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>). An overview of all variables can be found in Appendix <a href="recs-cb.html#recs-cb">C</a>.</p>
+<p>RECS is a study that measures energy consumption and expenditure in American households. Funded by the Energy Information Administration, RECS data are collected through interviews with household members and energy suppliers. These interviews take place in person, over the phone, via mail, and on the web, with modes changing over time. The survey has been fielded 14 times between 1950 and 2020. It includes questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, energy bills, respondent demographics, and energy assistance.</p>
+<p>We should read the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) to understand how the data were collected and implemented. Table <a href="c04-getting-started.html#tab:recs-view-tab">4.2</a> displays the list of variables in the RECS data (not including the weights, which start with <code>NWEIGHT</code> and are described in more detail in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>). An overview of all variables can be found in Appendix <a href="recs-cb.html#recs-cb">C</a>.</p>
 
-<div id="qqdbxxsqdu" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#qqdbxxsqdu table {
+<div id="zgtugjzmca" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#zgtugjzmca table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#qqdbxxsqdu thead, #qqdbxxsqdu tbody, #qqdbxxsqdu tfoot, #qqdbxxsqdu tr, #qqdbxxsqdu td, #qqdbxxsqdu th {
+#zgtugjzmca thead, #zgtugjzmca tbody, #zgtugjzmca tfoot, #zgtugjzmca tr, #zgtugjzmca td, #zgtugjzmca th {
   border-style: none;
 }
 
-#qqdbxxsqdu p {
+#zgtugjzmca p {
   margin: 0;
   padding: 0;
 }
 
-#qqdbxxsqdu .gt_table {
+#zgtugjzmca .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1117,12 +1117,12 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-left-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_caption {
+#zgtugjzmca .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#qqdbxxsqdu .gt_title {
+#zgtugjzmca .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1134,7 +1134,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-bottom-width: 0;
 }
 
-#qqdbxxsqdu .gt_subtitle {
+#zgtugjzmca .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1146,7 +1146,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-top-width: 0;
 }
 
-#qqdbxxsqdu .gt_heading {
+#zgtugjzmca .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1158,13 +1158,13 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-right-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_bottom_border {
+#zgtugjzmca .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_col_headings {
+#zgtugjzmca .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1179,7 +1179,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-right-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_col_heading {
+#zgtugjzmca .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1199,7 +1199,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   overflow-x: hidden;
 }
 
-#qqdbxxsqdu .gt_column_spanner_outer {
+#zgtugjzmca .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1211,15 +1211,15 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 4px;
 }
 
-#qqdbxxsqdu .gt_column_spanner_outer:first-child {
+#zgtugjzmca .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#qqdbxxsqdu .gt_column_spanner_outer:last-child {
+#zgtugjzmca .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#qqdbxxsqdu .gt_column_spanner {
+#zgtugjzmca .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1231,11 +1231,11 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   width: 100%;
 }
 
-#qqdbxxsqdu .gt_spanner_row {
+#zgtugjzmca .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#qqdbxxsqdu .gt_group_heading {
+#zgtugjzmca .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1261,7 +1261,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   text-align: left;
 }
 
-#qqdbxxsqdu .gt_empty_group_heading {
+#zgtugjzmca .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1276,15 +1276,15 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   vertical-align: middle;
 }
 
-#qqdbxxsqdu .gt_from_md > :first-child {
+#zgtugjzmca .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#qqdbxxsqdu .gt_from_md > :last-child {
+#zgtugjzmca .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#qqdbxxsqdu .gt_row {
+#zgtugjzmca .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1303,7 +1303,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   overflow-x: hidden;
 }
 
-#qqdbxxsqdu .gt_stub {
+#zgtugjzmca .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1316,7 +1316,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 5px;
 }
 
-#qqdbxxsqdu .gt_stub_row_group {
+#zgtugjzmca .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1330,15 +1330,15 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   vertical-align: top;
 }
 
-#qqdbxxsqdu .gt_row_group_first td {
+#zgtugjzmca .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#qqdbxxsqdu .gt_row_group_first th {
+#zgtugjzmca .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#qqdbxxsqdu .gt_summary_row {
+#zgtugjzmca .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1348,16 +1348,16 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 5px;
 }
 
-#qqdbxxsqdu .gt_first_summary_row {
+#zgtugjzmca .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_first_summary_row.thick {
+#zgtugjzmca .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#qqdbxxsqdu .gt_last_summary_row {
+#zgtugjzmca .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1367,7 +1367,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-bottom-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_grand_summary_row {
+#zgtugjzmca .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1377,7 +1377,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 5px;
 }
 
-#qqdbxxsqdu .gt_first_grand_summary_row {
+#zgtugjzmca .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1387,7 +1387,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-top-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_last_grand_summary_row_top {
+#zgtugjzmca .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1397,11 +1397,11 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-bottom-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_striped {
+#zgtugjzmca .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#qqdbxxsqdu .gt_table_body {
+#zgtugjzmca .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1410,7 +1410,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-bottom-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_footnotes {
+#zgtugjzmca .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1424,7 +1424,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-right-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_footnote {
+#zgtugjzmca .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1433,7 +1433,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 5px;
 }
 
-#qqdbxxsqdu .gt_sourcenotes {
+#zgtugjzmca .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1447,7 +1447,7 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   border-right-color: #D3D3D3;
 }
 
-#qqdbxxsqdu .gt_sourcenote {
+#zgtugjzmca .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1455,63 +1455,63 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
   padding-right: 5px;
 }
 
-#qqdbxxsqdu .gt_left {
+#zgtugjzmca .gt_left {
   text-align: left;
 }
 
-#qqdbxxsqdu .gt_center {
+#zgtugjzmca .gt_center {
   text-align: center;
 }
 
-#qqdbxxsqdu .gt_right {
+#zgtugjzmca .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#qqdbxxsqdu .gt_font_normal {
+#zgtugjzmca .gt_font_normal {
   font-weight: normal;
 }
 
-#qqdbxxsqdu .gt_font_bold {
+#zgtugjzmca .gt_font_bold {
   font-weight: bold;
 }
 
-#qqdbxxsqdu .gt_font_italic {
+#zgtugjzmca .gt_font_italic {
   font-style: italic;
 }
 
-#qqdbxxsqdu .gt_super {
+#zgtugjzmca .gt_super {
   font-size: 65%;
 }
 
-#qqdbxxsqdu .gt_footnote_marks {
+#zgtugjzmca .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#qqdbxxsqdu .gt_asterisk {
+#zgtugjzmca .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#qqdbxxsqdu .gt_indent_1 {
+#zgtugjzmca .gt_indent_1 {
   text-indent: 5px;
 }
 
-#qqdbxxsqdu .gt_indent_2 {
+#zgtugjzmca .gt_indent_2 {
   text-indent: 10px;
 }
 
-#qqdbxxsqdu .gt_indent_3 {
+#zgtugjzmca .gt_indent_3 {
   text-indent: 15px;
 }
 
-#qqdbxxsqdu .gt_indent_4 {
+#zgtugjzmca .gt_indent_4 {
   text-indent: 20px;
 }
 
-#qqdbxxsqdu .gt_indent_5 {
+#zgtugjzmca .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1652,17 +1652,17 @@ <h4>Residential Energy Consumption Survey (RECS) Data<a href="c04-getting-starte
 ## $ ZBTUWOOD          &lt;fct&gt; Not applicable, Not applicable, Not applicab…
 ## $ TOTALBTU          &lt;dbl&gt; 144648, 28035, 30750, 86765, 59127, 85401, 1…
 ## $ TOTALDOL          &lt;dbl&gt; 2656.9, 975.0, 522.6, 2061.8, 1463.0, 2335.1…</code></pre>
-<p>From the output, we can see that there are 18,496 rows and 57 non-weight variables in the RECS data. This output also indicates that most of the variables are in double (numeric) format (e.g., <code>TOTSQFT_EN</code>), with some factor (e.g., <code>Region</code>), Boolean (e.g., <code>ACUsed</code>), character (e.g., <code>REGIONC</code>), and ordinal (e.g., <code>YearMade</code>) variables.</p>
+<p>From the output, we can see that the RECS data have 18,496 rows and 57 non-weight variables. This output also indicates that most of the variables are in double (numeric) format (e.g., <code>TOTSQFT_EN</code>), with some factor (e.g., <code>Region</code>), Boolean (e.g., <code>ACUsed</code>), character (e.g., <code>REGIONC</code>), and ordinal (e.g., <code>YearMade</code>) variables.</p>
 </div>
 </div>
 <div id="setup-des-obj" class="section level3 hasAnchor" number="4.2.3">
 <h3><span class="header-section-number">4.2.3</span> Design objects<a href="c04-getting-started.html#setup-des-obj" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>The design object is the backbone for survey analysis. It is where we specify the sampling design, weights, and other necessary information to ensure we account for errors in the data. Before creating the design object, analysts should carefully review the survey documentation to understand how to create the design object for accurate analysis.</p>
-<p>In this chapter, we provide details on how to code the design object for the ANES and RECS data used in the book. However, we only provide a high-level overview to get readers started. For a deeper understanding of creating these design objects for a variety of sampling designs, see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
-<p>While we recommend conducting exploratory data analysis on the original data before diving into complex survey analysis (see Chapter <a href="c12-recommendations.html#c12-recommendations">12</a>), the actual analysis and inference should be performed with the survey design objects instead of the original survey data. For example, the ANES data is called <code>anes_2020</code>. If we create a survey design object called <code>anes_des</code>, our analyses should begin with <code>anes_des</code> and not <code>anes_2020</code>. Using the survey design object ensures that our calculations are appropriately accounting for the details of the survey design.</p>
+<p>The design object is the backbone for survey analysis. It is where we specify the sampling design, weights, and other necessary information so that our estimates properly account for sampling error. Before creating the design object, we should carefully review the survey documentation to understand how to specify it for accurate analysis.</p>
+<p>In this section, we provide details on how to code the design object for the ANES and RECS data used in the book. However, we only provide a high-level overview to get readers started. For a deeper understanding of creating design objects for a variety of sampling designs, see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
+<p>While we recommend conducting exploratory data analysis on the original data before diving into complex survey analysis (see Chapter <a href="c12-recommendations.html#c12-recommendations">12</a>), the actual survey analysis and inference should be performed with the survey design objects instead of the original survey data. For example, the ANES data is called <code>anes_2020</code>. If we create a survey design object called <code>anes_des</code>, our survey analyses should begin with <code>anes_des</code> and not <code>anes_2020</code>. Using the survey design object ensures that our calculations appropriately account for the details of the survey design.</p>
 <div id="american-national-election-studies-anes-design-object" class="section level4 unnumbered hasAnchor">
 <h4>American National Election Studies (ANES) Design Object<a href="c04-getting-started.html#american-national-election-studies-anes-design-object" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The ANES documentation <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span> details the sampling and weighting implications for analyzing the survey data. From this documentation and as noted in Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>, the 2020 ANES data is weighted to the sample, not the population. To make generalizations about the population, we need to weigh the data against the full population count. The ANES methodology recommends using the Current Population Survey (CPS) to determine the number of non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or D.C. in March of 2020.</p>
+<p>The ANES documentation <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span> details the sampling and weighting implications for analyzing the survey data. From this documentation and as noted in Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>, the 2020 ANES data are weighted to the sample, not the population. To make generalizations about the population, we need to weight the data against the full population count. The ANES methodology recommends using the Current Population Survey (CPS) to determine the number of non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or D.C. in March 2020.</p>
 <p>We can use the {censusapi} package to obtain the information needed for the survey design object. The <code>getCensus()</code> function allows us to retrieve the CPS data for March (<code>cps/basic/mar</code>) in 2020 (<code>vintage = 2020</code>). Additionally, we extract several variables from the CPS:</p>
 <ul>
 <li>month (<code>HRMONTH</code>) and year (<code>HRYEAR4</code>) of the interview: to confirm the correct time period</li>
@@ -1683,7 +1683,7 @@ <h4>American National Election Studies (ANES) Design Object<a href="c04-getting-
 <span id="cb14-10"><a href="c04-getting-started.html#cb14-10" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="fu">across</span>(<span class="at">.cols =</span> <span class="fu">everything</span>(),</span>
 <span id="cb14-11"><a href="c04-getting-started.html#cb14-11" tabindex="-1"></a>                <span class="at">.fns =</span> as.numeric))</span></code></pre></div>
 <p>In the code above, we include <code>region = "state"</code>. The default region type for the CPS data is at the state level. While not required, including the region can be helpful for understanding the geographical context of the data.</p>
-<p>In <code>getCensus()</code>, we filtered the dataset by specifying the month (<code>HRMONTH == 3</code>) and year (<code>HRYEAR4 == 2020</code>) of our request. Therefore, we expect that all interviews within our output were conducted during that particular month and year. We can confirm that the data is from March 2020 by running the code below:</p>
+<p>In <code>getCensus()</code>, we filtered the dataset by specifying the month (<code>HRMONTH == 3</code>) and year (<code>HRYEAR4 == 2020</code>) of our request. Therefore, we expect that all interviews within our output were conducted during that particular month and year. We can confirm that the data are from March 2020 by running the code below:</p>
 <div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb15-1"><a href="c04-getting-started.html#cb15-1" tabindex="-1"></a>cps_state <span class="sc">%&gt;%</span></span>
 <span id="cb15-2"><a href="c04-getting-started.html#cb15-2" tabindex="-1"></a>  <span class="fu">distinct</span>(HRMONTH, HRYEAR4)</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
@@ -1697,20 +1697,21 @@ <h4>American National Election Studies (ANES) Design Object<a href="c04-getting-
 <p>To calculate the U.S. population from the filtered data, we sum the person weights (<code>PWSSWGT</code>):</p>
 <div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb18-1"><a href="c04-getting-started.html#cb18-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> cps_narrow_resp <span class="sc">%&gt;%</span></span>
 <span id="cb18-2"><a href="c04-getting-started.html#cb18-2" tabindex="-1"></a>  <span class="fu">pull</span>(PWSSWGT) <span class="sc">%&gt;%</span></span>
-<span id="cb18-3"><a href="c04-getting-started.html#cb18-3" tabindex="-1"></a>  <span class="fu">sum</span>()</span></code></pre></div>
-<div class="sourceCode" id="cb19"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb19-1"><a href="c04-getting-started.html#cb19-1" tabindex="-1"></a>scales<span class="sc">::</span><span class="fu">comma</span>(targetpop)</span></code></pre></div>
+<span id="cb18-3"><a href="c04-getting-started.html#cb18-3" tabindex="-1"></a>  <span class="fu">sum</span>()</span>
+<span id="cb18-4"><a href="c04-getting-started.html#cb18-4" tabindex="-1"></a></span>
+<span id="cb18-5"><a href="c04-getting-started.html#cb18-5" tabindex="-1"></a>scales<span class="sc">::</span><span class="fu">comma</span>(targetpop)</span></code></pre></div>
 <pre><code>## [1] &quot;231,034,125&quot;</code></pre>
-<p>The target population in 2020 is 231,034,125. This result gives us what we need to create the survey design object for estimating population statistics. Using the <code>anes_2020</code> data, we adjust the weighting variable (<code>V200010b</code>) using the target population we just calculated (<code>targetpop</code>). We determine the proportion of the total weight for each individual weight (<code>V200010b / sum(V200010b)</code>) and then multiply that proportion by the calculated target population.</p>
-<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb21-1"><a href="c04-getting-started.html#cb21-1" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb21-2"><a href="c04-getting-started.html#cb21-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> V200010b <span class="sc">/</span> <span class="fu">sum</span>(V200010b) <span class="sc">*</span> targetpop) </span></code></pre></div>
+<p>The population of interest in 2020 is 231,034,125. This result gives us what we need to create the survey design object for estimating population statistics. Using the <code>anes_2020</code> data, we adjust the weighting variable (<code>V200010b</code>) using the population of interest we just calculated (<code>targetpop</code>). We determine the proportion of the total weight for each individual weight (<code>V200010b / sum(V200010b)</code>) and then multiply that proportion by the calculated population of interest.</p>
+<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb20-1"><a href="c04-getting-started.html#cb20-1" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb20-2"><a href="c04-getting-started.html#cb20-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> V200010b <span class="sc">/</span> <span class="fu">sum</span>(V200010b) <span class="sc">*</span> targetpop) </span></code></pre></div>
 <p>Once we have the adjusted weights, we can refer to the rest of the documentation to create the survey design. The documentation indicates that the study uses a stratified cluster sampling design. Therefore, we need to specify variables for <code>strata</code> and <code>ids</code> (cluster) and fill in the <code>nest</code> argument. The documentation provides guidance on which strata and cluster variables to use depending on whether we are analyzing pre- or post-election data. In this book, we analyze post-election data, so we need to use the post-election weight <code>V200010b</code>, strata variable <code>V200010d</code>, and PSU/cluster variable <code>V200010c</code>. Additionally, we set <code>nest=TRUE</code> to ensure the clusters are nested within the strata.</p>
-<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb22-1"><a href="c04-getting-started.html#cb22-1" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
-<span id="cb22-2"><a href="c04-getting-started.html#cb22-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">weights =</span> Weight,</span>
-<span id="cb22-3"><a href="c04-getting-started.html#cb22-3" tabindex="-1"></a>                   <span class="at">strata =</span> V200010d,</span>
-<span id="cb22-4"><a href="c04-getting-started.html#cb22-4" tabindex="-1"></a>                   <span class="at">ids =</span> V200010c,</span>
-<span id="cb22-5"><a href="c04-getting-started.html#cb22-5" tabindex="-1"></a>                   <span class="at">nest =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb22-6"><a href="c04-getting-started.html#cb22-6" tabindex="-1"></a></span>
-<span id="cb22-7"><a href="c04-getting-started.html#cb22-7" tabindex="-1"></a>anes_des</span></code></pre></div>
+<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb21-1"><a href="c04-getting-started.html#cb21-1" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
+<span id="cb21-2"><a href="c04-getting-started.html#cb21-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">weights =</span> Weight,</span>
+<span id="cb21-3"><a href="c04-getting-started.html#cb21-3" tabindex="-1"></a>                   <span class="at">strata =</span> V200010d,</span>
+<span id="cb21-4"><a href="c04-getting-started.html#cb21-4" tabindex="-1"></a>                   <span class="at">ids =</span> V200010c,</span>
+<span id="cb21-5"><a href="c04-getting-started.html#cb21-5" tabindex="-1"></a>                   <span class="at">nest =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb21-6"><a href="c04-getting-started.html#cb21-6" tabindex="-1"></a></span>
+<span id="cb21-7"><a href="c04-getting-started.html#cb21-7" tabindex="-1"></a>anes_des</span></code></pre></div>
 <pre><code>## Stratified 1 - level Cluster Sampling design (with replacement)
 ## With (101) clusters.
 ## Called via srvyr
@@ -1738,22 +1739,22 @@ <h4>American National Election Studies (ANES) Design Object<a href="c04-getting-
 ##     (fct), Income7 (fct), V202051 (dbl+lbl), V202066 (dbl+lbl), V202072
 ##     (dbl+lbl), VotedPres2020 (fct), V202073 (dbl+lbl), V202109x
 ##     (dbl+lbl), V202110x (dbl+lbl), VotedPres2020_selection (fct)</code></pre>
-<p>We can examine this new object to learn more about the survey design, such that the ANES is a “Stratified 1 - level Cluster Sampling design (with replacement) With (101) clusters”. Additionally, the output displays the sampling variables and then lists the remaining variables in the dataset. This design object will be used throughout this book to conduct survey analysis.</p>
+<p>We can examine this new object to learn more about the survey design. For example, the output shows that the ANES is a “Stratified 1 - level Cluster Sampling design (with replacement) With (101) clusters”. Additionally, the output displays the sampling variables and then lists the remaining variables in the dataset. This design object is used throughout this book to conduct survey analysis.</p>
 </div>
 <div id="residential-energy-consumption-survey-recs-design-object" class="section level4 unnumbered hasAnchor">
 <h4>Residential Energy Consumption Survey (RECS) Design Object<a href="c04-getting-started.html#residential-energy-consumption-survey-recs-design-object" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The RECS documentation <span class="citation">(<a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span> provides information on the survey’s sampling and weighting implications for analysis. The documentation shows the 2020 RECS uses Jackknife weights, where the main analytic weight is <code>NWEIGHT</code>, and the Jackknife weights are <code>NWEIGHT1</code>-<code>NWEIGHT60</code>. We can specify these in the weights and repweights arguments in the survey design object code, respectively.</p>
-<p>With Jackknife weights, additional information is required: <code>type</code>, <code>scale</code>, and <code>mse</code>. Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> goes into depth about each of these arguments, but to quickly get started, the documentation lets us know that <code>type=JK1</code>, <code>scale=59/60</code>, and <code>mse = TRUE</code>. We can use the following code to create the survey design object:</p>
-<div class="sourceCode" id="cb24"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb24-1"><a href="c04-getting-started.html#cb24-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb24-2"><a href="c04-getting-started.html#cb24-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb24-3"><a href="c04-getting-started.html#cb24-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb24-4"><a href="c04-getting-started.html#cb24-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb24-5"><a href="c04-getting-started.html#cb24-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb24-6"><a href="c04-getting-started.html#cb24-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span> <span class="sc">/</span> <span class="dv">60</span>,</span>
-<span id="cb24-7"><a href="c04-getting-started.html#cb24-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
-<span id="cb24-8"><a href="c04-getting-started.html#cb24-8" tabindex="-1"></a>  )</span>
-<span id="cb24-9"><a href="c04-getting-started.html#cb24-9" tabindex="-1"></a></span>
-<span id="cb24-10"><a href="c04-getting-started.html#cb24-10" tabindex="-1"></a>recs_des</span></code></pre></div>
+<p>The RECS documentation <span class="citation">(<a href="#ref-recs-2020-tech">U.S. Energy Information Administration 2023b</a>)</span> provides information on the survey’s sampling and weighting implications for analysis. The documentation shows the 2020 RECS uses Jackknife weights, where the main analytic weight is <code>NWEIGHT</code>, and the Jackknife weights are <code>NWEIGHT1</code>-<code>NWEIGHT60</code>. We can specify these in the <code>weights</code> and <code>repweights</code> arguments in the survey design object code, respectively.</p>
+<p>With Jackknife weights, additional information is required: <code>type</code>, <code>scale</code>, and <code>mse</code>. Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> goes into depth about each of these arguments, but to quickly get started, the RECS documentation lets us know that <code>type=JK1</code>, <code>scale=59/60</code>, and <code>mse = TRUE</code>. We can use the following code to create the survey design object:</p>
+<div class="sourceCode" id="cb23"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb23-1"><a href="c04-getting-started.html#cb23-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb23-2"><a href="c04-getting-started.html#cb23-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb23-3"><a href="c04-getting-started.html#cb23-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb23-4"><a href="c04-getting-started.html#cb23-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb23-5"><a href="c04-getting-started.html#cb23-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb23-6"><a href="c04-getting-started.html#cb23-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span> <span class="sc">/</span> <span class="dv">60</span>,</span>
+<span id="cb23-7"><a href="c04-getting-started.html#cb23-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
+<span id="cb23-8"><a href="c04-getting-started.html#cb23-8" tabindex="-1"></a>  )</span>
+<span id="cb23-9"><a href="c04-getting-started.html#cb23-9" tabindex="-1"></a></span>
+<span id="cb23-10"><a href="c04-getting-started.html#cb23-10" tabindex="-1"></a>recs_des</span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances.
 ## Sampling variables:
@@ -1803,27 +1804,27 @@ <h4>Residential Energy Consumption Survey (RECS) Design Object<a href="c04-getti
 ##     DOLLARNG (dbl), ZBTUNG (fct), BTULP (dbl), DOLLARLP (dbl), ZBTULP
 ##     (fct), BTUFO (dbl), DOLLARFO (dbl), ZBTUFO (fct), BTUWOOD (dbl),
 ##     ZBTUWOOD (fct), TOTALBTU (dbl), TOTALDOL (dbl)</code></pre>
-<p>Viewing this new object provides information about the survey design, such that the RECS is an “unstratified cluster jacknife (JK1) with 60 replicates and MSE variances”. Additionally, the output shows the sampling variables (<code>NWEIGHT1</code>-<code>NWEIGHT60</code>) and then lists the remaining variables in the dataset. This design object will be used throughout this book to conduct survey analysis.</p>
+<p>Viewing this new object provides information about the survey design, such that RECS is an “Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances”. Additionally, the output shows the sampling variables (<code>NWEIGHT1</code>-<code>NWEIGHT60</code>) and then lists the remaining variables in the dataset. This design object is used throughout this book to conduct survey analysis.</p>
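+<p>To give some intuition for the <code>scale</code> and <code>mse</code> arguments, the sketch below computes a JK1-style replicate variance by hand. It assumes the standard JK1 form, in which the squared deviations of the replicate estimates are summed and multiplied by (R - 1)/R, which is 59/60 for the 60 RECS replicates. It centers the deviations on the full-sample estimate, which is what <code>mse = TRUE</code> requests. The function name <code>jk1_variance()</code> and the input values are hypothetical and for illustration only; in practice, {srvyr} does this calculation for us.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper: JK1 replicate variance centered on the full-sample estimate
+jk1_variance &lt;- function(theta_full, theta_reps) {
+  n_reps &lt;- length(theta_reps)        # 60 for the RECS replicate weights
+  scale &lt;- (n_reps - 1) / n_reps      # 59 / 60 for the RECS
+  scale * sum((theta_reps - theta_full)^2)
+}
+
+# Made-up estimates for illustration, not RECS results
+jk1_variance(theta_full = 10, theta_reps = c(9, 11))</code></pre></div>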
 </div>
 </div>
 </div>
 <div id="survey-analysis-process" class="section level2 hasAnchor" number="4.3">
 <h2><span class="header-section-number">4.3</span> Survey analysis process<a href="c04-getting-started.html#survey-analysis-process" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The section above walked through the installation and loading of several packages, introduced the survey data available in the {srvyrexploR} package, and provided context on preparing survey design objects for the ANES and RECS data. Once the survey design objects are created, there is a general process for analyzing data to create estimates with {srvyr} package:</p>
+<p>There is a general process for analyzing data to create estimates with the {srvyr} package:</p>
 <ol style="list-style-type: decimal">
 <li><p>Create a <code>tbl_svy</code> object (a survey object) using: <code>as_survey_design()</code> or <code>as_survey_rep()</code></p></li>
 <li><p>Subset data (if needed) using <code>filter()</code> (to create subpopulations)</p></li>
 <li><p>Specify domains of analysis using <code>group_by()</code></p></li>
 <li><p>Within <code>summarize()</code>, specify variables to calculate, including means, totals, proportions, quantiles, and more</p></li>
 </ol>
-<p>In Section <a href="c04-getting-started.html#setup-des-obj">4.2.3</a>, we follow Step #1 to create the survey design objects for the ANES and RECS data featured in this book. Additional details on how to create design objects can be found in <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>. Then, once we have the design object, we can then filter the data to any subpopulation of interest (if needed). It is important to filter the data <strong>after</strong> creating the design object. This ensures that we are accurately accounting for the survey design in our calculations. Finally, we can use <code>group_by()</code>, <code>summarize()</code>, and other functions from the {survey} and {srvyr} packages to analyze the survey data by estimating means, totals, and so on.</p>
+<p>In Section <a href="c04-getting-started.html#setup-des-obj">4.2.3</a>, we follow Step #1 to create the survey design objects for the ANES and RECS data featured in this book. Additional details on how to create design objects can be found in Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>. Then, once we have the design object, we can filter the data to any subpopulation of interest (if needed). It is important to filter the data <strong>after</strong> creating the design object. This ensures that we are accurately accounting for the survey design in our calculations. Finally, we can use <code>group_by()</code>, <code>summarize()</code>, and other functions from the {survey} and {srvyr} packages to analyze the survey data by estimating means, totals, and so on.</p>
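+<p>The four steps above can be sketched in a single pipeline. This is a minimal illustration rather than an analysis from this book: it uses the <code>apistrat</code> data from the {survey} package, and the <code>api99 &gt; 600</code> filter and the name <code>step_result</code> are arbitrary choices made only for this example. Note that the filter is applied after the design object is created.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
+library(srvyr)
+
+data(api, package = &quot;survey&quot;)
+
+step_result &lt;- apistrat %&gt;%
+  # Step 1: create the tbl_svy object
+  as_survey_design(strata = stype, weights = pw) %&gt;%
+  # Step 2: subset to a subpopulation (arbitrary cutoff for illustration)
+  filter(api99 &gt; 600) %&gt;%
+  # Step 3: specify the domains of analysis
+  group_by(stype) %&gt;%
+  # Step 4: calculate the estimate within summarize()
+  summarize(api00_mean = survey_mean(api00))
+
+step_result</code></pre></div>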
 </div>
 <div id="similarities-dplyr-srvyr" class="section level2 hasAnchor" number="4.4">
 <h2><span class="header-section-number">4.4</span> Similarities between {dplyr} and {srvyr} functions<a href="c04-getting-started.html#similarities-dplyr-srvyr" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>The {dplyr} package from the tidyverse offers flexible and intuitive functions for data wrangling <span class="citation">(<a href="#ref-R-dplyr">Wickham et al. 2023</a>)</span>. One of the major advantages of using {srvyr} is that it applies {dplyr}-like syntax to the {survey} package <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>)</span>. We can use pipes, such as <code>%&gt;%</code> from the {magrittr} package, to specify a survey design object, apply a function, and then feed that output into the next function’s first argument <span class="citation">(<a href="#ref-R-magrittr">Bache and Wickham 2022</a>)</span>. Functions follow the ‘tidy’ convention of snake_case function names.</p>
 <p>To help explain the similarities between {dplyr} functions and {srvyr} functions, we use the <code>towny</code> dataset from the {gt} package and <code>apistrat</code> data that comes in the {survey} package. The <code>towny</code> dataset provides population data for municipalities in Ontario, Canada on Census years between 1996 and 2021. Taking a look at <code>towny</code> with <code>dplyr::glimpse()</code>, we can see the dataset has 25 columns with a mix of character and numeric data.</p>
-<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb26-1"><a href="c04-getting-started.html#cb26-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span> </span>
-<span id="cb26-2"><a href="c04-getting-started.html#cb26-2" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb25"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb25-1"><a href="c04-getting-started.html#cb25-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span> </span>
+<span id="cb25-2"><a href="c04-getting-started.html#cb25-2" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
 <pre><code>## Rows: 414
 ## Columns: 25
 ## $ name                     &lt;chr&gt; &quot;Addington Highlands&quot;, &quot;Adelaide Metc…
@@ -1852,55 +1853,58 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## $ pop_change_2011_2016_pct &lt;dbl&gt; -0.0791, -0.0125, 0.0351, 0.0320, 0.0…
 ## $ pop_change_2016_2021_pct &lt;dbl&gt; 0.0932, 0.0070, 0.0013, 0.0204, 0.058…</code></pre>
 <p>Let’s examine the <code>towny</code> object’s class. We verify that it is a tibble, as indicated by <code>"tbl_df"</code>, by running the code below:</p>
-<div class="sourceCode" id="cb28"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb28-1"><a href="c04-getting-started.html#cb28-1" tabindex="-1"></a><span class="fu">class</span>(towny)</span></code></pre></div>
+<div class="sourceCode" id="cb27"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb27-1"><a href="c04-getting-started.html#cb27-1" tabindex="-1"></a><span class="fu">class</span>(towny)</span></code></pre></div>
 <pre><code>## [1] &quot;tbl_df&quot;     &quot;tbl&quot;        &quot;data.frame&quot;</code></pre>
-<p>All tibbles are data.frames but not all data.frames are tibbles. Compared to data.frames, tibbles have some advantages with the printing behavior being a noticeable advantage.</p>
+<p>All tibbles are data.frames, but not all data.frames are tibbles. Compared to data.frames, tibbles have some advantages, the most noticeable being their printing behavior. When working with tidyverse-style code, we recommend making all your datasets tibbles for ease of analysis.</p>
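+<p>As a small sketch of that recommendation, a regular data.frame can be converted with <code>as_tibble()</code> from the {tibble} package. The data frame <code>df</code> below is made up for illustration and is not one of the datasets used in this book.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(tibble)
+
+# A made-up data.frame for illustration
+df &lt;- data.frame(name = c(&quot;A&quot;, &quot;B&quot;), area = c(1.5, 2.5))
+class(df)
+
+# Convert to a tibble; the class now includes &quot;tbl_df&quot;
+df_tbl &lt;- as_tibble(df)
+class(df_tbl)</code></pre></div>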
 <p>The {survey} package contains datasets related to the California Academic Performance Index, which measures student performance in schools with at least 100 students in California. We can access these datasets by loading the {survey} package and running <code>data(api)</code>.</p>
-<p>Let’s work with the <code>apistrat</code> dataset, a stratified simple random sample of three school types (elementary, middle, high) in each stratum. We can follow the process outlined in Section <a href="c04-getting-started.html#setup-des-obj">4.2.3</a> to create the survey design object. The sample is stratified by the <code>stype</code> variable and the sampling weights are found in the <code>pw</code> variable. We can use this information to construct the design object, <code>dstrata</code>.</p>
-<div class="sourceCode" id="cb30"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb30-1"><a href="c04-getting-started.html#cb30-1" tabindex="-1"></a><span class="fu">data</span>(api)</span>
-<span id="cb30-2"><a href="c04-getting-started.html#cb30-2" tabindex="-1"></a></span>
-<span id="cb30-3"><a href="c04-getting-started.html#cb30-3" tabindex="-1"></a>dstrata <span class="ot">&lt;-</span> apistrat <span class="sc">%&gt;%</span></span>
-<span id="cb30-4"><a href="c04-getting-started.html#cb30-4" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">strata =</span> stype, <span class="at">weights =</span> pw)</span></code></pre></div>
-<p>When we check the class of <code>dstrata</code>, it is not a typical <code>data.frame</code>. Applying the <code>as_survey_design()</code> function transforms the data into a <code>tbl_svy</code>, a special class specifically for survey design objects. The {srvyr} package is designed to work with the <code>tbl_svy</code> class of objects.</p>
-<div class="sourceCode" id="cb31"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb31-1"><a href="c04-getting-started.html#cb31-1" tabindex="-1"></a><span class="fu">class</span>(dstrata)</span></code></pre></div>
+<p>Let’s work with the <code>apistrat</code> dataset, which is a stratified random sample, stratified by school type (<code>stype</code>) with three levels: <code>E</code> for elementary school, <code>M</code> for middle school, and <code>H</code> for high school. We first create the survey design object (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information). The sample is stratified by the <code>stype</code> variable, and the sampling weights are found in the <code>pw</code> variable. We can use this information to construct the design object, <code>apistrat_des</code>.</p>
+<div class="sourceCode" id="cb29"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb29-1"><a href="c04-getting-started.html#cb29-1" tabindex="-1"></a><span class="fu">data</span>(api)</span>
+<span id="cb29-2"><a href="c04-getting-started.html#cb29-2" tabindex="-1"></a></span>
+<span id="cb29-3"><a href="c04-getting-started.html#cb29-3" tabindex="-1"></a>apistrat_des <span class="ot">&lt;-</span> apistrat <span class="sc">%&gt;%</span></span>
+<span id="cb29-4"><a href="c04-getting-started.html#cb29-4" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">strata =</span> stype, </span>
+<span id="cb29-5"><a href="c04-getting-started.html#cb29-5" tabindex="-1"></a>                   <span class="at">weights =</span> pw)</span></code></pre></div>
+<p>When we check the class of <code>apistrat_des</code>, it is not a typical <code>data.frame</code>. Applying the <code>as_survey_design()</code> function transforms the data into a <code>tbl_svy</code>, a special class specifically for survey design objects. The {srvyr} package is designed to work with the <code>tbl_svy</code> class of objects.</p>
+<div class="sourceCode" id="cb30"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb30-1"><a href="c04-getting-started.html#cb30-1" tabindex="-1"></a><span class="fu">class</span>(apistrat_des)</span></code></pre></div>
 <pre><code>## [1] &quot;tbl_svy&quot;        &quot;survey.design2&quot; &quot;survey.design&quot;</code></pre>
 <p>Let’s look at how {dplyr} works with regular data frames. The example below calculates the mean and median for the <code>land_area_km2</code> variable in the <code>towny</code> dataset.</p>
-<div class="sourceCode" id="cb33"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb33-1"><a href="c04-getting-started.html#cb33-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb33-2"><a href="c04-getting-started.html#cb33-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
-<span id="cb33-3"><a href="c04-getting-started.html#cb33-3" tabindex="-1"></a>            <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2))</span></code></pre></div>
+<div class="sourceCode" id="cb32"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb32-1"><a href="c04-getting-started.html#cb32-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb32-2"><a href="c04-getting-started.html#cb32-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
+<span id="cb32-3"><a href="c04-getting-started.html#cb32-3" tabindex="-1"></a>            <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   area_mean area_median
 ##       &lt;dbl&gt;       &lt;dbl&gt;
 ## 1      373.        273.</code></pre>
-<p>In the code below, we calculate the mean and median of the variable <code>api00</code> using <code>dstrata</code>. Note the similarity in the syntax. When we dig into the {srvyr} functions later, we will show that the outputs share a similar structure. Each group (if present) generates one row of output, but with additional columns. By default, the standard error of the statistic is also calculated in addition to the statistic itself.</p>
-<div class="sourceCode" id="cb35"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb35-1"><a href="c04-getting-started.html#cb35-1" tabindex="-1"></a>dstrata <span class="sc">%&gt;%</span></span>
-<span id="cb35-2"><a href="c04-getting-started.html#cb35-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
-<span id="cb35-3"><a href="c04-getting-started.html#cb35-3" tabindex="-1"></a>            <span class="at">api00_med =</span> <span class="fu">survey_median</span>(api00))</span></code></pre></div>
+<p>In the code below, we calculate the mean and median of the variable <code>api00</code> using <code>apistrat_des</code>. Note the similarity in the syntax. The difference is that the standard error of each statistic is calculated in addition to the statistic itself.</p>
+<div class="sourceCode" id="cb34"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb34-1"><a href="c04-getting-started.html#cb34-1" tabindex="-1"></a>apistrat_des <span class="sc">%&gt;%</span></span>
+<span id="cb34-2"><a href="c04-getting-started.html#cb34-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
+<span id="cb34-3"><a href="c04-getting-started.html#cb34-3" tabindex="-1"></a>            <span class="at">api00_med =</span> <span class="fu">survey_median</span>(api00))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   api00_mean api00_mean_se api00_med api00_med_se
 ##        &lt;dbl&gt;         &lt;dbl&gt;     &lt;dbl&gt;        &lt;dbl&gt;
 ## 1       662.          9.54       668         13.7</code></pre>
-<p>The functions in {srvyr} also play nicely with other tidyverse functions. For example, if we wanted to select columns with shared characteristics, we can use {tidyselect} functions such as <code>starts_with()</code>, <code>num_range()</code>, etc <span class="citation">(<a href="#ref-R-tidyselect">Henry and Wickham 2022</a>)</span>. In the examples below, we use a combination of <code>across()</code> and <code>starts_with()</code> to calculate the mean of variables starting with “population” in the <code>towny</code> data frame and those beginning with <code>api</code> in the <code>dstrata</code> survey object.</p>
-<div class="sourceCode" id="cb37"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb37-1"><a href="c04-getting-started.html#cb37-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb37-2"><a href="c04-getting-started.html#cb37-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;population&quot;</span>), <span class="sc">~</span><span class="fu">mean</span>(.x, <span class="at">na.rm=</span><span class="cn">TRUE</span>)))</span></code></pre></div>
+<p>The functions in {srvyr} also play nicely with other tidyverse functions. For example, if we want to select columns with shared characteristics, we can use {tidyselect} functions such as <code>starts_with()</code>, <code>num_range()</code>, etc. <span class="citation">(<a href="#ref-R-tidyselect">Henry and Wickham 2022</a>)</span>. In the examples below, we use a combination of <code>across()</code> and <code>starts_with()</code> to calculate the mean of variables starting with “population” in the <code>towny</code> data frame and those beginning with <code>api</code> in the <code>apistrat_des</code> survey object.</p>
+<div class="sourceCode" id="cb36"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb36-1"><a href="c04-getting-started.html#cb36-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb36-2"><a href="c04-getting-started.html#cb36-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;population&quot;</span>), </span>
+<span id="cb36-3"><a href="c04-getting-started.html#cb36-3" tabindex="-1"></a>                   <span class="sc">~</span><span class="fu">mean</span>(.x, <span class="at">na.rm=</span><span class="cn">TRUE</span>)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 6
 ##   population_1996 population_2001 population_2006 population_2011
 ##             &lt;dbl&gt;           &lt;dbl&gt;           &lt;dbl&gt;           &lt;dbl&gt;
 ## 1          25866.          27538.          29173.          30838.
 ## # ℹ 2 more variables: population_2016 &lt;dbl&gt;, population_2021 &lt;dbl&gt;</code></pre>
-<div class="sourceCode" id="cb39"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb39-1"><a href="c04-getting-started.html#cb39-1" tabindex="-1"></a>dstrata <span class="sc">%&gt;%</span></span>
-<span id="cb39-2"><a href="c04-getting-started.html#cb39-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;api&quot;</span>), survey_mean))</span></code></pre></div>
+<div class="sourceCode" id="cb38"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb38-1"><a href="c04-getting-started.html#cb38-1" tabindex="-1"></a>apistrat_des <span class="sc">%&gt;%</span></span>
+<span id="cb38-2"><a href="c04-getting-started.html#cb38-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;api&quot;</span>), </span>
+<span id="cb38-3"><a href="c04-getting-started.html#cb38-3" tabindex="-1"></a>                   survey_mean))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 6
 ##   api00 api00_se api99 api99_se api.stu api.stu_se
 ##   &lt;dbl&gt;    &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt;   &lt;dbl&gt;      &lt;dbl&gt;
 ## 1  662.     9.54  629.     10.1    498.       16.4</code></pre>
 <p>We have the flexibility to use {dplyr} verbs such as <code>mutate()</code>, <code>filter()</code>, and <code>select()</code> on our survey design object. As mentioned in Section <a href="c04-getting-started.html#survey-analysis-process">4.3</a>, these steps should be performed on the survey design object. This ensures our survey design is properly considered in all our calculations.</p>
-<div class="sourceCode" id="cb41"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb41-1"><a href="c04-getting-started.html#cb41-1" tabindex="-1"></a>dstrata_mod <span class="ot">&lt;-</span> dstrata <span class="sc">%&gt;%</span></span>
-<span id="cb41-2"><a href="c04-getting-started.html#cb41-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">api_diff =</span> api00 <span class="sc">-</span> api99) <span class="sc">%&gt;%</span></span>
-<span id="cb41-3"><a href="c04-getting-started.html#cb41-3" tabindex="-1"></a>  <span class="fu">filter</span>(stype <span class="sc">==</span> <span class="st">&quot;E&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb41-4"><a href="c04-getting-started.html#cb41-4" tabindex="-1"></a>  <span class="fu">select</span>(stype, api99, api00, api_diff, <span class="at">api_students =</span> api.stu)</span>
-<span id="cb41-5"><a href="c04-getting-started.html#cb41-5" tabindex="-1"></a></span>
-<span id="cb41-6"><a href="c04-getting-started.html#cb41-6" tabindex="-1"></a>dstrata_mod</span></code></pre></div>
+<div class="sourceCode" id="cb40"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb40-1"><a href="c04-getting-started.html#cb40-1" tabindex="-1"></a>apistrat_des_mod <span class="ot">&lt;-</span> apistrat_des <span class="sc">%&gt;%</span></span>
+<span id="cb40-2"><a href="c04-getting-started.html#cb40-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">api_diff =</span> api00 <span class="sc">-</span> api99) <span class="sc">%&gt;%</span></span>
+<span id="cb40-3"><a href="c04-getting-started.html#cb40-3" tabindex="-1"></a>  <span class="fu">filter</span>(stype <span class="sc">==</span> <span class="st">&quot;E&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb40-4"><a href="c04-getting-started.html#cb40-4" tabindex="-1"></a>  <span class="fu">select</span>(stype, api99, api00, api_diff, <span class="at">api_students =</span> api.stu)</span>
+<span id="cb40-5"><a href="c04-getting-started.html#cb40-5" tabindex="-1"></a></span>
+<span id="cb40-6"><a href="c04-getting-started.html#cb40-6" tabindex="-1"></a>apistrat_des_mod</span></code></pre></div>
 <pre><code>## Stratified Independent Sampling design (with replacement)
 ## Called via srvyr
 ## Sampling variables:
@@ -1910,7 +1914,7 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## Data variables: 
 ##   - stype (fct), api99 (int), api00 (int), api_diff (int), api_students
 ##     (int)</code></pre>
-<div class="sourceCode" id="cb43"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb43-1"><a href="c04-getting-started.html#cb43-1" tabindex="-1"></a>dstrata</span></code></pre></div>
+<div class="sourceCode" id="cb42"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb42-1"><a href="c04-getting-started.html#cb42-1" tabindex="-1"></a>apistrat_des</span></code></pre></div>
 <pre><code>## Stratified Independent Sampling design (with replacement)
 ## Called via srvyr
 ## Sampling variables:
@@ -1928,10 +1932,10 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ##     (dbl), full (int), emer (int), enroll (int), api.stu (int), pw
 ##     (dbl), fpc (dbl)</code></pre>
 <p>Several functions in {srvyr} must be called within <code>srvyr::summarize()</code>, with the exception of <code>srvyr::survey_count()</code> and <code>srvyr::survey_tally()</code>. This is similar to how <code>dplyr::count()</code> and <code>dplyr::tally()</code> are not called within <code>dplyr::summarize()</code>. The <code>summarize()</code> function can be used in conjunction with the <code>group_by()</code> function or <code>by/.by</code> arguments, which applies the functions on a group-by-group basis to create grouped summaries.</p>
-<div class="sourceCode" id="cb45"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb45-1"><a href="c04-getting-started.html#cb45-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb45-2"><a href="c04-getting-started.html#cb45-2" tabindex="-1"></a>  <span class="fu">group_by</span>(csd_type) <span class="sc">%&gt;%</span></span>
-<span id="cb45-3"><a href="c04-getting-started.html#cb45-3" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
-<span id="cb45-4"><a href="c04-getting-started.html#cb45-4" tabindex="-1"></a>                   <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2))</span></code></pre></div>
+<div class="sourceCode" id="cb44"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb44-1"><a href="c04-getting-started.html#cb44-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb44-2"><a href="c04-getting-started.html#cb44-2" tabindex="-1"></a>  <span class="fu">group_by</span>(csd_type) <span class="sc">%&gt;%</span></span>
+<span id="cb44-3"><a href="c04-getting-started.html#cb44-3" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
+<span id="cb44-4"><a href="c04-getting-started.html#cb44-4" tabindex="-1"></a>                   <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2))</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   csd_type     area_mean area_median
 ##   &lt;chr&gt;            &lt;dbl&gt;       &lt;dbl&gt;
@@ -1941,10 +1945,10 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## 4 township         363.        301. 
 ## 5 village           23.0         3.3</code></pre>
 <p>We use a similar setup to summarize data in {srvyr}:</p>
-<div class="sourceCode" id="cb47"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb47-1"><a href="c04-getting-started.html#cb47-1" tabindex="-1"></a>dstrata <span class="sc">%&gt;%</span></span>
-<span id="cb47-2"><a href="c04-getting-started.html#cb47-2" tabindex="-1"></a>  <span class="fu">group_by</span>(stype) <span class="sc">%&gt;%</span></span>
-<span id="cb47-3"><a href="c04-getting-started.html#cb47-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
-<span id="cb47-4"><a href="c04-getting-started.html#cb47-4" tabindex="-1"></a>            <span class="at">api00_median =</span> <span class="fu">survey_median</span>(api00))</span></code></pre></div>
+<div class="sourceCode" id="cb46"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb46-1"><a href="c04-getting-started.html#cb46-1" tabindex="-1"></a>apistrat_des <span class="sc">%&gt;%</span></span>
+<span id="cb46-2"><a href="c04-getting-started.html#cb46-2" tabindex="-1"></a>  <span class="fu">group_by</span>(stype) <span class="sc">%&gt;%</span></span>
+<span id="cb46-3"><a href="c04-getting-started.html#cb46-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
+<span id="cb46-4"><a href="c04-getting-started.html#cb46-4" tabindex="-1"></a>            <span class="at">api00_median =</span> <span class="fu">survey_median</span>(api00))</span></code></pre></div>
 <pre><code>## # A tibble: 3 × 5
 ##   stype api00_mean api00_mean_se api00_median api00_median_se
 ##   &lt;fct&gt;      &lt;dbl&gt;         &lt;dbl&gt;        &lt;dbl&gt;           &lt;dbl&gt;
@@ -1952,10 +1956,10 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## 2 H           626.          15.5          635            21.6
 ## 3 M           637.          16.6          648            24.1</code></pre>
 <p>At this time, the <code>.by</code> argument in <code>srvyr::summarize()</code> does not exist as it does in {dplyr}. An alternative way to do the grouped analysis on the <code>towny</code> data would be:</p>
-<div class="sourceCode" id="cb49"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb49-1"><a href="c04-getting-started.html#cb49-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb49-2"><a href="c04-getting-started.html#cb49-2" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
-<span id="cb49-3"><a href="c04-getting-started.html#cb49-3" tabindex="-1"></a>                   <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2), </span>
-<span id="cb49-4"><a href="c04-getting-started.html#cb49-4" tabindex="-1"></a>                   <span class="at">.by=</span>csd_type)</span></code></pre></div>
+<div class="sourceCode" id="cb48"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb48-1"><a href="c04-getting-started.html#cb48-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb48-2"><a href="c04-getting-started.html#cb48-2" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2),</span>
+<span id="cb48-3"><a href="c04-getting-started.html#cb48-3" tabindex="-1"></a>                   <span class="at">area_median =</span> <span class="fu">median</span>(land_area_km2), </span>
+<span id="cb48-4"><a href="c04-getting-started.html#cb48-4" tabindex="-1"></a>                   <span class="at">.by=</span>csd_type)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   csd_type     area_mean area_median
 ##   &lt;chr&gt;            &lt;dbl&gt;       &lt;dbl&gt;
@@ -1965,10 +1969,10 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## 4 city             498.        198. 
 ## 5 village           23.0         3.3</code></pre>
 <p>However, the <code>.by</code> syntax is not yet available in {srvyr}:</p>
-<div class="sourceCode" id="cb51"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb51-1"><a href="c04-getting-started.html#cb51-1" tabindex="-1"></a>dstrata <span class="sc">%&gt;%</span></span>
-<span id="cb51-2"><a href="c04-getting-started.html#cb51-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
-<span id="cb51-3"><a href="c04-getting-started.html#cb51-3" tabindex="-1"></a>            <span class="at">api00_median =</span> <span class="fu">survey_median</span>(api00),</span>
-<span id="cb51-4"><a href="c04-getting-started.html#cb51-4" tabindex="-1"></a>            <span class="at">.by=</span>stype)</span></code></pre></div>
+<div class="sourceCode" id="cb50"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb50-1"><a href="c04-getting-started.html#cb50-1" tabindex="-1"></a>apistrat_des <span class="sc">%&gt;%</span></span>
+<span id="cb50-2"><a href="c04-getting-started.html#cb50-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">api00_mean =</span> <span class="fu">survey_mean</span>(api00),</span>
+<span id="cb50-3"><a href="c04-getting-started.html#cb50-3" tabindex="-1"></a>            <span class="at">api00_median =</span> <span class="fu">survey_median</span>(api00),</span>
+<span id="cb50-4"><a href="c04-getting-started.html#cb50-4" tabindex="-1"></a>            <span class="at">.by=</span>stype)</span></code></pre></div>
 <pre><code>## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3:
 ## ℹ In argument: `api00_mean = survey_mean(api00)`.
 ## ℹ In group 1: `stype = E`.
@@ -1979,21 +1983,21 @@ <h2><span class="header-section-number">4.4</span> Similarities between {dplyr}
 ## ℹ Only vectors of size 1 are recycled.
 ## Caused by error in `vectbl_recycle_rhs_rows()`:
 ## ! Can&#39;t recycle input of size 100 to size 200.</code></pre>
-<p>As mentioned above, {srvyr} functions are meant for <code>tbl_svy</code> objects. Attempting to perform data manipulation on non-<code>tbl_svy</code> objects, like the <code>towny</code> example shown below, will result in an error. Running the code will let you know what the issue is: <code>Survey context not set</code>.</p>
-<div class="sourceCode" id="cb53"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb53-1"><a href="c04-getting-started.html#cb53-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb53-2"><a href="c04-getting-started.html#cb53-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">survey_mean</span>(land_area_km2))</span></code></pre></div>
+<p>As mentioned above, {srvyr} functions are meant for <code>tbl_svy</code> objects. Attempting to manipulate data on non-<code>tbl_svy</code> objects, like the <code>towny</code> example shown below, results in an error. Running the code lets us know what the issue is: <code>Survey context not set</code>.</p>
+<div class="sourceCode" id="cb52"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb52-1"><a href="c04-getting-started.html#cb52-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb52-2"><a href="c04-getting-started.html#cb52-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">survey_mean</span>(land_area_km2))</span></code></pre></div>
 <pre><code>## Error in `summarize()`:
 ## ℹ In argument: `area_mean = survey_mean(land_area_km2)`.
 ## Caused by error in `cur_svy()` at gergness-srvyr-1917f75/R/survey_statistics.r:114:3:
 ## ! Survey context not set</code></pre>
-<p>A few functions in {srvyr} have counterparts in {dplyr}, such as <code>srvyr::summarize()</code> and <code>srvyr::group_by()</code>. Unlike {srvyr}-specific verbs, {srvyr} recognizes these parallel functions if applied to a non-survey object. Instead of causing an error, the package will provide the equivalent output from {dplyr}:</p>
-<div class="sourceCode" id="cb55"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb55-1"><a href="c04-getting-started.html#cb55-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
-<span id="cb55-2"><a href="c04-getting-started.html#cb55-2" tabindex="-1"></a>  srvyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2))</span></code></pre></div>
+<p>A few functions in {srvyr} have counterparts in {dplyr}, such as <code>srvyr::summarize()</code> and <code>srvyr::group_by()</code>. Unlike {srvyr}-specific verbs, {srvyr} recognizes these parallel functions if applied to a non-survey object. Instead of causing an error, the package provides the equivalent output from {dplyr}:</p>
+<div class="sourceCode" id="cb54"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb54-1"><a href="c04-getting-started.html#cb54-1" tabindex="-1"></a>towny <span class="sc">%&gt;%</span></span>
+<span id="cb54-2"><a href="c04-getting-started.html#cb54-2" tabindex="-1"></a>  srvyr<span class="sc">::</span><span class="fu">summarize</span>(<span class="at">area_mean =</span> <span class="fu">mean</span>(land_area_km2))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 1
 ##   area_mean
 ##       &lt;dbl&gt;
 ## 1      373.</code></pre>
-<p>Because this book focuses on survey analysis, most of our pipes will stem from a survey object. When we load the {dplyr} and {srvyr} packages, the functions will automatically figure out the class of data and use the appropriate one from {dplyr} or {srvyr}. Therefore, we do not need to include the namespace for each function (e.g., <code>srvyr::summarize()</code>).</p>
+<p>Because this book focuses on survey analysis, most of our pipes stem from a survey object. When we load the {dplyr} and {srvyr} packages, the functions automatically figure out the class of data and use the appropriate one from {dplyr} or {srvyr}. Therefore, we do not need to include the namespace for each function (e.g., <code>srvyr::summarize()</code>).</p>
 
 </div>
 </div>
@@ -2006,7 +2010,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 DeBell, Matthew. 2010. <span>“How to Analyze ANES Survey Data.”</span> ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; <a href="https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf</a>.
 </div>
 <div id="ref-R-srvyr" class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div id="ref-R-tidyselect" class="csl-entry">
 Henry, Lionel, and Hadley Wickham. 2022. <em><span class="nocase">tidyselect</span>: Select from a Set of Strings</em>.
@@ -2033,7 +2037,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div class="footnotes">
 <hr />
 <ol start="2">
-<li id="fn2"><p>Note: {broom} is already included in the tidyverse, so no separate installation is required<a href="c04-getting-started.html#fnref2" class="footnote-back">↩︎</a></p></li>
+<li id="fn2"><p>Note: {broom} is already included in the tidyverse, so no separate installation is required.<a href="c04-getting-started.html#fnref2" class="footnote-back">↩︎</a></p></li>
 <li id="fn3"><p>In the United States, presidential elections are held in years divisible by four. In other even-numbered years, there are federal elections for Congress, which are referred to as midterm elections because they occur in the middle of a president’s term.<a href="c04-getting-started.html#fnref3" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
diff --git a/c05-descriptive-analysis.html b/c05-descriptive-analysis.html
index 26401db3..7604c0bc 100644
--- a/c05-descriptive-analysis.html
+++ b/c05-descriptive-analysis.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,60 +524,60 @@ <h3>Prerequisites<a href="c05-descriptive-analysis.html#prereq5" class="anchor-s
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb57"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb57-1"><a href="c05-descriptive-analysis.html#cb57-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb57-2"><a href="c05-descriptive-analysis.html#cb57-2" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
-<span id="cb57-3"><a href="c05-descriptive-analysis.html#cb57-3" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
-<span id="cb57-4"><a href="c05-descriptive-analysis.html#cb57-4" tabindex="-1"></a><span class="fu">library</span>(broom)</span></code></pre></div>
-<p>We will be using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
-<div class="sourceCode" id="cb58"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb58-1"><a href="c05-descriptive-analysis.html#cb58-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
-<span id="cb58-2"><a href="c05-descriptive-analysis.html#cb58-2" tabindex="-1"></a></span>
-<span id="cb58-3"><a href="c05-descriptive-analysis.html#cb58-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb58-4"><a href="c05-descriptive-analysis.html#cb58-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
-<span id="cb58-5"><a href="c05-descriptive-analysis.html#cb58-5" tabindex="-1"></a></span>
-<span id="cb58-6"><a href="c05-descriptive-analysis.html#cb58-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
-<span id="cb58-7"><a href="c05-descriptive-analysis.html#cb58-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb58-8"><a href="c05-descriptive-analysis.html#cb58-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
-<span id="cb58-9"><a href="c05-descriptive-analysis.html#cb58-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
-<span id="cb58-10"><a href="c05-descriptive-analysis.html#cb58-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
-<span id="cb58-11"><a href="c05-descriptive-analysis.html#cb58-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb58-12"><a href="c05-descriptive-analysis.html#cb58-12" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb56"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb56-1"><a href="c05-descriptive-analysis.html#cb56-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb56-2"><a href="c05-descriptive-analysis.html#cb56-2" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
+<span id="cb56-3"><a href="c05-descriptive-analysis.html#cb56-3" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
+<span id="cb56-4"><a href="c05-descriptive-analysis.html#cb56-4" tabindex="-1"></a><span class="fu">library</span>(broom)</span></code></pre></div>
+<p>We are using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
+<div class="sourceCode" id="cb57"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb57-1"><a href="c05-descriptive-analysis.html#cb57-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
+<span id="cb57-2"><a href="c05-descriptive-analysis.html#cb57-2" tabindex="-1"></a></span>
+<span id="cb57-3"><a href="c05-descriptive-analysis.html#cb57-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb57-4"><a href="c05-descriptive-analysis.html#cb57-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
+<span id="cb57-5"><a href="c05-descriptive-analysis.html#cb57-5" tabindex="-1"></a></span>
+<span id="cb57-6"><a href="c05-descriptive-analysis.html#cb57-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
+<span id="cb57-7"><a href="c05-descriptive-analysis.html#cb57-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb57-8"><a href="c05-descriptive-analysis.html#cb57-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
+<span id="cb57-9"><a href="c05-descriptive-analysis.html#cb57-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
+<span id="cb57-10"><a href="c05-descriptive-analysis.html#cb57-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
+<span id="cb57-11"><a href="c05-descriptive-analysis.html#cb57-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb57-12"><a href="c05-descriptive-analysis.html#cb57-12" tabindex="-1"></a>  )</span></code></pre></div>
 <p>For RECS, details are included in the RECS documentation and Chapters <a href="c04-getting-started.html#c04-getting-started">4</a> and <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
-<div class="sourceCode" id="cb59"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb59-1"><a href="c05-descriptive-analysis.html#cb59-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb59-2"><a href="c05-descriptive-analysis.html#cb59-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb59-3"><a href="c05-descriptive-analysis.html#cb59-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb59-4"><a href="c05-descriptive-analysis.html#cb59-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb59-5"><a href="c05-descriptive-analysis.html#cb59-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb59-6"><a href="c05-descriptive-analysis.html#cb59-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
-<span id="cb59-7"><a href="c05-descriptive-analysis.html#cb59-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
-<span id="cb59-8"><a href="c05-descriptive-analysis.html#cb59-8" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb58"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb58-1"><a href="c05-descriptive-analysis.html#cb58-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb58-2"><a href="c05-descriptive-analysis.html#cb58-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb58-3"><a href="c05-descriptive-analysis.html#cb58-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb58-4"><a href="c05-descriptive-analysis.html#cb58-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb58-5"><a href="c05-descriptive-analysis.html#cb58-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb58-6"><a href="c05-descriptive-analysis.html#cb58-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb58-7"><a href="c05-descriptive-analysis.html#cb58-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
+<span id="cb58-8"><a href="c05-descriptive-analysis.html#cb58-8" tabindex="-1"></a>  )</span></code></pre></div>
 </div>
 <div id="introduction-4" class="section level2 hasAnchor" number="5.1">
 <h2><span class="header-section-number">5.1</span> Introduction<a href="c05-descriptive-analysis.html#introduction-4" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Descriptive analyses, such as basic counts, cross-tabulations, or means, are one of the first steps in making sense of our survey results. By reviewing the findings, analysts can glean insight into the data, the underlying population, and any unique aspects of the data or population. For example, if only 10% of the survey respondents are male, it could indicate a unique population, a potential error or bias, an intentional survey sampling method, or other factors. Additionally, descriptive analyses allow analysts to provide summaries like means, proportions, or other measures to make estimates about the population. These analyses lay the groundwork for the next steps of running statistical tests or developing models.</p>
-<p>We will discuss many different types of descriptive analyses in this chapter. However, it is important to know what type of data we are working with and which statistics are appropriate. In survey data, we typically consider data as one of four main types:</p>
+<p>Descriptive analyses, such as basic counts, cross-tabulations, or means, are among the first steps in making sense of our survey results. By reviewing the findings, we can glean insight into the data, the underlying population, and any unique aspects of the data or population. For example, if only 10% of the survey respondents are male, it could indicate a unique population, a potential error or bias, an intentional survey sampling method, or other factors. Additionally, descriptive analyses allow us to provide summaries like means, proportions, or other measures to make estimates about the population. These analyses lay the groundwork for the next steps of running statistical tests or developing models.</p>
+<p>We discuss many different types of descriptive analyses in this chapter. However, it is important to know what type of data we are working with and which statistics are appropriate. In survey data, we typically consider data as one of four main types:</p>
 <ul>
 <li>Categorical/nominal data: variables with levels or descriptions that cannot be ordered, such as the region of the country (North, South, East, and West)</li>
 <li>Ordinal data: variables that can be ordered, such as those from a Likert scale (strongly disagree, disagree, agree, and strongly agree)</li>
 <li>Discrete data: variables that are counted or measured, such as number of children</li>
 <li>Continuous data: variables that are measured and whose values can lie anywhere on an interval, such as income</li>
 </ul>
-<p>This chapter will discuss how to analyze <em>measures of distribution</em> (e.g., cross-tabulations), <em>central tendency</em> (e.g., means), <em>relationship</em> (e.g., ratios), and <em>dispersion</em> (e.g., standard deviation) using functions from the {srvyr} package <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>)</span>.</p>
-<p><strong>Measures of distribution</strong> describe how often an event or response occurs. These measures include counts and totals. We will cover the following functions:</p>
+<p>This chapter discusses how to analyze <em>measures of distribution</em> (e.g., cross-tabulations), <em>central tendency</em> (e.g., means), <em>relationship</em> (e.g., ratios), and <em>dispersion</em> (e.g., standard deviation) using functions from the {srvyr} package <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>)</span>.</p>
+<p><strong>Measures of distribution</strong> describe how often an event or response occurs. These measures include counts and totals. We cover the following functions:</p>
 <ul>
 <li>Count of observations (<code>survey_count()</code> and <code>survey_tally()</code>)</li>
 <li>Summation of variables (<code>survey_total()</code>)</li>
 </ul>
-<p><strong>Measures of central tendency</strong> find the central (or average) responses. These measures include means and medians. We will cover the following functions:</p>
+<p><strong>Measures of central tendency</strong> find the central (or average) responses. These measures include means and medians. We cover the following functions:</p>
 <ul>
 <li>Means and proportions (<code>survey_mean()</code> and <code>survey_prop()</code>)</li>
 <li>Quantiles and medians (<code>survey_quantile()</code> and <code>survey_median()</code>)</li>
 </ul>
-<p><strong>Measures of relationship</strong> describe how variables relate to each other. These measures include correlations and ratios. We will cover the following functions:</p>
+<p><strong>Measures of relationship</strong> describe how variables relate to each other. These measures include correlations and ratios. We cover the following functions:</p>
 <ul>
 <li>Correlations (<code>survey_corr()</code>)</li>
 <li>Ratios (<code>survey_ratio()</code>)</li>
 </ul>
-<p><strong>Measures of dispersion</strong> describe how data spreads around the central tendency for continuous variables. These measures include standard deviations and variances. We will cover the following functions:</p>
+<p><strong>Measures of dispersion</strong> describe how data spread around the central tendency for continuous variables. These measures include standard deviations and variances. We cover the following functions:</p>
 <ul>
 <li>Variances and standard deviations (<code>survey_var()</code> and <code>survey_sd()</code>)</li>
 </ul>
@@ -588,25 +588,25 @@ <h2><span class="header-section-number">5.1</span> Introduction<a href="c05-desc
 <li>Specify domains of analysis using <code>srvyr::group_by()</code>, if needed.</li>
 <li>Analyze the data with survey-specific functions.</li>
 </ol>
-<p>This chapter will walk through how to apply the survey functions in Step 4. Note that unless otherwise specified, our estimates will be weighted as a result of setting up the survey design object.</p>
-<p>To look at the data by different subgroups, we can choose to filter and/or group the data. It is very important that we filter and group the data only <em>after</em> creating the design object. This ensures that the results accurately reflect the survey design. If we filter or group data before creating the survey design object, the data for those cases is not included in the survey design information and estimations of the variance, leading to inaccurate results.</p>
-<p>For the sake of simplicity, we’ve removed cases with missing values in the examples below. If you want a more detailed explanation on how to handle missing data, please refer to Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.</p>
+<p>This chapter walks through how to apply the survey functions in Step 4. Note that unless otherwise specified, our estimates are weighted as a result of setting up the survey design object.</p>
+<p>To look at the data by different subgroups, we can choose to filter and/or group the data. It is very important that we filter and group the data only <em>after</em> creating the design object. This ensures that the results accurately reflect the survey design. If we filter or group data before creating the survey design object, the data for those cases are not included in the survey design information and estimations of the variance, leading to inaccurate results.</p>
+<p>For the sake of simplicity, we’ve removed cases with missing values in the examples below. For a more detailed explanation of how to handle missing data, please refer to Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.</p>
 </div>
 <div id="counts-and-cross-tabulations" class="section level2 hasAnchor" number="5.2">
 <h2><span class="header-section-number">5.2</span> Counts and cross-tabulations<a href="c05-descriptive-analysis.html#counts-and-cross-tabulations" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Using <code>survey_count()</code> and <code>survey_tally()</code>, we can calculate the estimated population counts for a given variable or combination of variables. These summaries, often referred to as cross-tabulations or crosstabs, are applied to categorical data. They help in estimating counts of the population size for different groups based on the survey data.</p>
+<p>Using <code>survey_count()</code> and <code>survey_tally()</code>, we can calculate the estimated population counts for a given variable or combination of variables. These summaries, often referred to as cross-tabulations or cross-tabs, are applied to categorical data. They help in estimating counts of the population size for different groups based on the survey data.</p>
 <div id="desc-count-syntax" class="section level3 hasAnchor" number="5.2.1">
 <h3><span class="header-section-number">5.2.1</span> Syntax<a href="c05-descriptive-analysis.html#desc-count-syntax" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The syntax for <code>survey_count()</code> is similar to the <code>dplyr::count()</code> syntax, as mentioned in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. However, as noted above, this function can only be called on <code>tbl_svy</code> objects. Let’s explore the syntax:</p>
-<div class="sourceCode" id="cb60"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb60-1"><a href="c05-descriptive-analysis.html#cb60-1" tabindex="-1"></a><span class="fu">survey_count</span>(</span>
-<span id="cb60-2"><a href="c05-descriptive-analysis.html#cb60-2" tabindex="-1"></a>  x,</span>
-<span id="cb60-3"><a href="c05-descriptive-analysis.html#cb60-3" tabindex="-1"></a>  ...,</span>
-<span id="cb60-4"><a href="c05-descriptive-analysis.html#cb60-4" tabindex="-1"></a>  <span class="at">wt =</span> <span class="cn">NULL</span>,</span>
-<span id="cb60-5"><a href="c05-descriptive-analysis.html#cb60-5" tabindex="-1"></a>  <span class="at">sort =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb60-6"><a href="c05-descriptive-analysis.html#cb60-6" tabindex="-1"></a>  <span class="at">name =</span> <span class="st">&quot;n&quot;</span>,</span>
-<span id="cb60-7"><a href="c05-descriptive-analysis.html#cb60-7" tabindex="-1"></a>  <span class="at">.drop =</span> dplyr<span class="sc">::</span><span class="fu">group_by_drop_default</span>(x),</span>
-<span id="cb60-8"><a href="c05-descriptive-analysis.html#cb60-8" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>)</span>
-<span id="cb60-9"><a href="c05-descriptive-analysis.html#cb60-9" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb59"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb59-1"><a href="c05-descriptive-analysis.html#cb59-1" tabindex="-1"></a><span class="fu">survey_count</span>(</span>
+<span id="cb59-2"><a href="c05-descriptive-analysis.html#cb59-2" tabindex="-1"></a>  x,</span>
+<span id="cb59-3"><a href="c05-descriptive-analysis.html#cb59-3" tabindex="-1"></a>  ...,</span>
+<span id="cb59-4"><a href="c05-descriptive-analysis.html#cb59-4" tabindex="-1"></a>  <span class="at">wt =</span> <span class="cn">NULL</span>,</span>
+<span id="cb59-5"><a href="c05-descriptive-analysis.html#cb59-5" tabindex="-1"></a>  <span class="at">sort =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb59-6"><a href="c05-descriptive-analysis.html#cb59-6" tabindex="-1"></a>  <span class="at">name =</span> <span class="st">&quot;n&quot;</span>,</span>
+<span id="cb59-7"><a href="c05-descriptive-analysis.html#cb59-7" tabindex="-1"></a>  <span class="at">.drop =</span> dplyr<span class="sc">::</span><span class="fu">group_by_drop_default</span>(x),</span>
+<span id="cb59-8"><a href="c05-descriptive-analysis.html#cb59-8" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>)</span>
+<span id="cb59-9"><a href="c05-descriptive-analysis.html#cb59-9" tabindex="-1"></a>  )</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>x</code>: a <code>tbl_svy</code> object created by <code>as_survey</code></li>
@@ -617,14 +617,14 @@ <h3><span class="header-section-number">5.2.1</span> Syntax<a href="c05-descript
 <li><code>.drop</code>: whether to drop empty groups</li>
 <li><code>vartype</code>: type(s) of variation estimate to calculate including any of <code>c("se", "ci", "var", "cv")</code>, defaults to <code>se</code> (standard error) (see <a href="c05-descriptive-analysis.html#desc-count-syntax">5.2.1</a> for more information)</li>
 </ul>
-<p>To generate a count or crosstabs by different variables, we include them in the (<code>...</code>) argument. This argument can take any number of variables and will break down the counts by all combinations of the provided variables. This is similar to <code>dplyr::count()</code>. To obtain an estimate of the overall population, we can exclude any variables from the (<code>...</code>) argument or use the <code>survey_tally()</code> function. While the <code>survey_tally()</code> function has a similar syntax to the <code>survey_count()</code> function, it does not include the (<code>...</code>) or the <code>.drop</code> arguments:</p>
-<div class="sourceCode" id="cb61"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb61-1"><a href="c05-descriptive-analysis.html#cb61-1" tabindex="-1"></a><span class="fu">survey_tally</span>(</span>
-<span id="cb61-2"><a href="c05-descriptive-analysis.html#cb61-2" tabindex="-1"></a>  x,</span>
-<span id="cb61-3"><a href="c05-descriptive-analysis.html#cb61-3" tabindex="-1"></a>  wt,</span>
-<span id="cb61-4"><a href="c05-descriptive-analysis.html#cb61-4" tabindex="-1"></a>  <span class="at">sort =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb61-5"><a href="c05-descriptive-analysis.html#cb61-5" tabindex="-1"></a>  <span class="at">name =</span> <span class="st">&quot;n&quot;</span>,</span>
-<span id="cb61-6"><a href="c05-descriptive-analysis.html#cb61-6" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>)</span>
-<span id="cb61-7"><a href="c05-descriptive-analysis.html#cb61-7" tabindex="-1"></a>)</span></code></pre></div>
+<p>To generate a count or cross-tabs by different variables, we include them in the (<code>...</code>) argument. This argument can take any number of variables and breaks down the counts by all combinations of the provided variables. This is similar to <code>dplyr::count()</code>. To obtain an estimate of the overall population, we can exclude any variables from the (<code>...</code>) argument or use the <code>survey_tally()</code> function. While the <code>survey_tally()</code> function has a similar syntax to the <code>survey_count()</code> function, it does not include the (<code>...</code>) or the <code>.drop</code> arguments:</p>
+<div class="sourceCode" id="cb60"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb60-1"><a href="c05-descriptive-analysis.html#cb60-1" tabindex="-1"></a><span class="fu">survey_tally</span>(</span>
+<span id="cb60-2"><a href="c05-descriptive-analysis.html#cb60-2" tabindex="-1"></a>  x,</span>
+<span id="cb60-3"><a href="c05-descriptive-analysis.html#cb60-3" tabindex="-1"></a>  wt,</span>
+<span id="cb60-4"><a href="c05-descriptive-analysis.html#cb60-4" tabindex="-1"></a>  <span class="at">sort =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb60-5"><a href="c05-descriptive-analysis.html#cb60-5" tabindex="-1"></a>  <span class="at">name =</span> <span class="st">&quot;n&quot;</span>,</span>
+<span id="cb60-6"><a href="c05-descriptive-analysis.html#cb60-6" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>)</span>
+<span id="cb60-7"><a href="c05-descriptive-analysis.html#cb60-7" tabindex="-1"></a>)</span></code></pre></div>
 <p>Both functions include the <code>vartype</code> argument with four different values:</p>
 <ul>
 <li><code>se</code>: standard error
@@ -635,7 +635,7 @@ <h3><span class="header-section-number">5.2.1</span> Syntax<a href="c05-descript
 <li><code>ci</code>: confidence interval
 <ul>
 <li>The lower and upper limits of a confidence interval</li>
-<li>Output has a column with the variable name specified in the <code>name</code> argument with a suffix of “_low” and “_upp”</li>
+<li>Output has two columns with the variable name specified in the <code>name</code> argument with a suffix of “_low” and “_upp”</li>
 <li>By default, this is a 95% confidence interval but can be changed by using the <code>level</code> argument and specifying a number between 0 and 1. For example, <code>level=0.8</code> would produce an 80% confidence interval.</li>
 </ul></li>
 <li><code>var</code>: variance
@@ -651,33 +651,33 @@ <h3><span class="header-section-number">5.2.1</span> Syntax<a href="c05-descript
 </ul>
 <p>The confidence intervals are always calculated using a symmetric t-distribution based method, given by the formula:</p>
 <p><span class="math display">\[ \text{estimate} \pm t^*_{df}\times SE\]</span>
-where <span class="math inline">\(t^*_{df}\)</span> is the critical value from a t-distribution based on the confidence level and the degrees of freedom. By default, the degrees of freedom are based on the design or number of replicates, but they can be specified using the <code>df</code> argument. For survey design objects, the degrees of freedom are calculated as the number of PSUs minus the number of strata. For replicate-based objects, the degrees of freedom are calculated as one less than the rank of the matrix of replicate weight, where the number of replicates is typically the rank. Note that specifying <code>df = Inf</code> is equivalent to using a normal (z-based) confidence interval – this is the default in {survey}. These variability types are the same for most of the survey functions, and we will provide examples using different variability types throughout this chapter.</p>
+where <span class="math inline">\(t^*_{df}\)</span> is the critical value from a t-distribution based on the confidence level and the degrees of freedom. By default, the degrees of freedom are based on the design or number of replicates, but they can be specified using the <code>df</code> argument. For survey design objects, the degrees of freedom are calculated as the number of primary sampling units (PSUs or clusters) minus the number of strata (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information on PSUs, strata, and sample designs). For replicate-based objects, the degrees of freedom are calculated as one less than the rank of the matrix of replicate weights, where the number of replicates is typically the rank. Note that specifying <code>df = Inf</code> is equivalent to using a normal (z-based) confidence interval – this is the default in {survey}. These variability types are the same for most of the survey functions, and we provide examples using different variability types throughout this chapter.</p>
 </div>
 <div id="examples" class="section level3 hasAnchor" number="5.2.2">
 <h3><span class="header-section-number">5.2.2</span> Examples<a href="c05-descriptive-analysis.html#examples" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="example-1-estimated-population-count" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Estimated population count<a href="c05-descriptive-analysis.html#example-1-estimated-population-count" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>If we want to obtain the estimated number of households in the U.S. (the target population) using the Residential Energy Consumption Survey (RECS) data, we can use <code>survey_count()</code>. If we do not specify any variables in the <code>survey_count()</code> function, it will output the estimated population count (<code>n</code>) and its corresponding standard error (<code>n_se</code>).</p>
-<div class="sourceCode" id="cb62"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb62-1"><a href="c05-descriptive-analysis.html#cb62-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb62-2"><a href="c05-descriptive-analysis.html#cb62-2" tabindex="-1"></a>  <span class="fu">survey_count</span>() </span></code></pre></div>
+<p>If we want to obtain the estimated number of households in the U.S. (the population of interest) using the Residential Energy Consumption Survey (RECS) data, we can use <code>survey_count()</code>. If we do not specify any variables in the <code>survey_count()</code> function, it outputs the estimated population count (<code>n</code>) and its corresponding standard error (<code>n_se</code>).</p>
+<div class="sourceCode" id="cb61"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb61-1"><a href="c05-descriptive-analysis.html#cb61-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb61-2"><a href="c05-descriptive-analysis.html#cb61-2" tabindex="-1"></a>  <span class="fu">survey_count</span>() </span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##            n  n_se
 ##        &lt;dbl&gt; &lt;dbl&gt;
 ## 1 123529025. 0.148</code></pre>
 <p>Based on this calculation, the estimated number of households in the U.S. is 123,529,025.</p>
 <p>Alternatively, we could also use the <code>survey_tally()</code> function. The example below yields the same results as <code>survey_count()</code>.</p>
-<div class="sourceCode" id="cb64"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb64-1"><a href="c05-descriptive-analysis.html#cb64-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb64-2"><a href="c05-descriptive-analysis.html#cb64-2" tabindex="-1"></a>  <span class="fu">survey_tally</span>() </span></code></pre></div>
+<div class="sourceCode" id="cb63"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb63-1"><a href="c05-descriptive-analysis.html#cb63-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb63-2"><a href="c05-descriptive-analysis.html#cb63-2" tabindex="-1"></a>  <span class="fu">survey_tally</span>() </span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##            n  n_se
 ##        &lt;dbl&gt; &lt;dbl&gt;
 ## 1 123529025. 0.148</code></pre>
 </div>
-<div id="example-2-estimated-counts-by-subgroups-crosstabs" class="section level4 unnumbered hasAnchor">
-<h4>Example 2: Estimated counts by subgroups (crosstabs)<a href="c05-descriptive-analysis.html#example-2-estimated-counts-by-subgroups-crosstabs" class="anchor-section" aria-label="Anchor link to header"></a></h4>
+<div id="example-2-estimated-counts-by-subgroups-cross-tabs" class="section level4 unnumbered hasAnchor">
+<h4>Example 2: Estimated counts by subgroups (cross-tabs)<a href="c05-descriptive-analysis.html#example-2-estimated-counts-by-subgroups-cross-tabs" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>To calculate the estimated number of observations for specific subgroups, such as Region and Division, we can include the variables of interest in the <code>survey_count()</code> function. In the example below, we calculate the estimated number of housing units by region and division. The argument <code>name =</code> in <code>survey_count()</code> allows us to change the name of the count variable in the output from the default <code>n</code> to <code>N</code>.</p>
-<div class="sourceCode" id="cb66"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb66-1"><a href="c05-descriptive-analysis.html#cb66-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb66-2"><a href="c05-descriptive-analysis.html#cb66-2" tabindex="-1"></a>  <span class="fu">survey_count</span>(Region, Division, <span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
+<div class="sourceCode" id="cb65"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb65-1"><a href="c05-descriptive-analysis.html#cb65-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb65-2"><a href="c05-descriptive-analysis.html#cb65-2" tabindex="-1"></a>  <span class="fu">survey_count</span>(Region, Division, <span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
 <pre><code>## # A tibble: 10 × 4
 ##    Region    Division                   N         N_se
 ##    &lt;fct&gt;     &lt;fct&gt;                  &lt;dbl&gt;        &lt;dbl&gt;
@@ -691,19 +691,19 @@ <h4>Example 2: Estimated counts by subgroups (crosstabs)<a href="c05-descriptive
 ##  8 West      Mountain North      4615844  0.119       
 ##  9 West      Mountain South      4602070  0.0000000492
 ## 10 West      Pacific            18505643. 0.00000295</code></pre>
-<p>When we run the crosstab, we see there are an estimated 5,876,166 housing units in the New England Division.</p>
-<p>The code will result in an error if we try to use the <code>survey_count()</code> syntax with <code>survey_tally()</code>:</p>
-<div class="sourceCode" id="cb68"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb68-1"><a href="c05-descriptive-analysis.html#cb68-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb68-2"><a href="c05-descriptive-analysis.html#cb68-2" tabindex="-1"></a>  <span class="fu">survey_tally</span>(Region, Division, <span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
+<p>When we run the cross-tab, we see there are an estimated 5,876,166 housing units in the New England Division.</p>
+<p>The code results in an error if we try to use the <code>survey_count()</code> syntax with <code>survey_tally()</code>:</p>
+<div class="sourceCode" id="cb67"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb67-1"><a href="c05-descriptive-analysis.html#cb67-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb67-2"><a href="c05-descriptive-analysis.html#cb67-2" tabindex="-1"></a>  <span class="fu">survey_tally</span>(Region, Division, <span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
 <pre><code>## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3:
 ## ℹ In argument: `N = survey_total(Region, vartype = vartype,
 ##   na.rm = TRUE)`.
 ## Caused by error:
 ## ! Factor not allowed in survey functions, should be used as a grouping variable.</code></pre>
-<p>Use a <code>group_by()</code> function prior to using <code>survey_tally()</code> to successfully run the crosstab:</p>
-<div class="sourceCode" id="cb70"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb70-1"><a href="c05-descriptive-analysis.html#cb70-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb70-2"><a href="c05-descriptive-analysis.html#cb70-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region, Division) <span class="sc">%&gt;%</span></span>
-<span id="cb70-3"><a href="c05-descriptive-analysis.html#cb70-3" tabindex="-1"></a>  <span class="fu">survey_tally</span>(<span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
+<p>Use a <code>group_by()</code> function prior to using <code>survey_tally()</code> to successfully run the cross-tab:</p>
+<div class="sourceCode" id="cb69"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb69-1"><a href="c05-descriptive-analysis.html#cb69-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb69-2"><a href="c05-descriptive-analysis.html#cb69-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region, Division) <span class="sc">%&gt;%</span></span>
+<span id="cb69-3"><a href="c05-descriptive-analysis.html#cb69-3" tabindex="-1"></a>  <span class="fu">survey_tally</span>(<span class="at">name =</span> <span class="st">&quot;N&quot;</span>) </span></code></pre></div>
 <pre><code>## # A tibble: 10 × 4
 ## # Groups:   Region [4]
 ##    Region    Division                   N         N_se
@@ -727,14 +727,14 @@ <h2><span class="header-section-number">5.3</span> Totals and sums<a href="c05-d
 <div id="syntax" class="section level3 hasAnchor" number="5.3.1">
 <h3><span class="header-section-number">5.3.1</span> Syntax<a href="c05-descriptive-analysis.html#syntax" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Here is the syntax:</p>
-<div class="sourceCode" id="cb72"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb72-1"><a href="c05-descriptive-analysis.html#cb72-1" tabindex="-1"></a><span class="fu">survey_total</span>(</span>
-<span id="cb72-2"><a href="c05-descriptive-analysis.html#cb72-2" tabindex="-1"></a>  x,</span>
-<span id="cb72-3"><a href="c05-descriptive-analysis.html#cb72-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb72-4"><a href="c05-descriptive-analysis.html#cb72-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb72-5"><a href="c05-descriptive-analysis.html#cb72-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb72-6"><a href="c05-descriptive-analysis.html#cb72-6" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb72-7"><a href="c05-descriptive-analysis.html#cb72-7" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb72-8"><a href="c05-descriptive-analysis.html#cb72-8" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb71"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb71-1"><a href="c05-descriptive-analysis.html#cb71-1" tabindex="-1"></a><span class="fu">survey_total</span>(</span>
+<span id="cb71-2"><a href="c05-descriptive-analysis.html#cb71-2" tabindex="-1"></a>  x,</span>
+<span id="cb71-3"><a href="c05-descriptive-analysis.html#cb71-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb71-4"><a href="c05-descriptive-analysis.html#cb71-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb71-5"><a href="c05-descriptive-analysis.html#cb71-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb71-6"><a href="c05-descriptive-analysis.html#cb71-6" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb71-7"><a href="c05-descriptive-analysis.html#cb71-7" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb71-8"><a href="c05-descriptive-analysis.html#cb71-8" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>x</code>: a variable, expression, or empty</li>
@@ -749,20 +749,20 @@ <h3><span class="header-section-number">5.3.1</span> Syntax<a href="c05-descript
 <h3><span class="header-section-number">5.3.2</span> Examples<a href="c05-descriptive-analysis.html#examples-1" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="example-1-estimated-population-count-1" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Estimated population count<a href="c05-descriptive-analysis.html#example-1-estimated-population-count-1" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>To calculate a population count estimate with <code>survey_total()</code>, we leave the argument <code>x</code> empty as shown in the example below:</p>
-<div class="sourceCode" id="cb73"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb73-1"><a href="c05-descriptive-analysis.html#cb73-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb73-2"><a href="c05-descriptive-analysis.html#cb73-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Tot =</span> <span class="fu">survey_total</span>())  </span></code></pre></div>
+<p>To calculate a population count estimate with <code>survey_total()</code>, we leave the argument <code>x</code> empty, as shown in the example below:</p>
+<div class="sourceCode" id="cb72"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb72-1"><a href="c05-descriptive-analysis.html#cb72-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb72-2"><a href="c05-descriptive-analysis.html#cb72-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Tot =</span> <span class="fu">survey_total</span>())  </span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##          Tot Tot_se
 ##        &lt;dbl&gt;  &lt;dbl&gt;
 ## 1 123529025.  0.148</code></pre>
-<p>The estimated number of households in the U.S. is 123,529,025. Note that this result obtained from <code>recs_des %&gt;% summarize(survey_total())</code> is equivalent to the ones from the <code>survey_count()</code> and <code>survey_tally()</code> functions. However, the <code>survey_total()</code> function is called within <code>summarize</code>, whereas <code>survey_count()</code> and <code>survey_tally()</code> are not.</p>
+<p>The estimated number of households in the U.S. is 123,529,025. Note that the result obtained from <code>survey_total()</code> is equivalent to those from the <code>survey_count()</code> and <code>survey_tally()</code> functions. However, the <code>survey_total()</code> function is called within <code>summarize()</code>, whereas <code>survey_count()</code> and <code>survey_tally()</code> are not.</p>
 </div>
 <div id="example-2-overall-summation-of-continuous-variables" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Overall summation of continuous variables<a href="c05-descriptive-analysis.html#example-2-overall-summation-of-continuous-variables" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>The distinction between <code>survey_total()</code> and <code>survey_count()</code> becomes more evident when working with continuous variables. Let’s compute the total cost of electricity in whole dollars from variable <code>DOLLAREL</code><a href="#fn4" class="footnote-ref" id="fnref4"><sup>4</sup></a>.</p>
-<div class="sourceCode" id="cb75"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb75-1"><a href="c05-descriptive-analysis.html#cb75-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb75-2"><a href="c05-descriptive-analysis.html#cb75-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_total</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb74"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb74-1"><a href="c05-descriptive-analysis.html#cb74-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb74-2"><a href="c05-descriptive-analysis.html#cb74-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_total</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##       elec_bill elec_bill_se
 ##           &lt;dbl&gt;        &lt;dbl&gt;
@@ -772,10 +772,10 @@ <h4>Example 2: Overall summation of continuous variables<a href="c05-descriptive
 <div id="example-3-summation-by-groups" class="section level4 unnumbered hasAnchor">
 <h4>Example 3: Summation by groups<a href="c05-descriptive-analysis.html#example-3-summation-by-groups" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Since we are using the {srvyr} package, we can use <code>group_by()</code> to calculate the cost of electricity for different groups. Let’s examine the variations in the cost of electricity in whole dollars across regions and display the confidence interval instead of the default standard error.</p>
-<div class="sourceCode" id="cb77"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb77-1"><a href="c05-descriptive-analysis.html#cb77-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb77-2"><a href="c05-descriptive-analysis.html#cb77-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb77-3"><a href="c05-descriptive-analysis.html#cb77-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_total</span>(DOLLAREL,</span>
-<span id="cb77-4"><a href="c05-descriptive-analysis.html#cb77-4" tabindex="-1"></a>                                     <span class="at">vartype =</span> <span class="st">&quot;ci&quot;</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb76"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb76-1"><a href="c05-descriptive-analysis.html#cb76-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb76-2"><a href="c05-descriptive-analysis.html#cb76-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb76-3"><a href="c05-descriptive-analysis.html#cb76-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_total</span>(DOLLAREL,</span>
+<span id="cb76-4"><a href="c05-descriptive-analysis.html#cb76-4" tabindex="-1"></a>                                     <span class="at">vartype =</span> <span class="st">&quot;ci&quot;</span>))</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 4
 ##   Region       elec_bill elec_bill_low elec_bill_upp
 ##   &lt;fct&gt;            &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;
@@ -783,37 +783,37 @@ <h4>Example 3: Summation by groups<a href="c05-descriptive-analysis.html#example
 ## 2 Midwest   34972544751.  34339576041.  35605513460.
 ## 3 South     72496840204.  71534780902.  73458899506.
 ## 4 West      33573773008.  32909111702.  34238434313.</code></pre>
-<p>The survey results estimate that households in the Northeast spent $29,430,369,947 with a confidence interval of ($28,788,987,554, $30,071,752,341) on electricity in 2020 while households in the South spent an estimated $72,496,840,204 with a confidence interval of ($28,788,987,554, $73,458,899,506).</p>
-<p>As we calculate these numbers, we may notice that the confidence interval of the South is larger than those of other regions. This implies that we have less certainty about the true value of electricity spending in the South. A larger confidence interval could be due to a variety of factors, such as a wider range of electricity spending in the South. We could try to analyze smaller regions within the South to identify areas that are contributing to more variability. Descriptive analyses serve as a valuable starting point for more in-depth exploration and analysis.</p>
+<p>The survey results estimate that households in the Northeast spent $29,430,369,947 with a confidence interval of ($28,788,987,554, $30,071,752,341) on electricity in 2020, while households in the South spent an estimated $72,496,840,204 with a confidence interval of ($71,534,780,902, $73,458,899,506).</p>
 </div>
 </div>
 </div>
 <div id="desc-meanprop" class="section level2 hasAnchor" number="5.4">
 <h2><span class="header-section-number">5.4</span> Means and proportions<a href="c05-descriptive-analysis.html#desc-meanprop" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Means and proportions form the backbone of many research studies. These estimates are often the first things we look for when reviewing research on a given topic. The <code>survey_mean()</code> and <code>survey_prop()</code> functions calculate means and proportions while taking into account the survey design elements. The <code>survey_mean()</code> function should be used on continuous variables of survey data, while the <code>survey_prop()</code> function should be used on categorical variables. These topics are grouped together because a proportion is a mean of a logical (Boolean) variable.</p>
+<p>Means and proportions form the foundation of many research studies. These estimates are often the first things we look for when reviewing research on a given topic. The <code>survey_mean()</code> and <code>survey_prop()</code> functions calculate means and proportions while taking into account the survey design elements. The <code>survey_mean()</code> function should be used on continuous variables of survey data, while the <code>survey_prop()</code> function should be used on categorical variables.</p>
 <div id="desc-meanprop-syntax" class="section level3 hasAnchor" number="5.4.1">
 <h3><span class="header-section-number">5.4.1</span> Syntax<a href="c05-descriptive-analysis.html#desc-meanprop-syntax" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The syntax for both means and proportions are very similar:</p>
-<div class="sourceCode" id="cb79"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb79-1"><a href="c05-descriptive-analysis.html#cb79-1" tabindex="-1"></a><span class="fu">survey_mean</span>(</span>
-<span id="cb79-2"><a href="c05-descriptive-analysis.html#cb79-2" tabindex="-1"></a>  x,</span>
-<span id="cb79-3"><a href="c05-descriptive-analysis.html#cb79-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb79-4"><a href="c05-descriptive-analysis.html#cb79-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb79-5"><a href="c05-descriptive-analysis.html#cb79-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb79-6"><a href="c05-descriptive-analysis.html#cb79-6" tabindex="-1"></a>  <span class="at">proportion =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb79-7"><a href="c05-descriptive-analysis.html#cb79-7" tabindex="-1"></a>  <span class="at">prop_method =</span> <span class="fu">c</span>(<span class="st">&quot;logit&quot;</span>, <span class="st">&quot;likelihood&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;mean&quot;</span>),</span>
-<span id="cb79-8"><a href="c05-descriptive-analysis.html#cb79-8" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb79-9"><a href="c05-descriptive-analysis.html#cb79-9" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb79-10"><a href="c05-descriptive-analysis.html#cb79-10" tabindex="-1"></a>)</span>
-<span id="cb79-11"><a href="c05-descriptive-analysis.html#cb79-11" tabindex="-1"></a></span>
-<span id="cb79-12"><a href="c05-descriptive-analysis.html#cb79-12" tabindex="-1"></a><span class="fu">survey_prop</span>(</span>
-<span id="cb79-13"><a href="c05-descriptive-analysis.html#cb79-13" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb79-14"><a href="c05-descriptive-analysis.html#cb79-14" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb79-15"><a href="c05-descriptive-analysis.html#cb79-15" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb79-16"><a href="c05-descriptive-analysis.html#cb79-16" tabindex="-1"></a>  <span class="at">proportion =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb79-17"><a href="c05-descriptive-analysis.html#cb79-17" tabindex="-1"></a>  <span class="at">prop_method =</span> <span class="fu">c</span>(<span class="st">&quot;logit&quot;</span>, <span class="st">&quot;likelihood&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;mean&quot;</span>, <span class="st">&quot;xlogit&quot;</span>),</span>
-<span id="cb79-18"><a href="c05-descriptive-analysis.html#cb79-18" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb79-19"><a href="c05-descriptive-analysis.html#cb79-19" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb79-20"><a href="c05-descriptive-analysis.html#cb79-20" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb78"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb78-1"><a href="c05-descriptive-analysis.html#cb78-1" tabindex="-1"></a><span class="fu">survey_mean</span>(</span>
+<span id="cb78-2"><a href="c05-descriptive-analysis.html#cb78-2" tabindex="-1"></a>  x,</span>
+<span id="cb78-3"><a href="c05-descriptive-analysis.html#cb78-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb78-4"><a href="c05-descriptive-analysis.html#cb78-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb78-5"><a href="c05-descriptive-analysis.html#cb78-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb78-6"><a href="c05-descriptive-analysis.html#cb78-6" tabindex="-1"></a>  <span class="at">proportion =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb78-7"><a href="c05-descriptive-analysis.html#cb78-7" tabindex="-1"></a>  <span class="at">prop_method =</span> <span class="fu">c</span>(<span class="st">&quot;logit&quot;</span>, <span class="st">&quot;likelihood&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;mean&quot;</span>),</span>
+<span id="cb78-8"><a href="c05-descriptive-analysis.html#cb78-8" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb78-9"><a href="c05-descriptive-analysis.html#cb78-9" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb78-10"><a href="c05-descriptive-analysis.html#cb78-10" tabindex="-1"></a>)</span>
+<span id="cb78-11"><a href="c05-descriptive-analysis.html#cb78-11" tabindex="-1"></a></span>
+<span id="cb78-12"><a href="c05-descriptive-analysis.html#cb78-12" tabindex="-1"></a><span class="fu">survey_prop</span>(</span>
+<span id="cb78-13"><a href="c05-descriptive-analysis.html#cb78-13" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb78-14"><a href="c05-descriptive-analysis.html#cb78-14" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb78-15"><a href="c05-descriptive-analysis.html#cb78-15" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb78-16"><a href="c05-descriptive-analysis.html#cb78-16" tabindex="-1"></a>  <span class="at">proportion =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb78-17"><a href="c05-descriptive-analysis.html#cb78-17" tabindex="-1"></a>  <span class="at">prop_method =</span> </span>
+<span id="cb78-18"><a href="c05-descriptive-analysis.html#cb78-18" tabindex="-1"></a>    <span class="fu">c</span>(<span class="st">&quot;logit&quot;</span>, <span class="st">&quot;likelihood&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;mean&quot;</span>, <span class="st">&quot;xlogit&quot;</span>),</span>
+<span id="cb78-19"><a href="c05-descriptive-analysis.html#cb78-19" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb78-20"><a href="c05-descriptive-analysis.html#cb78-20" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb78-21"><a href="c05-descriptive-analysis.html#cb78-21" tabindex="-1"></a>)</span></code></pre></div>
 <p>Both functions have the following arguments and defaults:</p>
 <ul>
 <li><code>na.rm</code>: an indicator of whether missing values should be dropped, defaults to <code>FALSE</code></li>
@@ -824,29 +824,26 @@ <h3><span class="header-section-number">5.4.1</span> Syntax<a href="c05-descript
 <li><code>df</code>: (for <code>vartype = 'ci'</code>), a numeric value indicating degrees of freedom for the t-distribution</li>
 </ul>
 <p>There are two main differences in the syntax. The <code>survey_mean()</code> function includes the first argument <code>x</code>, representing the variable or expression on which the mean should be calculated. The <code>survey_prop()</code> does not have an argument to include the variables directly. Instead, prior to <code>summarize()</code>, we must use the <code>group_by()</code> function to specify the variables of interest for <code>survey_prop()</code>. For <code>survey_mean()</code>, including a <code>group_by()</code> function allows us to obtain the means by different groups.</p>
-<p>The other main difference is with the <code>proportion</code> argument. The <code>survey_mean()</code> function can be used to calculate both means and proportions. Its <code>proportion</code> argument defaults to <code>FALSE</code>, indicating it is used for calculating means. If we wish to calculate a proportion using <code>survey_mean()</code>, we will need to set the <code>proportion</code> argument to <code>TRUE</code>. In the <code>survey_prop()</code> function, the <code>proportion</code> argument defaults to <code>TRUE</code> because the function is specifically designed for calculating proportions.</p>
-<p>In section <a href="c05-descriptive-analysis.html#desc-count-syntax">5.2.1</a>, we provide an overview of different variability types. The confidence interval used for most measures, such as means and counts, is referred to as a Wald-type interval. However, for proportions, a Wald-type interval with a symmetric t-based confidence interval may not provide accurate coverage, especially when dealing with small sample sizes or proportions “near” 0 or 1. We can use other methods to calculate confidence intervals, which we specify using the <code>prop_method</code> option in <code>survey_prop()</code>. The options include:</p>
+<p>The other main difference is with the <code>proportion</code> argument. The <code>survey_mean()</code> function can be used to calculate both means and proportions. Its <code>proportion</code> argument defaults to <code>FALSE</code>, indicating it is used for calculating means. If we wish to calculate a proportion using <code>survey_mean()</code>, we need to set the <code>proportion</code> argument to <code>TRUE</code>. In the <code>survey_prop()</code> function, the <code>proportion</code> argument defaults to <code>TRUE</code> because the function is specifically designed for calculating proportions.</p>
+<p>In Section <a href="c05-descriptive-analysis.html#desc-count-syntax">5.2.1</a>, we provide an overview of different variability types. The confidence interval used for most measures, such as means and counts, is referred to as a Wald-type interval. However, for proportions, a Wald-type interval with a symmetric t-based confidence interval may not provide accurate coverage, especially when dealing with small sample sizes or proportions “near” 0 or 1. We can use other methods to calculate confidence intervals, which we specify using the <code>prop_method</code> option in <code>survey_prop()</code>. The options include:</p>
 <ul>
 <li><code>logit</code>: fits a logistic regression model and computes a Wald-type interval on the log-odds scale, which is then transformed to the probability scale. This is the default method.</li>
 <li><code>likelihood</code>: uses the (Rao-Scott) scaled chi-squared distribution for the log-likelihood from a binomial distribution.</li>
 <li><code>asin</code>: uses the variance-stabilizing transformation for the binomial distribution, the arcsine square root, and then back-transforms the interval to the probability scale</li>
 <li><code>beta</code>: uses the incomplete beta function with an effective sample size based on the estimated variance of the proportion.</li>
 <li><code>mean</code>: the Wald-type interval (<span class="math inline">\(\pm t_{df}^*\times SE\)</span>)</li>
-<li><code>xlogit</code>: uses a logit transformation of the proportion, calculates a Wald-type interval, and then back-transforms to the probability scale. This method is the same as those used in SUDAAN and SPSS.</li>
+<li><code>xlogit</code>: uses a logit transformation of the proportion, calculates a Wald-type interval, and then back-transforms to the probability scale. This is the same method used by default in SUDAAN and SPSS.</li>
 </ul>
-<p>Each option will yield slightly different confidence interval bounds when dealing with proportions. Please note that when working with <code>survey_mean()</code>, we do not need to specify a method unless the <code>proportion</code> argument is <code>TRUE</code>. If <code>proportion</code> is <code>FALSE</code>, it calculates a symmetric <code>mean</code> type of confidence interval.</p>
+<p>Each option yields slightly different confidence interval bounds when dealing with proportions. Please note that when working with <code>survey_mean()</code>, we do not need to specify a method unless the <code>proportion</code> argument is <code>TRUE</code>. If <code>proportion</code> is <code>FALSE</code>, it calculates a symmetric <code>mean</code> type of confidence interval.</p>
 </div>
 <div id="examples-2" class="section level3 hasAnchor" number="5.4.2">
 <h3><span class="header-section-number">5.4.2</span> Examples<a href="c05-descriptive-analysis.html#examples-2" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="example-1-one-variable-proportion" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: One variable proportion<a href="c05-descriptive-analysis.html#example-1-one-variable-proportion" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>If we are interested in obtaining the proportion of people in each region in the RECS data, we can use <code>group_by()</code> and <code>survey_prop()</code> as shown below:</p>
-<div class="sourceCode" id="cb80"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb80-1"><a href="c05-descriptive-analysis.html#cb80-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb80-2"><a href="c05-descriptive-analysis.html#cb80-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb80-3"><a href="c05-descriptive-analysis.html#cb80-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>()) </span></code></pre></div>
-<pre><code>## When `proportion` is unspecified, `survey_prop()` now defaults to `proportion = TRUE`.
-## ℹ This should improve confidence interval coverage.
-## This message is displayed once per session.</code></pre>
+<div class="sourceCode" id="cb79"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb79-1"><a href="c05-descriptive-analysis.html#cb79-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb79-2"><a href="c05-descriptive-analysis.html#cb79-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb79-3"><a href="c05-descriptive-analysis.html#cb79-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>()) </span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   Region        p           p_se
 ##   &lt;fct&gt;     &lt;dbl&gt;          &lt;dbl&gt;
@@ -855,10 +852,10 @@ <h4>Example 1: One variable proportion<a href="c05-descriptive-analysis.html#exa
 ## 3 South     0.379 0.000000000740
 ## 4 West      0.224 0.000000000816</code></pre>
 <p>17.7% of the households are in the Northeast, 21.9% in the Midwest, and so on. Note that the proportions in column <code>p</code> add up to one.</p>
-<p>The <code>survey_prop()</code> function is essentially the same as using <code>survey_mean()</code> with a categorical variable and without specifying a numeric variable in the <code>x</code> argument. The following code will give us the same results as above:</p>
-<div class="sourceCode" id="cb83"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb83-1"><a href="c05-descriptive-analysis.html#cb83-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb83-2"><a href="c05-descriptive-analysis.html#cb83-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb83-3"><a href="c05-descriptive-analysis.html#cb83-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>())</span></code></pre></div>
+<p>The <code>survey_prop()</code> function is essentially the same as using <code>survey_mean()</code> with a categorical variable and without specifying a numeric variable in the <code>x</code> argument. The following code gives us the same results as above:</p>
+<div class="sourceCode" id="cb81"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb81-1"><a href="c05-descriptive-analysis.html#cb81-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb81-2"><a href="c05-descriptive-analysis.html#cb81-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb81-3"><a href="c05-descriptive-analysis.html#cb81-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   Region        p           p_se
 ##   &lt;fct&gt;     &lt;dbl&gt;          &lt;dbl&gt;
@@ -869,10 +866,10 @@ <h4>Example 1: One variable proportion<a href="c05-descriptive-analysis.html#exa
 </div>
 <div id="example-2-conditional-proportions" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Conditional proportions<a href="c05-descriptive-analysis.html#example-2-conditional-proportions" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>We can also obtain proportions by more than one variable. In the following example, we look at the proportion of housing units by Region and whether air conditioning is used (<code>ACUsed</code>).<a href="#fn5" class="footnote-ref" id="fnref5"><sup>5</sup></a></p>
-<div class="sourceCode" id="cb85"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb85-1"><a href="c05-descriptive-analysis.html#cb85-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb85-2"><a href="c05-descriptive-analysis.html#cb85-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region, ACUsed) <span class="sc">%&gt;%</span></span>
-<span id="cb85-3"><a href="c05-descriptive-analysis.html#cb85-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
+<p>We can also obtain proportions by more than one variable. In the following example, we look at the proportion of housing units by Region and whether air conditioning (A/C) is used (<code>ACUsed</code>).<a href="#fn5" class="footnote-ref" id="fnref5"><sup>5</sup></a></p>
+<div class="sourceCode" id="cb83"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb83-1"><a href="c05-descriptive-analysis.html#cb83-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb83-2"><a href="c05-descriptive-analysis.html#cb83-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region, ACUsed) <span class="sc">%&gt;%</span></span>
+<span id="cb83-3"><a href="c05-descriptive-analysis.html#cb83-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 8 × 4
 ## # Groups:   Region [4]
 ##   Region    ACUsed      p    p_se
@@ -885,14 +882,14 @@ <h4>Example 2: Conditional proportions<a href="c05-descriptive-analysis.html#exa
 ## 6 South     TRUE   0.942  0.00278
 ## 7 West      FALSE  0.255  0.00759
 ## 8 West      TRUE   0.745  0.00759</code></pre>
-<p>When specifying multiple variables, the proportions are conditional. In the results above, notice that the proportions sum to 1 within each region. This can be interpreted as the proportion of housing units with air conditioning <em>within</em> each region. For example, in the Northeast region, approximately 11.0% of housing units don’t have air conditioning, while around 89.0% have air conditioning.</p>
+<p>When specifying multiple variables, the proportions are conditional. In the results above, notice that the proportions sum to 1 within each region. This can be interpreted as the proportion of housing units with A/C <em>within</em> each region. For example, in the Northeast region, approximately 11.0% of housing units don’t have A/C, while around 89.0% have A/C.</p>
 </div>
 <div id="example-3-joint-proportions" class="section level4 unnumbered hasAnchor">
 <h4>Example 3: Joint proportions<a href="c05-descriptive-analysis.html#example-3-joint-proportions" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>If we’re interested in a joint proportion, we use the <code>interact()</code> function. In the example below, we apply the <code>interact()</code> function to <code>Region</code> and <code>ACUsed</code>:</p>
-<div class="sourceCode" id="cb87"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb87-1"><a href="c05-descriptive-analysis.html#cb87-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb87-2"><a href="c05-descriptive-analysis.html#cb87-2" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(Region, ACUsed)) <span class="sc">%&gt;%</span></span>
-<span id="cb87-3"><a href="c05-descriptive-analysis.html#cb87-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
+<div class="sourceCode" id="cb85"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb85-1"><a href="c05-descriptive-analysis.html#cb85-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb85-2"><a href="c05-descriptive-analysis.html#cb85-2" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(Region, ACUsed)) <span class="sc">%&gt;%</span></span>
+<span id="cb85-3"><a href="c05-descriptive-analysis.html#cb85-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 8 × 4
 ##   Region    ACUsed      p    p_se
 ##   &lt;fct&gt;     &lt;lgl&gt;   &lt;dbl&gt;   &lt;dbl&gt;
@@ -904,14 +901,14 @@ <h4>Example 3: Joint proportions<a href="c05-descriptive-analysis.html#example-3
 ## 6 South     TRUE   0.357  0.00106
 ## 7 West      FALSE  0.0573 0.00170
 ## 8 West      TRUE   0.167  0.00170</code></pre>
-<p>As noted earlier, we can use both the <code>survey_prop()</code> and <code>survey_mean()</code> functions, and they will produce the same results.</p>
+<p>In this case, all proportions sum to 1, not just within regions. This means that 15.8% of housing units are in the Northeast and have A/C. As noted earlier, we can use both the <code>survey_prop()</code> and <code>survey_mean()</code> functions, and they produce the same results.</p>
 </div>
 <div id="example-4-overall-mean" class="section level4 unnumbered hasAnchor">
 <h4>Example 4: Overall mean<a href="c05-descriptive-analysis.html#example-4-overall-mean" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Below, we calculate the estimated average cost of electricity in the U.S. using <code>survey_mean()</code>. To include both the standard error and the confidence interval, we can include them in the <code>vartype</code> argument:</p>
-<div class="sourceCode" id="cb89"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb89-1"><a href="c05-descriptive-analysis.html#cb89-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb89-2"><a href="c05-descriptive-analysis.html#cb89-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL,</span>
-<span id="cb89-3"><a href="c05-descriptive-analysis.html#cb89-3" tabindex="-1"></a>                                    <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
+<div class="sourceCode" id="cb87"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb87-1"><a href="c05-descriptive-analysis.html#cb87-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb87-2"><a href="c05-descriptive-analysis.html#cb87-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL,</span>
+<span id="cb87-3"><a href="c05-descriptive-analysis.html#cb87-3" tabindex="-1"></a>                                    <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   elec_bill elec_bill_se elec_bill_low elec_bill_upp
 ##       &lt;dbl&gt;        &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;
@@ -921,9 +918,9 @@ <h4>Example 4: Overall mean<a href="c05-descriptive-analysis.html#example-4-over
 <div id="example-5-means-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 5: Means by subgroup<a href="c05-descriptive-analysis.html#example-5-means-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>We can also calculate the estimated average cost of electricity in the U.S. by each region. To do this, we include a <code>group_by()</code> function with the variable of interest before the <code>summarize()</code> function:</p>
-<div class="sourceCode" id="cb91"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb91-1"><a href="c05-descriptive-analysis.html#cb91-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb91-2"><a href="c05-descriptive-analysis.html#cb91-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb91-3"><a href="c05-descriptive-analysis.html#cb91-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb89"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb89-1"><a href="c05-descriptive-analysis.html#cb89-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb89-2"><a href="c05-descriptive-analysis.html#cb89-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb89-3"><a href="c05-descriptive-analysis.html#cb89-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   Region    elec_bill elec_bill_se
 ##   &lt;fct&gt;         &lt;dbl&gt;        &lt;dbl&gt;
@@ -938,32 +935,34 @@ <h4>Example 5: Means by subgroup<a href="c05-descriptive-analysis.html#example-5
 <div id="quantiles-and-medians" class="section level2 hasAnchor" number="5.5">
 <h2><span class="header-section-number">5.5</span> Quantiles and medians<a href="c05-descriptive-analysis.html#quantiles-and-medians" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>To better understand the distribution of a continuous variable like income, we can calculate quantiles at specific points. For example, computing estimates of the quartiles (25%, 50%, 75%) helps us understand how income is spread across the population. We use the <code>survey_quantile()</code> function to calculate quantiles in survey data.</p>
-<p>Medians are useful for finding the midpoint of a continuous distribution when the data is skewed, as medians are less affected by outliers than means. The median is the same as the 50th percentile, meaning the value where 50% of the data is higher and 50% is lower. Because medians are a special, common case of quantiles, we have a dedicated function called <code>survey_median()</code> for calculating the median in survey data. Alternatively, we can use the <code>survey_quantile()</code> function with the <code>quantiles</code> argument set to <code>0.5</code> to achieve the same result.</p>
+<p>Medians are useful for finding the midpoint of a continuous distribution when the data are skewed, as medians are less affected by outliers compared to means. The median is the same as the 50th percentile, meaning the value where 50% of the data are higher and 50% are lower. Because medians are a special, common case of quantiles, we have a dedicated function called <code>survey_median()</code> for calculating the median in survey data. Alternatively, we can use the <code>survey_quantile()</code> function with the <code>quantiles</code> argument set to <code>0.5</code> to achieve the same result.</p>
 <div id="syntax-1" class="section level3 hasAnchor" number="5.5.1">
 <h3><span class="header-section-number">5.5.1</span> Syntax<a href="c05-descriptive-analysis.html#syntax-1" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The syntax for <code>survey_quantile()</code> and <code>survey_median()</code> is nearly identical:</p>
-<div class="sourceCode" id="cb93"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb93-1"><a href="c05-descriptive-analysis.html#cb93-1" tabindex="-1"></a><span class="fu">survey_quantile</span>(</span>
-<span id="cb93-2"><a href="c05-descriptive-analysis.html#cb93-2" tabindex="-1"></a>  x,</span>
-<span id="cb93-3"><a href="c05-descriptive-analysis.html#cb93-3" tabindex="-1"></a>  quantiles,</span>
-<span id="cb93-4"><a href="c05-descriptive-analysis.html#cb93-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb93-5"><a href="c05-descriptive-analysis.html#cb93-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb93-6"><a href="c05-descriptive-analysis.html#cb93-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb93-7"><a href="c05-descriptive-analysis.html#cb93-7" tabindex="-1"></a>  <span class="at">interval_type =</span> <span class="fu">c</span>(<span class="st">&quot;mean&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;xlogit&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;score&quot;</span>, <span class="st">&quot;quantile&quot;</span>),</span>
-<span id="cb93-8"><a href="c05-descriptive-analysis.html#cb93-8" tabindex="-1"></a>  <span class="at">qrule =</span> <span class="fu">c</span>(<span class="st">&quot;math&quot;</span>, <span class="st">&quot;school&quot;</span>, <span class="st">&quot;shahvaish&quot;</span>, <span class="st">&quot;hf1&quot;</span>, <span class="st">&quot;hf2&quot;</span>, <span class="st">&quot;hf3&quot;</span>, </span>
-<span id="cb93-9"><a href="c05-descriptive-analysis.html#cb93-9" tabindex="-1"></a>            <span class="st">&quot;hf4&quot;</span>, <span class="st">&quot;hf5&quot;</span>, <span class="st">&quot;hf6&quot;</span>, <span class="st">&quot;hf7&quot;</span>, <span class="st">&quot;hf8&quot;</span>, <span class="st">&quot;hf9&quot;</span>),</span>
-<span id="cb93-10"><a href="c05-descriptive-analysis.html#cb93-10" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb93-11"><a href="c05-descriptive-analysis.html#cb93-11" tabindex="-1"></a>)</span>
-<span id="cb93-12"><a href="c05-descriptive-analysis.html#cb93-12" tabindex="-1"></a></span>
-<span id="cb93-13"><a href="c05-descriptive-analysis.html#cb93-13" tabindex="-1"></a><span class="fu">survey_median</span>(</span>
-<span id="cb93-14"><a href="c05-descriptive-analysis.html#cb93-14" tabindex="-1"></a>  x,</span>
-<span id="cb93-15"><a href="c05-descriptive-analysis.html#cb93-15" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb93-16"><a href="c05-descriptive-analysis.html#cb93-16" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb93-17"><a href="c05-descriptive-analysis.html#cb93-17" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb93-18"><a href="c05-descriptive-analysis.html#cb93-18" tabindex="-1"></a>  <span class="at">interval_type =</span> <span class="fu">c</span>(<span class="st">&quot;mean&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;xlogit&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;score&quot;</span>, <span class="st">&quot;quantile&quot;</span>),</span>
-<span id="cb93-19"><a href="c05-descriptive-analysis.html#cb93-19" tabindex="-1"></a>  <span class="at">qrule =</span> <span class="fu">c</span>(<span class="st">&quot;math&quot;</span>, <span class="st">&quot;school&quot;</span>, <span class="st">&quot;shahvaish&quot;</span>, <span class="st">&quot;hf1&quot;</span>, <span class="st">&quot;hf2&quot;</span>, <span class="st">&quot;hf3&quot;</span>, </span>
-<span id="cb93-20"><a href="c05-descriptive-analysis.html#cb93-20" tabindex="-1"></a>            <span class="st">&quot;hf4&quot;</span>, <span class="st">&quot;hf5&quot;</span>, <span class="st">&quot;hf6&quot;</span>, <span class="st">&quot;hf7&quot;</span>, <span class="st">&quot;hf8&quot;</span>, <span class="st">&quot;hf9&quot;</span>),</span>
-<span id="cb93-21"><a href="c05-descriptive-analysis.html#cb93-21" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb93-22"><a href="c05-descriptive-analysis.html#cb93-22" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb91"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb91-1"><a href="c05-descriptive-analysis.html#cb91-1" tabindex="-1"></a><span class="fu">survey_quantile</span>(</span>
+<span id="cb91-2"><a href="c05-descriptive-analysis.html#cb91-2" tabindex="-1"></a>  x,</span>
+<span id="cb91-3"><a href="c05-descriptive-analysis.html#cb91-3" tabindex="-1"></a>  quantiles,</span>
+<span id="cb91-4"><a href="c05-descriptive-analysis.html#cb91-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb91-5"><a href="c05-descriptive-analysis.html#cb91-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb91-6"><a href="c05-descriptive-analysis.html#cb91-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb91-7"><a href="c05-descriptive-analysis.html#cb91-7" tabindex="-1"></a>  <span class="at">interval_type =</span> </span>
+<span id="cb91-8"><a href="c05-descriptive-analysis.html#cb91-8" tabindex="-1"></a>    <span class="fu">c</span>(<span class="st">&quot;mean&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;xlogit&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;score&quot;</span>, <span class="st">&quot;quantile&quot;</span>),</span>
+<span id="cb91-9"><a href="c05-descriptive-analysis.html#cb91-9" tabindex="-1"></a>  <span class="at">qrule =</span> <span class="fu">c</span>(<span class="st">&quot;math&quot;</span>, <span class="st">&quot;school&quot;</span>, <span class="st">&quot;shahvaish&quot;</span>, <span class="st">&quot;hf1&quot;</span>, <span class="st">&quot;hf2&quot;</span>, <span class="st">&quot;hf3&quot;</span>, </span>
+<span id="cb91-10"><a href="c05-descriptive-analysis.html#cb91-10" tabindex="-1"></a>            <span class="st">&quot;hf4&quot;</span>, <span class="st">&quot;hf5&quot;</span>, <span class="st">&quot;hf6&quot;</span>, <span class="st">&quot;hf7&quot;</span>, <span class="st">&quot;hf8&quot;</span>, <span class="st">&quot;hf9&quot;</span>),</span>
+<span id="cb91-11"><a href="c05-descriptive-analysis.html#cb91-11" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb91-12"><a href="c05-descriptive-analysis.html#cb91-12" tabindex="-1"></a>)</span>
+<span id="cb91-13"><a href="c05-descriptive-analysis.html#cb91-13" tabindex="-1"></a></span>
+<span id="cb91-14"><a href="c05-descriptive-analysis.html#cb91-14" tabindex="-1"></a><span class="fu">survey_median</span>(</span>
+<span id="cb91-15"><a href="c05-descriptive-analysis.html#cb91-15" tabindex="-1"></a>  x,</span>
+<span id="cb91-16"><a href="c05-descriptive-analysis.html#cb91-16" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb91-17"><a href="c05-descriptive-analysis.html#cb91-17" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb91-18"><a href="c05-descriptive-analysis.html#cb91-18" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb91-19"><a href="c05-descriptive-analysis.html#cb91-19" tabindex="-1"></a>  <span class="at">interval_type =</span> </span>
+<span id="cb91-20"><a href="c05-descriptive-analysis.html#cb91-20" tabindex="-1"></a>    <span class="fu">c</span>(<span class="st">&quot;mean&quot;</span>, <span class="st">&quot;beta&quot;</span>, <span class="st">&quot;xlogit&quot;</span>, <span class="st">&quot;asin&quot;</span>, <span class="st">&quot;score&quot;</span>, <span class="st">&quot;quantile&quot;</span>),</span>
+<span id="cb91-21"><a href="c05-descriptive-analysis.html#cb91-21" tabindex="-1"></a>  <span class="at">qrule =</span> <span class="fu">c</span>(<span class="st">&quot;math&quot;</span>, <span class="st">&quot;school&quot;</span>, <span class="st">&quot;shahvaish&quot;</span>, <span class="st">&quot;hf1&quot;</span>, <span class="st">&quot;hf2&quot;</span>, <span class="st">&quot;hf3&quot;</span>, </span>
+<span id="cb91-22"><a href="c05-descriptive-analysis.html#cb91-22" tabindex="-1"></a>            <span class="st">&quot;hf4&quot;</span>, <span class="st">&quot;hf5&quot;</span>, <span class="st">&quot;hf6&quot;</span>, <span class="st">&quot;hf7&quot;</span>, <span class="st">&quot;hf8&quot;</span>, <span class="st">&quot;hf9&quot;</span>),</span>
+<span id="cb91-23"><a href="c05-descriptive-analysis.html#cb91-23" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb91-24"><a href="c05-descriptive-analysis.html#cb91-24" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments available in both functions are:</p>
 <ul>
 <li><code>x</code>: a variable, expression, or empty</li>
@@ -971,26 +970,25 @@ <h3><span class="header-section-number">5.5.1</span> Syntax<a href="c05-descript
 <li><code>vartype</code>: type(s) of variation estimate to calculate, defaults to <code>se</code> (standard error)</li>
 <li><code>level</code>: a number or a vector indicating the confidence level, defaults to 0.95</li>
 <li><code>interval_type</code>: method for calculating a confidence interval</li>
-<li><code>qrule</code>: rule for defining quantiles. The default is the lower end of the quantile interval (“math”). The midpoint of the quantile interval is the “school” rule. “hf1” to “hf9” are weighted analogs to type=1 to 9 in <code>quantile()</code>. “shahvaish” corresponds to a rule proposed by <span class="citation">Shah and Vaish (<a href="#ref-shahvaish">2006</a>)</span>. See <code>vignette("qrule", package="survey")</code> for more information.</li>
+<li><code>qrule</code>: rule for defining quantiles. The default is the lower end of the quantile interval (“math”). The midpoint of the quantile interval is the “school” rule. “hf1” to “hf9” are weighted analogs to type=1 to 9 in <code>quantile()</code>. “shahvaish” corresponds to a rule proposed by <span class="citation">Shah and Vaish (<a href="#ref-shahvaish">2006</a>)</span>. See <code>vignette("qrule", package="survey")</code> for more information.</li>
 <li><code>df</code>: (for <code>vartype = 'ci'</code>), a numeric value indicating degrees of freedom for the t-distribution</li>
 </ul>
 <p>The only difference between <code>survey_quantile()</code> and <code>survey_median()</code> is the inclusion of the <code>quantiles</code> argument in the <code>survey_quantile()</code> function. This argument takes a vector with values between 0 and 1 to indicate which quantiles to calculate. For example, if we wanted the quartiles of a variable, we would provide <code>quantiles = c(0.25, 0.5, 0.75)</code>. While we can specify quantiles of 0 and 1, which represent the minimum and maximum, this is not recommended. It only returns the minimum and maximum of the respondents and cannot be extrapolated to the population as there is no valid definition of standard error.</p>
-<p>In Section <a href="c05-descriptive-analysis.html#desc-count-syntax">5.2.1</a>, we provide an overview of the different variability types. The interval used in confidence intervals for most measures, such as means and counts, is referred to as a Wald-type interval. However, this is not always the most accurate interval for quantiles. Similar to confidence intervals for proportions, quantiles have various interval types including asin, beta, mean, and xlogit (see Section <a href="c05-descriptive-analysis.html#desc-meanprop-syntax">5.4.1</a>). Quantiles also have two more methods available:</p>
+<p>In Section <a href="c05-descriptive-analysis.html#desc-count-syntax">5.2.1</a>, we provide an overview of the different variability types. The interval used in confidence intervals for most measures, such as means and counts, is referred to as a Wald-type interval. However, this is not always the most accurate interval for quantiles. Similar to confidence intervals for proportions, quantiles have various interval types, including asin, beta, mean, and xlogit (see Section <a href="c05-descriptive-analysis.html#desc-meanprop-syntax">5.4.1</a>). Quantiles also have two more methods available:</p>
 <ul>
 <li><code>score</code>: the Francisco and Fuller confidence interval based on inverting a score test (only available for design-based survey objects and not replicate-based objects)</li>
 <li><code>quantile</code>: based on the replicates of the quantile. This is not valid for jackknife-type replicates but is available for bootstrap and BRR replicates.</li>
 </ul>
-<p>One note with the <code>score</code> method is that when there are numerous ties in the data, this method may produce confidence intervals that do not contain the estimate. When dealing with a high propensity for ties (e.g., many respondents have the same age), it is recommended to use another method. SUDAAN, for example, uses the <code>score</code> method but adds noise to the values to prevent issues. The documentation in the {survey} package indicates in general, the <code>score</code> method may have poorer performance compared to the beta and logit intervals <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>.</p>
+<p>One note with the <code>score</code> method is that when there are numerous ties in the data, this method may produce confidence intervals that do not contain the estimate. When dealing with a high propensity for ties (e.g., many respondents are the same age), it is recommended to use another method. SUDAAN, for example, uses the <code>score</code> method but adds noise to the values to prevent issues. The documentation in the {survey} package indicates, in general, that the <code>score</code> method may have poorer performance compared to the beta and logit intervals <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>.</p>
 </div>
 <div id="examples-3" class="section level3 hasAnchor" number="5.5.2">
 <h3><span class="header-section-number">5.5.2</span> Examples<a href="c05-descriptive-analysis.html#examples-3" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="example-1-overall-quartiles" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Overall quartiles<a href="c05-descriptive-analysis.html#example-1-overall-quartiles" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Quantiles provide insights into the distribution of a variable. Let’s look into the quartiles, specifically, the first quartile (p=0.25), the median (p=0.5), and the third quartile (p=0.75) of electric bills.</p>
-<div class="sourceCode" id="cb94"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb94-1"><a href="c05-descriptive-analysis.html#cb94-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb94-2"><a href="c05-descriptive-analysis.html#cb94-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
-<span id="cb94-3"><a href="c05-descriptive-analysis.html#cb94-3" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, .<span class="dv">5</span>, <span class="fl">0.75</span>))) <span class="sc">%&gt;%</span></span>
-<span id="cb94-4"><a href="c05-descriptive-analysis.html#cb94-4" tabindex="-1"></a>  <span class="fu">print</span>(<span class="at">width=</span><span class="cn">Inf</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb92"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb92-1"><a href="c05-descriptive-analysis.html#cb92-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb92-2"><a href="c05-descriptive-analysis.html#cb92-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
+<span id="cb92-3"><a href="c05-descriptive-analysis.html#cb92-3" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, .<span class="dv">5</span>, <span class="fl">0.75</span>)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 6
 ##   elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se
 ##           &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;            &lt;dbl&gt;
@@ -998,16 +996,15 @@ <h4>Example 1: Overall quartiles<a href="c05-descriptive-analysis.html#example-1
 ##   elec_bill_q50_se elec_bill_q75_se
 ##              &lt;dbl&gt;            &lt;dbl&gt;
 ## 1             6.33             9.99</code></pre>
-<p>The output above shows the values for the three quartiles and their respective standard errors: the 25th percentile is $795 with a standard error of $5.69, the 50th percentile (median) is $1,215 with a standard error of $6.33, and the 75th percentile is $1,770 with a standard error of $9.99.</p>
+<p>The output above shows the values for the three quartiles of electric bill costs and their respective standard errors: the 25th percentile is $795 with a standard error of $5.69, the 50th percentile (median) is $1,215 with a standard error of $6.33, and the 75th percentile is $1,770 with a standard error of $9.99.</p>
 </div>
 <div id="example-2-quartiles-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Quartiles by subgroup<a href="c05-descriptive-analysis.html#example-2-quartiles-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>We can estimate the quantiles of electric bills by region by using the <code>group_by()</code> function:</p>
-<div class="sourceCode" id="cb96"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb96-1"><a href="c05-descriptive-analysis.html#cb96-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb96-2"><a href="c05-descriptive-analysis.html#cb96-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb96-3"><a href="c05-descriptive-analysis.html#cb96-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
-<span id="cb96-4"><a href="c05-descriptive-analysis.html#cb96-4" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, .<span class="dv">5</span>, <span class="fl">0.75</span>))) <span class="sc">%&gt;%</span></span>
-<span id="cb96-5"><a href="c05-descriptive-analysis.html#cb96-5" tabindex="-1"></a>  <span class="fu">print</span>(<span class="at">width =</span> <span class="cn">Inf</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb94"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb94-1"><a href="c05-descriptive-analysis.html#cb94-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb94-2"><a href="c05-descriptive-analysis.html#cb94-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb94-3"><a href="c05-descriptive-analysis.html#cb94-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
+<span id="cb94-4"><a href="c05-descriptive-analysis.html#cb94-4" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, .<span class="dv">5</span>, <span class="fl">0.75</span>))) </span></code></pre></div>
 <pre><code>## # A tibble: 4 × 7
 ##   Region    elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se
 ##   &lt;fct&gt;             &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;            &lt;dbl&gt;
@@ -1021,37 +1018,37 @@ <h4>Example 2: Quartiles by subgroup<a href="c05-descriptive-analysis.html#examp
 ## 2            11.6              18.6
 ## 3             9.17             13.9
 ## 4            14.3              20.5</code></pre>
-<p>The 25th percentile for the Northeast region is $740 while it is $968 for the South.</p>
+<p>The 25th percentile for the Northeast region is $740, while it is $968 for the South.</p>
 </div>
 <div id="example-3-minimum-and-maximum" class="section level4 unnumbered hasAnchor">
 <h4>Example 3: Minimum and maximum<a href="c05-descriptive-analysis.html#example-3-minimum-and-maximum" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>As mentioned in the syntax section, we can specify quantiles of <code>0</code> (minimum) and <code>1</code> (maximum) and R will calculate these values. However, these are only the minimum and maximum values in the data, and there is not enough information to determine their standard errors:</p>
-<div class="sourceCode" id="cb98"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb98-1"><a href="c05-descriptive-analysis.html#cb98-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb98-2"><a href="c05-descriptive-analysis.html#cb98-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
-<span id="cb98-3"><a href="c05-descriptive-analysis.html#cb98-3" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>))) </span></code></pre></div>
+<p>As mentioned in the syntax section, we can specify quantiles of <code>0</code> (minimum) and <code>1</code> (maximum), and R calculates these values. However, these are only the minimum and maximum values in the data, and there is not enough information to determine their standard errors:</p>
+<div class="sourceCode" id="cb96"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb96-1"><a href="c05-descriptive-analysis.html#cb96-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb96-2"><a href="c05-descriptive-analysis.html#cb96-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_quantile</span>(DOLLAREL,</span>
+<span id="cb96-3"><a href="c05-descriptive-analysis.html#cb96-3" tabindex="-1"></a>                                        <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>))) </span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   elec_bill_q00 elec_bill_q100 elec_bill_q00_se elec_bill_q100_se
 ##           &lt;dbl&gt;          &lt;dbl&gt;            &lt;dbl&gt;             &lt;dbl&gt;
 ## 1         -151.         15680.              NaN                 0</code></pre>
-<p>The minimum cost of electricity in the dataset is -$151 while the maximum is $15,680, but the standard error is shown as <code>NaN</code> and 0, respectively. Notice that the minimum cost is a negative number which may be surprising but some housing units with solar power sell their energy back to the grid and make money which is recorded as a negative expenditure.</p>
+<p>The minimum cost of electricity in the dataset is -$151, while the maximum is $15,680, but the standard errors are shown as <code>NaN</code> and 0, respectively. Notice that the minimum cost is a negative number. This may be surprising, but some housing units with solar power sell their energy back to the grid and earn money, which is recorded as a negative expenditure.</p>
 </div>
 <div id="example-4-overall-median" class="section level4 unnumbered hasAnchor">
 <h4>Example 4: Overall median<a href="c05-descriptive-analysis.html#example-4-overall-median" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>We can calculate the estimated median cost of electricity in the U.S. using the <code>survey_median()</code> function:</p>
-<div class="sourceCode" id="cb100"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb100-1"><a href="c05-descriptive-analysis.html#cb100-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb100-2"><a href="c05-descriptive-analysis.html#cb100-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_median</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb98"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb98-1"><a href="c05-descriptive-analysis.html#cb98-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb98-2"><a href="c05-descriptive-analysis.html#cb98-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_median</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   elec_bill elec_bill_se
 ##       &lt;dbl&gt;        &lt;dbl&gt;
 ## 1     1215.         6.33</code></pre>
-<p>Nationally, the median household spent $1,215 in 2020. This is the same result as we obtained using the <code>survey_quantile()</code> function. Interestingly, the average electric bill for households that we calculated in section <a href="c05-descriptive-analysis.html#desc-meanprop">5.4</a> is $1,380, but the estimated median electric bill is $1,215 indicating the distribution is likely right-skewed.</p>
+<p>Nationally, the median household spent $1,215 in 2020. This is the same result as we obtained using the <code>survey_quantile()</code> function. Interestingly, the average electric bill for households that we calculated in Section <a href="c05-descriptive-analysis.html#desc-meanprop">5.4</a> is $1,380, but the estimated median electric bill is $1,215, indicating the distribution is likely right-skewed.</p>
 </div>
 <div id="example-5-medians-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 5: Medians by subgroup<a href="c05-descriptive-analysis.html#example-5-medians-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>We can calculate the estimated median cost of electricity in the U.S. by region using the <code>group_by()</code> function with the variable(s) of interest before the <code>summarize()</code> function, similar to when we found the mean by region.</p>
-<div class="sourceCode" id="cb102"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb102-1"><a href="c05-descriptive-analysis.html#cb102-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb102-2"><a href="c05-descriptive-analysis.html#cb102-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb102-3"><a href="c05-descriptive-analysis.html#cb102-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_median</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb100"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb100-1"><a href="c05-descriptive-analysis.html#cb100-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb100-2"><a href="c05-descriptive-analysis.html#cb100-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb100-3"><a href="c05-descriptive-analysis.html#cb100-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_median</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   Region    elec_bill elec_bill_se
 ##   &lt;fct&gt;         &lt;dbl&gt;        &lt;dbl&gt;
@@ -1059,7 +1056,7 @@ <h4>Example 5: Medians by subgroup<a href="c05-descriptive-analysis.html#example
 ## 2 Midwest       1149.        11.6 
 ## 3 South         1402.         9.17
 ## 4 West          1028.        14.3</code></pre>
-<p>Households from the Northeast spent $1,148 on electricity, and in the South, they spent an average of $1,402.</p>
+<p>We estimate that households in the Northeast spent a median of $1,148 on electricity, and in the South, they spent a median of $1,402.</p>
 </div>
 </div>
 </div>
@@ -1074,15 +1071,15 @@ <h2><span class="header-section-number">5.6</span> Ratios<a href="c05-descriptiv
 <div id="syntax-2" class="section level3 hasAnchor" number="5.6.1">
 <h3><span class="header-section-number">5.6.1</span> Syntax<a href="c05-descriptive-analysis.html#syntax-2" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The syntax for <code>survey_ratio()</code> is as follows:</p>
-<div class="sourceCode" id="cb104"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb104-1"><a href="c05-descriptive-analysis.html#cb104-1" tabindex="-1"></a><span class="fu">survey_ratio</span>(</span>
-<span id="cb104-2"><a href="c05-descriptive-analysis.html#cb104-2" tabindex="-1"></a>  numerator,</span>
-<span id="cb104-3"><a href="c05-descriptive-analysis.html#cb104-3" tabindex="-1"></a>  denominator,</span>
-<span id="cb104-4"><a href="c05-descriptive-analysis.html#cb104-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb104-5"><a href="c05-descriptive-analysis.html#cb104-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb104-6"><a href="c05-descriptive-analysis.html#cb104-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb104-7"><a href="c05-descriptive-analysis.html#cb104-7" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb104-8"><a href="c05-descriptive-analysis.html#cb104-8" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb104-9"><a href="c05-descriptive-analysis.html#cb104-9" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb102"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb102-1"><a href="c05-descriptive-analysis.html#cb102-1" tabindex="-1"></a><span class="fu">survey_ratio</span>(</span>
+<span id="cb102-2"><a href="c05-descriptive-analysis.html#cb102-2" tabindex="-1"></a>  numerator,</span>
+<span id="cb102-3"><a href="c05-descriptive-analysis.html#cb102-3" tabindex="-1"></a>  denominator,</span>
+<span id="cb102-4"><a href="c05-descriptive-analysis.html#cb102-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb102-5"><a href="c05-descriptive-analysis.html#cb102-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb102-6"><a href="c05-descriptive-analysis.html#cb102-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb102-7"><a href="c05-descriptive-analysis.html#cb102-7" tabindex="-1"></a>  <span class="at">deff =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb102-8"><a href="c05-descriptive-analysis.html#cb102-8" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb102-9"><a href="c05-descriptive-analysis.html#cb102-9" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>numerator</code>: The numerator of the ratio</li>
@@ -1099,14 +1096,13 @@ <h3><span class="header-section-number">5.6.2</span> Examples<a href="c05-descri
 <div id="example-1-overall-ratios" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Overall ratios<a href="c05-descriptive-analysis.html#example-1-overall-ratios" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Suppose we wanted to find the ratio of dollars spent on liquid propane per unit (in British thermal unit [Btu]) nationally<a href="#fn6" class="footnote-ref" id="fnref6"><sup>6</sup></a>. To find the average cost to a household, we can use <code>survey_mean()</code>. However, to find the national unit rate, we can use <code>survey_ratio()</code>. In the following example, we show both methods and discuss the interpretation of each:</p>
-<div class="sourceCode" id="cb105"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb105-1"><a href="c05-descriptive-analysis.html#cb105-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb105-2"><a href="c05-descriptive-analysis.html#cb105-2" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb105-3"><a href="c05-descriptive-analysis.html#cb105-3" tabindex="-1"></a>    <span class="at">DOLLARLP_Tot =</span> <span class="fu">survey_total</span>(DOLLARLP, <span class="at">vartype =</span> <span class="cn">NULL</span>),</span>
-<span id="cb105-4"><a href="c05-descriptive-analysis.html#cb105-4" tabindex="-1"></a>    <span class="at">BTULP_Tot =</span> <span class="fu">survey_total</span>(BTULP, <span class="at">vartype =</span> <span class="cn">NULL</span>),</span>
-<span id="cb105-5"><a href="c05-descriptive-analysis.html#cb105-5" tabindex="-1"></a>    <span class="at">DOL_BTU_Rat =</span> <span class="fu">survey_ratio</span>(DOLLARLP, BTULP),</span>
-<span id="cb105-6"><a href="c05-descriptive-analysis.html#cb105-6" tabindex="-1"></a>    <span class="at">DOL_BTU_Avg =</span> <span class="fu">survey_mean</span>(DOLLARLP <span class="sc">/</span> BTULP, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb105-7"><a href="c05-descriptive-analysis.html#cb105-7" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb105-8"><a href="c05-descriptive-analysis.html#cb105-8" tabindex="-1"></a>  <span class="fu">print</span>(<span class="at">width =</span> <span class="cn">Inf</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb103"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb103-1"><a href="c05-descriptive-analysis.html#cb103-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb103-2"><a href="c05-descriptive-analysis.html#cb103-2" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb103-3"><a href="c05-descriptive-analysis.html#cb103-3" tabindex="-1"></a>    <span class="at">DOLLARLP_Tot =</span> <span class="fu">survey_total</span>(DOLLARLP, <span class="at">vartype =</span> <span class="cn">NULL</span>),</span>
+<span id="cb103-4"><a href="c05-descriptive-analysis.html#cb103-4" tabindex="-1"></a>    <span class="at">BTULP_Tot =</span> <span class="fu">survey_total</span>(BTULP, <span class="at">vartype =</span> <span class="cn">NULL</span>),</span>
+<span id="cb103-5"><a href="c05-descriptive-analysis.html#cb103-5" tabindex="-1"></a>    <span class="at">DOL_BTU_Rat =</span> <span class="fu">survey_ratio</span>(DOLLARLP, BTULP),</span>
+<span id="cb103-6"><a href="c05-descriptive-analysis.html#cb103-6" tabindex="-1"></a>    <span class="at">DOL_BTU_Avg =</span> <span class="fu">survey_mean</span>(DOLLARLP <span class="sc">/</span> BTULP, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb103-7"><a href="c05-descriptive-analysis.html#cb103-7" tabindex="-1"></a>  )</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 6
 ##   DOLLARLP_Tot     BTULP_Tot DOL_BTU_Rat DOL_BTU_Rat_se DOL_BTU_Avg
 ##          &lt;dbl&gt;         &lt;dbl&gt;       &lt;dbl&gt;          &lt;dbl&gt;       &lt;dbl&gt;
@@ -1114,15 +1110,15 @@ <h4>Example 1: Overall ratios<a href="c05-descriptive-analysis.html#example-1-ov
 ##   DOL_BTU_Avg_se
 ##            &lt;dbl&gt;
 ## 1       0.000223</code></pre>
-<p>The ratio of the total spent on liquid propane to the total consumption was 0.0208, but the average rate was 0.024. With a bit of calculation, we can show that the ratio is the ratio of the totals <code>DOLLARLP_Tot</code>/<code>BTULP_Tot</code>=8,122,911,173/391,425,311,586=0.0208. Although the ratio can be calculated manually in this manner, the standard error requires the use of the <code>survey_ratio()</code> function. The average can be interpreted as the average rate paid by a household.</p>
+<p>The ratio of the total spent on liquid propane to the total consumption was 0.0208, but the average rate was 0.024. With a bit of calculation, we can show that the ratio is the ratio of the totals <code>DOLLARLP_Tot</code>/<code>BTULP_Tot</code>=8,122,911,173/391,425,311,586=0.0208. Although the estimated ratio can be calculated manually in this manner, the standard error requires the use of the <code>survey_ratio()</code> function. The average can be interpreted as the average rate paid by a household.</p>
 </div>
 <div id="example-2-ratios-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Ratios by subgroup<a href="c05-descriptive-analysis.html#example-2-ratios-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>As previously done with other estimates, we can use <code>group_by()</code> to examine whether this ratio varies by region.</p>
-<div class="sourceCode" id="cb107"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb107-1"><a href="c05-descriptive-analysis.html#cb107-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb107-2"><a href="c05-descriptive-analysis.html#cb107-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb107-3"><a href="c05-descriptive-analysis.html#cb107-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">DOL_BTU_Rat =</span> <span class="fu">survey_ratio</span>(DOLLARLP, BTULP)) <span class="sc">%&gt;%</span></span>
-<span id="cb107-4"><a href="c05-descriptive-analysis.html#cb107-4" tabindex="-1"></a>  <span class="fu">arrange</span>(DOL_BTU_Rat)</span></code></pre></div>
+<div class="sourceCode" id="cb105"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb105-1"><a href="c05-descriptive-analysis.html#cb105-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb105-2"><a href="c05-descriptive-analysis.html#cb105-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb105-3"><a href="c05-descriptive-analysis.html#cb105-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">DOL_BTU_Rat =</span> <span class="fu">survey_ratio</span>(DOLLARLP, BTULP)) <span class="sc">%&gt;%</span></span>
+<span id="cb105-4"><a href="c05-descriptive-analysis.html#cb105-4" tabindex="-1"></a>  <span class="fu">arrange</span>(DOL_BTU_Rat)</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   Region    DOL_BTU_Rat DOL_BTU_Rat_se
 ##   &lt;fct&gt;           &lt;dbl&gt;          &lt;dbl&gt;
@@ -1130,26 +1126,26 @@ <h4>Example 2: Ratios by subgroup<a href="c05-descriptive-analysis.html#example-
 ## 2 South          0.0245       0.000388
 ## 3 West           0.0246       0.000875
 ## 4 Northeast      0.0247       0.000488</code></pre>
-<p>Although not a formal statistical test, it appears that the cost ratios for liquid propane are the lowest in the Midwest (0.0158).</p>
+<p>Although not a formal statistical test, it appears that the cost ratio for liquid propane is the lowest in the Midwest (0.0158).</p>
 </div>
 </div>
 </div>
 <div id="correlations" class="section level2 hasAnchor" number="5.7">
 <h2><span class="header-section-number">5.7</span> Correlations<a href="c05-descriptive-analysis.html#correlations" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The correlation is a measure of the linear relationship between two continuous variables, which ranges between -1 and 1. The most commonly used method is Pearson’s correlation (referred to as correlation henceforth). A sample correlation for a simple random sample is calculated as follows:</p>
+<p>The correlation is a measure of the linear relationship between two continuous variables and ranges between -1 and 1. The most commonly used method is Pearson’s correlation (referred to as correlation henceforth). A sample correlation for a simple random sample is calculated as follows:</p>
 <p><span class="math display">\[\frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2} \sqrt{\sum(y_i-\bar{y})^2}} \]</span></p>
 <p>When using <code>survey_corr()</code> for designs other than a simple random sample, the weights are applied when estimating the correlation.</p>
 <div id="syntax-3" class="section level3 hasAnchor" number="5.7.1">
 <h3><span class="header-section-number">5.7.1</span> Syntax<a href="c05-descriptive-analysis.html#syntax-3" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>The syntax for <code>survey_corr()</code> is as follows:</p>
-<div class="sourceCode" id="cb109"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb109-1"><a href="c05-descriptive-analysis.html#cb109-1" tabindex="-1"></a><span class="fu">survey_corr</span>(</span>
-<span id="cb109-2"><a href="c05-descriptive-analysis.html#cb109-2" tabindex="-1"></a>  x,</span>
-<span id="cb109-3"><a href="c05-descriptive-analysis.html#cb109-3" tabindex="-1"></a>  y,</span>
-<span id="cb109-4"><a href="c05-descriptive-analysis.html#cb109-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb109-5"><a href="c05-descriptive-analysis.html#cb109-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb109-6"><a href="c05-descriptive-analysis.html#cb109-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb109-7"><a href="c05-descriptive-analysis.html#cb109-7" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb109-8"><a href="c05-descriptive-analysis.html#cb109-8" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb107"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb107-1"><a href="c05-descriptive-analysis.html#cb107-1" tabindex="-1"></a><span class="fu">survey_corr</span>(</span>
+<span id="cb107-2"><a href="c05-descriptive-analysis.html#cb107-2" tabindex="-1"></a>  x,</span>
+<span id="cb107-3"><a href="c05-descriptive-analysis.html#cb107-3" tabindex="-1"></a>  y,</span>
+<span id="cb107-4"><a href="c05-descriptive-analysis.html#cb107-4" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb107-5"><a href="c05-descriptive-analysis.html#cb107-5" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>, <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb107-6"><a href="c05-descriptive-analysis.html#cb107-6" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb107-7"><a href="c05-descriptive-analysis.html#cb107-7" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb107-8"><a href="c05-descriptive-analysis.html#cb107-8" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>x</code>: A variable or expression</li>
@@ -1164,27 +1160,27 @@ <h3><span class="header-section-number">5.7.1</span> Syntax<a href="c05-descript
 <h3><span class="header-section-number">5.7.2</span> Examples<a href="c05-descriptive-analysis.html#examples-5" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="example-1-overall-correlation" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Overall correlation<a href="c05-descriptive-analysis.html#example-1-overall-correlation" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>We can calculate the correlation between total square footage of homes (<code>TOTSQFT_EN</code>)<a href="#fn7" class="footnote-ref" id="fnref7"><sup>7</sup></a> and electricity consumption (<code>BTUEL</code>)<a href="#fn8" class="footnote-ref" id="fnref8"><sup>8</sup></a>.</p>
-<div class="sourceCode" id="cb110"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb110-1"><a href="c05-descriptive-analysis.html#cb110-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb110-2"><a href="c05-descriptive-analysis.html#cb110-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">SQFT_Elec_Corr =</span> <span class="fu">survey_corr</span>(TOTSQFT_EN, BTUEL))</span></code></pre></div>
+<p>We can calculate the correlation between the total square footage of homes (<code>TOTSQFT_EN</code>)<a href="#fn7" class="footnote-ref" id="fnref7"><sup>7</sup></a> and electricity consumption (<code>BTUEL</code>)<a href="#fn8" class="footnote-ref" id="fnref8"><sup>8</sup></a>.</p>
+<div class="sourceCode" id="cb108"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb108-1"><a href="c05-descriptive-analysis.html#cb108-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb108-2"><a href="c05-descriptive-analysis.html#cb108-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">SQFT_Elec_Corr =</span> <span class="fu">survey_corr</span>(TOTSQFT_EN, BTUEL))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   SQFT_Elec_Corr SQFT_Elec_Corr_se
 ##            &lt;dbl&gt;             &lt;dbl&gt;
 ## 1          0.417           0.00689</code></pre>
-<p>The correlation between total square footage of homes and electricity consumption is 0.417, indicating a moderate positive relationship.</p>
+<p>The correlation between the total square footage of homes and electricity consumption is 0.417, indicating a moderate positive relationship.</p>
 </div>
 <div id="example-2-correlations-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Correlations by subgroup<a href="c05-descriptive-analysis.html#example-2-correlations-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Like with other statistics, we can explore the correlation between total square footage and electricity consumption based on subgroups, such as whether air conditioning is used (<code>ACUsed</code>).</p>
-<div class="sourceCode" id="cb112"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb112-1"><a href="c05-descriptive-analysis.html#cb112-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb112-2"><a href="c05-descriptive-analysis.html#cb112-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ACUsed) <span class="sc">%&gt;%</span></span>
-<span id="cb112-3"><a href="c05-descriptive-analysis.html#cb112-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">SQFT_Elec_Corr =</span> <span class="fu">survey_corr</span>(TOTSQFT_EN, DOLLAREL))</span></code></pre></div>
+<p>We can explore the correlation between total square footage and electricity consumption based on subgroups, such as whether air conditioning (A/C) is used (<code>ACUsed</code>).</p>
+<div class="sourceCode" id="cb110"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb110-1"><a href="c05-descriptive-analysis.html#cb110-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb110-2"><a href="c05-descriptive-analysis.html#cb110-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ACUsed) <span class="sc">%&gt;%</span></span>
+<span id="cb110-3"><a href="c05-descriptive-analysis.html#cb110-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">SQFT_Elec_Corr =</span> <span class="fu">survey_corr</span>(TOTSQFT_EN, DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 3
 ##   ACUsed SQFT_Elec_Corr SQFT_Elec_Corr_se
 ##   &lt;lgl&gt;           &lt;dbl&gt;             &lt;dbl&gt;
 ## 1 FALSE           0.290           0.0240 
 ## 2 TRUE            0.401           0.00808</code></pre>
-<p>For homes without air conditioning, there is a moderate positive correlation between total square footage with electricity consumption (0.29). For homes with air conditioning, the correlation of 0.401 indicates a stronger positive correlation between total square footage and electricity consumption.</p>
+<p>For homes without A/C, there is a small positive correlation between total square footage and electricity consumption (0.29). For homes with A/C, the correlation of 0.401 indicates a stronger positive correlation between total square footage and electricity consumption.</p>
 </div>
 </div>
 </div>
@@ -1194,18 +1190,18 @@ <h2><span class="header-section-number">5.8</span> Standard deviation and varian
 <div id="syntax-4" class="section level3 hasAnchor" number="5.8.1">
 <h3><span class="header-section-number">5.8.1</span> Syntax<a href="c05-descriptive-analysis.html#syntax-4" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>As with non-survey data, the standard deviation estimate is the square root of the variance estimate. Therefore, the <code>survey_var()</code> and <code>survey_sd()</code> functions share the same arguments, except the standard deviation does not allow the usage of <code>vartype</code>.</p>
-<div class="sourceCode" id="cb114"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb114-1"><a href="c05-descriptive-analysis.html#cb114-1" tabindex="-1"></a><span class="fu">survey_var</span>(</span>
-<span id="cb114-2"><a href="c05-descriptive-analysis.html#cb114-2" tabindex="-1"></a>  x,</span>
-<span id="cb114-3"><a href="c05-descriptive-analysis.html#cb114-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb114-4"><a href="c05-descriptive-analysis.html#cb114-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>),</span>
-<span id="cb114-5"><a href="c05-descriptive-analysis.html#cb114-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
-<span id="cb114-6"><a href="c05-descriptive-analysis.html#cb114-6" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
-<span id="cb114-7"><a href="c05-descriptive-analysis.html#cb114-7" tabindex="-1"></a>)</span>
-<span id="cb114-8"><a href="c05-descriptive-analysis.html#cb114-8" tabindex="-1"></a></span>
-<span id="cb114-9"><a href="c05-descriptive-analysis.html#cb114-9" tabindex="-1"></a><span class="fu">survey_sd</span>(</span>
-<span id="cb114-10"><a href="c05-descriptive-analysis.html#cb114-10" tabindex="-1"></a>  x, </span>
-<span id="cb114-11"><a href="c05-descriptive-analysis.html#cb114-11" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span></span>
-<span id="cb114-12"><a href="c05-descriptive-analysis.html#cb114-12" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb112"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb112-1"><a href="c05-descriptive-analysis.html#cb112-1" tabindex="-1"></a><span class="fu">survey_var</span>(</span>
+<span id="cb112-2"><a href="c05-descriptive-analysis.html#cb112-2" tabindex="-1"></a>  x,</span>
+<span id="cb112-3"><a href="c05-descriptive-analysis.html#cb112-3" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb112-4"><a href="c05-descriptive-analysis.html#cb112-4" tabindex="-1"></a>  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>, <span class="st">&quot;var&quot;</span>),</span>
+<span id="cb112-5"><a href="c05-descriptive-analysis.html#cb112-5" tabindex="-1"></a>  <span class="at">level =</span> <span class="fl">0.95</span>,</span>
+<span id="cb112-6"><a href="c05-descriptive-analysis.html#cb112-6" tabindex="-1"></a>  <span class="at">df =</span> <span class="cn">NULL</span></span>
+<span id="cb112-7"><a href="c05-descriptive-analysis.html#cb112-7" tabindex="-1"></a>)</span>
+<span id="cb112-8"><a href="c05-descriptive-analysis.html#cb112-8" tabindex="-1"></a></span>
+<span id="cb112-9"><a href="c05-descriptive-analysis.html#cb112-9" tabindex="-1"></a><span class="fu">survey_sd</span>(</span>
+<span id="cb112-10"><a href="c05-descriptive-analysis.html#cb112-10" tabindex="-1"></a>  x, </span>
+<span id="cb112-11"><a href="c05-descriptive-analysis.html#cb112-11" tabindex="-1"></a>  <span class="at">na.rm =</span> <span class="cn">FALSE</span></span>
+<span id="cb112-12"><a href="c05-descriptive-analysis.html#cb112-12" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>x</code>: A variable or expression, or empty</li>
@@ -1220,22 +1216,22 @@ <h3><span class="header-section-number">5.8.2</span> Examples<a href="c05-descri
 <div id="example-1-overall-variability" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: Overall variability<a href="c05-descriptive-analysis.html#example-1-overall-variability" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Let’s return to electricity bills and explore the variability in electricity expenditure.</p>
-<div class="sourceCode" id="cb115"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb115-1"><a href="c05-descriptive-analysis.html#cb115-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb115-2"><a href="c05-descriptive-analysis.html#cb115-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">var_elbill =</span> <span class="fu">survey_var</span>(DOLLAREL),</span>
-<span id="cb115-3"><a href="c05-descriptive-analysis.html#cb115-3" tabindex="-1"></a>            <span class="at">sd_elbill =</span> <span class="fu">survey_sd</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb113"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb113-1"><a href="c05-descriptive-analysis.html#cb113-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb113-2"><a href="c05-descriptive-analysis.html#cb113-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">var_elbill =</span> <span class="fu">survey_var</span>(DOLLAREL),</span>
+<span id="cb113-3"><a href="c05-descriptive-analysis.html#cb113-3" tabindex="-1"></a>            <span class="at">sd_elbill =</span> <span class="fu">survey_sd</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 3
 ##   var_elbill var_elbill_se sd_elbill
 ##        &lt;dbl&gt;         &lt;dbl&gt;     &lt;dbl&gt;
 ## 1    704906.        13926.      840.</code></pre>
-<p>We may encounter a warning related to a deprecation in the underlying calculations performed by the <code>survey_var()</code> function. This warning is a result of changes in the way R handles recycling in vectorized operations. The results are still valid. They give an estimate of the population variance of electricity bills (<code>var_elbill</code>), the standard error of that variance (<code>var_elbill_se</code>), and the estimated population standard deviation of electricity bills (<code>sd_elbill</code>). Note that no standard error is associated with the standard deviation - this is the only estimate that does not include a standard error.</p>
+<p>We may encounter a warning related to deprecated underlying calculations performed by the <code>survey_var()</code> function. This warning is a result of changes in the way R handles recycling in vectorized operations. The results are still valid. They give an estimate of the population variance of electricity bills (<code>var_elbill</code>), the standard error of that variance (<code>var_elbill_se</code>), and the estimated population standard deviation of electricity bills (<code>sd_elbill</code>). Note that no standard error is associated with the standard deviation; this is the only estimate that does not include a standard error.</p>
 </div>
 <div id="example-2-variability-by-subgroup" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Variability by subgroup<a href="c05-descriptive-analysis.html#example-2-variability-by-subgroup" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>To find out if the variability in electricity expenditure is similar across regions, we can calculate the variance by region using <code>group_by()</code>:</p>
-<div class="sourceCode" id="cb117"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb117-1"><a href="c05-descriptive-analysis.html#cb117-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb117-2"><a href="c05-descriptive-analysis.html#cb117-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb117-3"><a href="c05-descriptive-analysis.html#cb117-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">var_elbill =</span> <span class="fu">survey_var</span>(DOLLAREL),</span>
-<span id="cb117-4"><a href="c05-descriptive-analysis.html#cb117-4" tabindex="-1"></a>            <span class="at">sd_elbill =</span> <span class="fu">survey_sd</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb115"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb115-1"><a href="c05-descriptive-analysis.html#cb115-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb115-2"><a href="c05-descriptive-analysis.html#cb115-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb115-3"><a href="c05-descriptive-analysis.html#cb115-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">var_elbill =</span> <span class="fu">survey_var</span>(DOLLAREL),</span>
+<span id="cb115-4"><a href="c05-descriptive-analysis.html#cb115-4" tabindex="-1"></a>            <span class="at">sd_elbill =</span> <span class="fu">survey_sd</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 4
 ##   Region    var_elbill var_elbill_se sd_elbill
 ##   &lt;fct&gt;          &lt;dbl&gt;         &lt;dbl&gt;     &lt;dbl&gt;
@@ -1251,9 +1247,9 @@ <h2><span class="header-section-number">5.9</span> Additional topics<a href="c05
 <div id="unweighted-analysis" class="section level3 hasAnchor" number="5.9.1">
 <h3><span class="header-section-number">5.9.1</span> Unweighted analysis<a href="c05-descriptive-analysis.html#unweighted-analysis" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Sometimes, it is helpful to calculate an unweighted estimate of a given variable. For this, we use the <code>unweighted()</code> function in the <code>summarize()</code> function. The <code>unweighted()</code> function calculates unweighted summaries from a <code>tbl_svy</code> object, providing the summary among the <em>respondents</em> without extrapolating to a population estimate. The <code>unweighted()</code> function can be used in conjunction with any {dplyr} functions. Here is an example looking at the average household electricity cost:</p>
-<div class="sourceCode" id="cb119"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb119-1"><a href="c05-descriptive-analysis.html#cb119-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb119-2"><a href="c05-descriptive-analysis.html#cb119-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
-<span id="cb119-3"><a href="c05-descriptive-analysis.html#cb119-3" tabindex="-1"></a>            <span class="at">elec_unweight =</span> <span class="fu">unweighted</span>(<span class="fu">mean</span>(DOLLAREL)))</span></code></pre></div>
+<div class="sourceCode" id="cb117"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb117-1"><a href="c05-descriptive-analysis.html#cb117-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb117-2"><a href="c05-descriptive-analysis.html#cb117-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">elec_bill =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
+<span id="cb117-3"><a href="c05-descriptive-analysis.html#cb117-3" tabindex="-1"></a>            <span class="at">elec_unweight =</span> <span class="fu">unweighted</span>(<span class="fu">mean</span>(DOLLAREL)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 3
 ##   elec_bill elec_bill_se elec_unweight
 ##       &lt;dbl&gt;        &lt;dbl&gt;         &lt;dbl&gt;
@@ -1262,39 +1258,38 @@ <h3><span class="header-section-number">5.9.1</span> Unweighted analysis<a href=
 </div>
 <div id="subpopulation-analysis" class="section level3 hasAnchor" number="5.9.2">
 <h3><span class="header-section-number">5.9.2</span> Subpopulation analysis<a href="c05-descriptive-analysis.html#subpopulation-analysis" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>We mentioned using <code>filter()</code> to subset a survey object for analysis. This operation should be done after creating the survey design object. In rare circumstances, subsetting data before creating the object can lead to incorrect variability estimates. This may occur if subsetting removes an entire Primary Sampling Unit (PSU; see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information on PSUs and sample designs).</p>
-<p>Suppose we want estimates of the average amount spent on natural gas among housing units using natural gas (based on the variable <code>BTUNG</code>)<a href="#fn9" class="footnote-ref" id="fnref9"><sup>9</sup></a>. We first filter records to only include records where <code>BTUNG &gt; 0</code> and then find the average amount of money spent.</p>
-<div class="sourceCode" id="cb121"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb121-1"><a href="c05-descriptive-analysis.html#cb121-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb121-2"><a href="c05-descriptive-analysis.html#cb121-2" tabindex="-1"></a>  <span class="fu">filter</span>(BTUNG <span class="sc">&gt;</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb121-3"><a href="c05-descriptive-analysis.html#cb121-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">NG_mean =</span> <span class="fu">survey_mean</span>(DOLLARNG,</span>
-<span id="cb121-4"><a href="c05-descriptive-analysis.html#cb121-4" tabindex="-1"></a>                                  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
+<p>We mentioned using <code>filter()</code> to subset a survey object for analysis. This operation should be done after creating the survey design object. Subsetting data before creating the object can lead to incorrect variability estimates if subsetting removes an entire Primary Sampling Unit (PSU; see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> for more information on PSUs and sample designs).</p>
+<p>Suppose we want estimates of the average amount spent on natural gas among housing units using natural gas (based on the variable <code>BTUNG</code>).<a href="#fn9" class="footnote-ref" id="fnref9"><sup>9</sup></a> We first filter the data to include only records where <code>BTUNG &gt; 0</code> and then find the average amount spent.</p>
+<div class="sourceCode" id="cb119"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb119-1"><a href="c05-descriptive-analysis.html#cb119-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb119-2"><a href="c05-descriptive-analysis.html#cb119-2" tabindex="-1"></a>  <span class="fu">filter</span>(BTUNG <span class="sc">&gt;</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb119-3"><a href="c05-descriptive-analysis.html#cb119-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">NG_mean =</span> <span class="fu">survey_mean</span>(DOLLARNG,</span>
+<span id="cb119-4"><a href="c05-descriptive-analysis.html#cb119-4" tabindex="-1"></a>                                  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   NG_mean NG_mean_se NG_mean_low NG_mean_upp
 ##     &lt;dbl&gt;      &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;
 ## 1    631.       4.64        621.        640.</code></pre>
-<p>The estimated average amount spent on natural gas is $631.</p>
-<p>Note that applying the filter to include only housing units that use natural gas yields a higher mean than when not applying the filter. This is because including housing units that do not use natural gas introduces many $0 amounts, impacting the mean calculation.</p>
-<div class="sourceCode" id="cb123"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb123-1"><a href="c05-descriptive-analysis.html#cb123-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb123-2"><a href="c05-descriptive-analysis.html#cb123-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">NG_mean =</span> <span class="fu">survey_mean</span>(DOLLARNG,</span>
-<span id="cb123-3"><a href="c05-descriptive-analysis.html#cb123-3" tabindex="-1"></a>                                  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
+<p>The estimated average amount spent on natural gas among households that use natural gas is $631. Let’s compare this to the mean when we do not filter.</p>
+<div class="sourceCode" id="cb121"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb121-1"><a href="c05-descriptive-analysis.html#cb121-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb121-2"><a href="c05-descriptive-analysis.html#cb121-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">NG_mean =</span> <span class="fu">survey_mean</span>(DOLLARNG,</span>
+<span id="cb121-3"><a href="c05-descriptive-analysis.html#cb121-3" tabindex="-1"></a>                                  <span class="at">vartype =</span> <span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   NG_mean NG_mean_se NG_mean_low NG_mean_upp
 ##     &lt;dbl&gt;      &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;
 ## 1    382.       3.41        375.        389.</code></pre>
-<p>Based on this calculation, the estimated average amount spent on natural gas is $382.</p>
+<p>Based on this calculation, the estimated average amount spent on natural gas across all housing units is $382. Note that applying the filter to include only housing units that use natural gas yields a higher mean than when not applying the filter. This is because including housing units that do not use natural gas introduces many $0 amounts, which pull the mean down.</p>
 </div>
 <div id="desc-deff" class="section level3 hasAnchor" number="5.9.3">
 <h3><span class="header-section-number">5.9.3</span> Design effects<a href="c05-descriptive-analysis.html#desc-deff" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>The design effect measures how the precision of an estimate is influenced by the sampling design. In other words, it measures how much more or less statistically efficient the survey design is compared to a simple random sample (SRS). It is computed by taking the ratio of the estimate’s variance under the design at hand to the estimate’s variance under a simple random sample without replacement. A design effect less than 1 indicates that the design is <em>more</em> statistically efficient than an SRS design, which is rare but possible in a stratified sampling design where the outcome correlates with the stratification variable(s). A design effect greater than 1 indicates that the design is <em>less</em> statistically efficient than a SRS design. From a design effect, we can calculate the effective sample size as follows:</p>
+<p>The design effect measures how the precision of an estimate is influenced by the sampling design. In other words, it measures how much more or less statistically efficient the survey design is compared to a simple random sample (SRS). It is computed by taking the ratio of the estimate’s variance under the design at hand to the estimate’s variance under a simple random sample without replacement. A design effect less than 1 indicates that the design is <em>more</em> statistically efficient than an SRS design, which is rare but possible in a stratified sampling design where the outcome correlates with the stratification variable(s). A design effect greater than 1 indicates that the design is <em>less</em> statistically efficient than an SRS design. From a design effect, we can calculate the effective sample size as follows:</p>
 <p><span class="math display">\[n_{eff}=\frac{n}{D_{eff}} \]</span></p>
 <p>where <span class="math inline">\(n\)</span> is the nominal sample size (the number of survey responses) and <span class="math inline">\(D_{eff}\)</span> is the estimated design effect. We can interpret the effective sample size <span class="math inline">\(n_{eff}\)</span> as the hypothetical sample size that a survey using an SRS design would need to achieve the same precision as the design at hand. Design effects are specific to each outcome: outcomes that are less clustered in the population have smaller design effects than outcomes that are clustered.</p>
 <p>In the {srvyr} package, design effects can be calculated for totals, proportions, means, and ratio estimates by setting the <code>deff</code> argument to <code>TRUE</code> in the corresponding functions. In the example below, we calculate the design effects for the average consumption of electricity (<code>BTUEL</code>), natural gas (<code>BTUNG</code>), liquid propane (<code>BTULP</code>), fuel oil (<code>BTUFO</code>), and wood (<code>BTUWOOD</code>) by setting <code>deff = TRUE</code>:</p>
-<div class="sourceCode" id="cb125"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb125-1"><a href="c05-descriptive-analysis.html#cb125-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb125-2"><a href="c05-descriptive-analysis.html#cb125-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
-<span id="cb125-3"><a href="c05-descriptive-analysis.html#cb125-3" tabindex="-1"></a>    <span class="fu">c</span>(BTUEL, BTUNG, BTULP, BTUFO, BTUWOOD),</span>
-<span id="cb125-4"><a href="c05-descriptive-analysis.html#cb125-4" tabindex="-1"></a>    <span class="sc">~</span> <span class="fu">survey_mean</span>(.x, <span class="at">deff =</span> <span class="cn">TRUE</span>, <span class="at">vartype =</span> <span class="cn">NULL</span>)</span>
-<span id="cb125-5"><a href="c05-descriptive-analysis.html#cb125-5" tabindex="-1"></a>  )) <span class="sc">%&gt;%</span></span>
-<span id="cb125-6"><a href="c05-descriptive-analysis.html#cb125-6" tabindex="-1"></a>  <span class="fu">select</span>(<span class="fu">ends_with</span>(<span class="st">&quot;deff&quot;</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb123"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb123-1"><a href="c05-descriptive-analysis.html#cb123-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb123-2"><a href="c05-descriptive-analysis.html#cb123-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
+<span id="cb123-3"><a href="c05-descriptive-analysis.html#cb123-3" tabindex="-1"></a>    <span class="fu">c</span>(BTUEL, BTUNG, BTULP, BTUFO, BTUWOOD),</span>
+<span id="cb123-4"><a href="c05-descriptive-analysis.html#cb123-4" tabindex="-1"></a>    <span class="sc">~</span> <span class="fu">survey_mean</span>(.x, <span class="at">deff =</span> <span class="cn">TRUE</span>, <span class="at">vartype =</span> <span class="cn">NULL</span>)</span>
+<span id="cb123-5"><a href="c05-descriptive-analysis.html#cb123-5" tabindex="-1"></a>  )) <span class="sc">%&gt;%</span></span>
+<span id="cb123-6"><a href="c05-descriptive-analysis.html#cb123-6" tabindex="-1"></a>  <span class="fu">select</span>(<span class="fu">ends_with</span>(<span class="st">&quot;deff&quot;</span>))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 5
 ##   BTUEL_deff BTUNG_deff BTULP_deff BTUFO_deff BTUWOOD_deff
 ##        &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;        &lt;dbl&gt;
@@ -1303,7 +1298,7 @@ <h3><span class="header-section-number">5.9.3</span> Design effects<a href="c05-
 </div>
 <div id="creating-summary-rows" class="section level3 hasAnchor" number="5.9.4">
 <h3><span class="header-section-number">5.9.4</span> Creating summary rows<a href="c05-descriptive-analysis.html#creating-summary-rows" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>When using <code>group_by()</code> in analysis, the results are returned with a row for each group or combination of groups. Often, we want both the breakdowns by group and a summary row for the estimate representing the entire population. For example, we may want the average electricity consumption by region <em>and</em> nationally. The {srvyr} package has the convenient <code>cascade()</code> function, which adds summary rows for the total of a group. It is used in place of <code>summarize()</code> and has similar functionalities along with some additional features.</p>
+<p>When using <code>group_by()</code> in analysis, the results are returned with a row for each group or combination of groups. Often, we want both the breakdowns by group and a summary row for the estimate representing the entire population. For example, we may want the average electricity consumption by region <em>and</em> nationally. The {srvyr} package has the convenient <code>cascade()</code> function, which adds summary rows for the total of a group. It is used instead of <code>summarize()</code> and has similar functionalities along with some additional features.</p>
 <div id="syntax-5" class="section level4 unnumbered hasAnchor">
 <h4>Syntax<a href="c05-descriptive-analysis.html#syntax-5" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>The syntax is as follows:</p>
@@ -1319,22 +1314,22 @@ <h4>Syntax<a href="c05-descriptive-analysis.html#syntax-5" class="anchor-section
 <li><code>.data</code>: A <code>tbl_svy</code> object</li>
 <li><code>...</code>: Name-value pairs of summary functions (same as the <code>summarize()</code> function)</li>
 <li><code>.fill</code>: Value to fill in for group summaries (defaults to <code>NA</code>)</li>
-<li><code>.fill_level_top</code>: When filling factor variables, whether to put the value ‘.fill’ in the first position (defaults to FALSE, placing it in the bottom).</li>
+<li><code>.fill_level_top</code>: When filling factor variables, whether to put the value ‘.fill’ in the first position (defaults to <code>FALSE</code>, placing it at the bottom).</li>
 </ul>
 </div>
 <div id="example" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c05-descriptive-analysis.html#example" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>First, let’s look at an example where we calculate the average household electricity cost and. Then, we build on it to examine the features of the <code>cascade()</code> function. In the first example below, we calculate the average household energy cost <code>DOLLAREL_mn</code> using <code>survey_mean()</code> without modifying any of the argument defaults in the function:</p>
-<div class="sourceCode" id="cb128"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb128-1"><a href="c05-descriptive-analysis.html#cb128-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb128-2"><a href="c05-descriptive-analysis.html#cb128-2" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
+<p>First, let’s look at an example where we calculate the average household electricity cost. Then, we build on it to examine the features of the <code>cascade()</code> function. In the first example below, we calculate the average household electricity cost <code>DOLLAREL_mn</code> using <code>survey_mean()</code> without modifying any of the argument defaults in the function:</p>
+<div class="sourceCode" id="cb126"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb126-1"><a href="c05-descriptive-analysis.html#cb126-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb126-2"><a href="c05-descriptive-analysis.html#cb126-2" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   DOLLAREL_mn DOLLAREL_mn_se
 ##         &lt;dbl&gt;          &lt;dbl&gt;
 ## 1       1380.           5.38</code></pre>
 <p>Next, let’s group the results by region by adding <code>group_by()</code> before the <code>cascade()</code> function:</p>
-<div class="sourceCode" id="cb130"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb130-1"><a href="c05-descriptive-analysis.html#cb130-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb130-2"><a href="c05-descriptive-analysis.html#cb130-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb130-3"><a href="c05-descriptive-analysis.html#cb130-3" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
+<div class="sourceCode" id="cb128"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb128-1"><a href="c05-descriptive-analysis.html#cb128-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb128-2"><a href="c05-descriptive-analysis.html#cb128-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb128-3"><a href="c05-descriptive-analysis.html#cb128-3" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL))</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   Region    DOLLAREL_mn DOLLAREL_mn_se
 ##   &lt;fct&gt;           &lt;dbl&gt;          &lt;dbl&gt;
@@ -1343,11 +1338,11 @@ <h4>Example<a href="c05-descriptive-analysis.html#example" class="anchor-section
 ## 3 South           1548.          10.3 
 ## 4 West            1211.          12.0 
 ## 5 &lt;NA&gt;            1380.           5.38</code></pre>
-<p>We can see the estimated average electricity bills by regions: $1,343 for the Northeast, $1,548 for the South, and so on. The last row where <code>Region = NA</code> is the national average electricity bill, $1,380. However, naming the national “region” as <code>NA</code> is not very informative. We can give it a better name using the <code>.fill</code> argument.</p>
-<div class="sourceCode" id="cb132"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb132-1"><a href="c05-descriptive-analysis.html#cb132-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb132-2"><a href="c05-descriptive-analysis.html#cb132-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb132-3"><a href="c05-descriptive-analysis.html#cb132-3" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
-<span id="cb132-4"><a href="c05-descriptive-analysis.html#cb132-4" tabindex="-1"></a>          <span class="at">.fill =</span> <span class="st">&quot;National&quot;</span>)</span></code></pre></div>
+<p>We can see the estimated average electricity bills by region: $1,343 for the Northeast, $1,548 for the South, and so on. The last row, where <code>Region = NA</code>, is the national average electricity bill, $1,380. However, naming the national “region” as <code>NA</code> is not very informative. We can give it a better name using the <code>.fill</code> argument.</p>
+<div class="sourceCode" id="cb130"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb130-1"><a href="c05-descriptive-analysis.html#cb130-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb130-2"><a href="c05-descriptive-analysis.html#cb130-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb130-3"><a href="c05-descriptive-analysis.html#cb130-3" tabindex="-1"></a>  <span class="fu">cascade</span>(<span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
+<span id="cb130-4"><a href="c05-descriptive-analysis.html#cb130-4" tabindex="-1"></a>          <span class="at">.fill =</span> <span class="st">&quot;National&quot;</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   Region    DOLLAREL_mn DOLLAREL_mn_se
 ##   &lt;fct&gt;           &lt;dbl&gt;          &lt;dbl&gt;
@@ -1357,13 +1352,13 @@ <h4>Example<a href="c05-descriptive-analysis.html#example" class="anchor-section
 ## 4 West            1211.          12.0 
 ## 5 National        1380.           5.38</code></pre>
 <p>We can move the summary row to the first row by adding <code>.fill_level_top = TRUE</code> to <code>cascade()</code>:</p>
-<div class="sourceCode" id="cb134"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb134-1"><a href="c05-descriptive-analysis.html#cb134-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb134-2"><a href="c05-descriptive-analysis.html#cb134-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb134-3"><a href="c05-descriptive-analysis.html#cb134-3" tabindex="-1"></a>  <span class="fu">cascade</span>(</span>
-<span id="cb134-4"><a href="c05-descriptive-analysis.html#cb134-4" tabindex="-1"></a>    <span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
-<span id="cb134-5"><a href="c05-descriptive-analysis.html#cb134-5" tabindex="-1"></a>    <span class="at">.fill =</span> <span class="st">&quot;National&quot;</span>,</span>
-<span id="cb134-6"><a href="c05-descriptive-analysis.html#cb134-6" tabindex="-1"></a>    <span class="at">.fill_level_top =</span> <span class="cn">TRUE</span></span>
-<span id="cb134-7"><a href="c05-descriptive-analysis.html#cb134-7" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb132"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb132-1"><a href="c05-descriptive-analysis.html#cb132-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb132-2"><a href="c05-descriptive-analysis.html#cb132-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb132-3"><a href="c05-descriptive-analysis.html#cb132-3" tabindex="-1"></a>  <span class="fu">cascade</span>(</span>
+<span id="cb132-4"><a href="c05-descriptive-analysis.html#cb132-4" tabindex="-1"></a>    <span class="at">DOLLAREL_mn =</span> <span class="fu">survey_mean</span>(DOLLAREL),</span>
+<span id="cb132-5"><a href="c05-descriptive-analysis.html#cb132-5" tabindex="-1"></a>    <span class="at">.fill =</span> <span class="st">&quot;National&quot;</span>,</span>
+<span id="cb132-6"><a href="c05-descriptive-analysis.html#cb132-6" tabindex="-1"></a>    <span class="at">.fill_level_top =</span> <span class="cn">TRUE</span></span>
+<span id="cb132-7"><a href="c05-descriptive-analysis.html#cb132-7" tabindex="-1"></a>  )</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   Region    DOLLAREL_mn DOLLAREL_mn_se
 ##   &lt;fct&gt;           &lt;dbl&gt;          &lt;dbl&gt;
@@ -1378,21 +1373,21 @@ <h4>Example<a href="c05-descriptive-analysis.html#example" class="anchor-section
 <div id="calculating-estimates-for-many-outcomes" class="section level3 hasAnchor" number="5.9.5">
 <h3><span class="header-section-number">5.9.5</span> Calculating estimates for many outcomes<a href="c05-descriptive-analysis.html#calculating-estimates-for-many-outcomes" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Often, we are interested in a summary statistic across many variables. Useful tools include the <code>across()</code> function in {dplyr}, shown a few times above, and the <code>map()</code> function in {purrr}.</p>
-<p>The <code>across()</code> function allows you to apply the same function to multiple columns within <code>summarize()</code>. This works well with all functions shown above, except for <code>survey_prop()</code>. In a later example, we will tackle summarizing multiple proportions.</p>
+<p>The <code>across()</code> function applies the same function to multiple columns within <code>summarize()</code>. This works well with all functions shown above, except for <code>survey_prop()</code>. In a later example, we tackle summarizing multiple proportions.</p>
 <div id="example-1-across" class="section level4 unnumbered hasAnchor">
 <h4>Example 1: <code>across()</code><a href="c05-descriptive-analysis.html#example-1-across" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Suppose we want to calculate the total and average consumption, along with coefficients of variation (CV), for each fuel type. These include the reported consumption of electricity (<code>BTUEL</code>), natural gas (<code>BTUNG</code>), liquid propane (<code>BTULP</code>), fuel oil (<code>BTUFO</code>), and wood (<code>BTUWOOD</code>), as mentioned in the section on design effects. We can take advantage of the fact that these are the only variables that start with “BTU” by selecting them with <code>starts_with("BTU")</code> in the <code>across()</code> function. For each selected column (<code>.x</code>), <code>across()</code> creates a list of two functions to be applied: <code>survey_total()</code> to calculate the total and <code>survey_mean()</code> to calculate the mean, along with their CV (<code>vartype = "cv"</code>). Finally, <code>.unpack = "{outer}.{inner}"</code> specifies that the resulting column names are a concatenation of the variable name, followed by Total or Mean, and then “coef” or “cv”.</p>
-<div class="sourceCode" id="cb136"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb136-1"><a href="c05-descriptive-analysis.html#cb136-1" tabindex="-1"></a>consumption_ests <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb136-2"><a href="c05-descriptive-analysis.html#cb136-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
-<span id="cb136-3"><a href="c05-descriptive-analysis.html#cb136-3" tabindex="-1"></a>    <span class="fu">starts_with</span>(<span class="st">&quot;BTU&quot;</span>),</span>
-<span id="cb136-4"><a href="c05-descriptive-analysis.html#cb136-4" tabindex="-1"></a>    <span class="fu">list</span>(</span>
-<span id="cb136-5"><a href="c05-descriptive-analysis.html#cb136-5" tabindex="-1"></a>      <span class="at">Total =</span>  <span class="sc">~</span> <span class="fu">survey_total</span>(.x, <span class="at">vartype =</span> <span class="st">&quot;cv&quot;</span>),</span>
-<span id="cb136-6"><a href="c05-descriptive-analysis.html#cb136-6" tabindex="-1"></a>      <span class="at">Mean =</span>  <span class="sc">~</span> <span class="fu">survey_mean</span>(.x, <span class="at">vartype =</span> <span class="st">&quot;cv&quot;</span>)</span>
-<span id="cb136-7"><a href="c05-descriptive-analysis.html#cb136-7" tabindex="-1"></a>    ),</span>
-<span id="cb136-8"><a href="c05-descriptive-analysis.html#cb136-8" tabindex="-1"></a>    <span class="at">.unpack =</span> <span class="st">&quot;{outer}.{inner}&quot;</span></span>
-<span id="cb136-9"><a href="c05-descriptive-analysis.html#cb136-9" tabindex="-1"></a>  ))</span>
-<span id="cb136-10"><a href="c05-descriptive-analysis.html#cb136-10" tabindex="-1"></a></span>
-<span id="cb136-11"><a href="c05-descriptive-analysis.html#cb136-11" tabindex="-1"></a>consumption_ests </span></code></pre></div>
+<p>Suppose we want to calculate the total and average consumption, along with coefficients of variation (CV), for each fuel type. These include the reported consumption of electricity (<code>BTUEL</code>), natural gas (<code>BTUNG</code>), liquid propane (<code>BTULP</code>), fuel oil (<code>BTUFO</code>), and wood (<code>BTUWOOD</code>), as mentioned in the section on design effects. We can take advantage of the fact that these are the only variables that start with “BTU” by selecting them with <code>starts_with("BTU")</code> in the <code>across()</code> function. For each selected column (<code>.x</code>), <code>across()</code> creates a list of two functions to be applied: <code>survey_total()</code> to calculate the total and <code>survey_mean()</code> to calculate the mean, along with their CV (<code>vartype = "cv"</code>). Finally, <code>.unpack = "{outer}.{inner}"</code> specifies that the resulting column names are a concatenation of the variable name, followed by Total or Mean, and then “coef” or “cv”.</p>
+<div class="sourceCode" id="cb134"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb134-1"><a href="c05-descriptive-analysis.html#cb134-1" tabindex="-1"></a>consumption_ests <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb134-2"><a href="c05-descriptive-analysis.html#cb134-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
+<span id="cb134-3"><a href="c05-descriptive-analysis.html#cb134-3" tabindex="-1"></a>    <span class="fu">starts_with</span>(<span class="st">&quot;BTU&quot;</span>),</span>
+<span id="cb134-4"><a href="c05-descriptive-analysis.html#cb134-4" tabindex="-1"></a>    <span class="fu">list</span>(</span>
+<span id="cb134-5"><a href="c05-descriptive-analysis.html#cb134-5" tabindex="-1"></a>      <span class="at">Total =</span>  <span class="sc">~</span> <span class="fu">survey_total</span>(.x, <span class="at">vartype =</span> <span class="st">&quot;cv&quot;</span>),</span>
+<span id="cb134-6"><a href="c05-descriptive-analysis.html#cb134-6" tabindex="-1"></a>      <span class="at">Mean =</span>  <span class="sc">~</span> <span class="fu">survey_mean</span>(.x, <span class="at">vartype =</span> <span class="st">&quot;cv&quot;</span>)</span>
+<span id="cb134-7"><a href="c05-descriptive-analysis.html#cb134-7" tabindex="-1"></a>    ),</span>
+<span id="cb134-8"><a href="c05-descriptive-analysis.html#cb134-8" tabindex="-1"></a>    <span class="at">.unpack =</span> <span class="st">&quot;{outer}.{inner}&quot;</span></span>
+<span id="cb134-9"><a href="c05-descriptive-analysis.html#cb134-9" tabindex="-1"></a>  ))</span>
+<span id="cb134-10"><a href="c05-descriptive-analysis.html#cb134-10" tabindex="-1"></a></span>
+<span id="cb134-11"><a href="c05-descriptive-analysis.html#cb134-11" tabindex="-1"></a>consumption_ests </span></code></pre></div>
 <pre><code>## # A tibble: 1 × 20
 ##   BTUEL_Total.coef BTUEL_Total._cv BTUEL_Mean.coef BTUEL_Mean._cv
 ##              &lt;dbl&gt;           &lt;dbl&gt;           &lt;dbl&gt;          &lt;dbl&gt;
@@ -1406,14 +1401,14 @@ <h4>Example 1: <code>across()</code><a href="c05-descriptive-analysis.html#examp
 ## #   BTUWOOD_Total.coef &lt;dbl&gt;, BTUWOOD_Total._cv &lt;dbl&gt;, …</code></pre>
 <p>The estimated total consumption of electricity (<code>BTUEL</code>) is 4,453,284,510,065 (<code>BTUEL_Total.coef</code>), the estimated average consumption is 36,051 (<code>BTUEL_Mean.coef</code>), and the CV is 0.0038.</p>
 <p>In the example above, the table was quite wide. We may prefer a row for each fuel type. Using the <code>pivot_longer()</code> and <code>pivot_wider()</code> functions from {tidyr} can help us achieve this. First, we use <code>pivot_longer()</code> to place each estimate in its own row, changing the data to a “long” format. We use the <code>names_to</code> argument to specify new column names: <code>FuelType</code>, <code>Stat</code>, and <code>Type</code>. Then, the <code>names_pattern</code> argument splits each original column name into three pieces based on the regular expression pattern <code>BTU(.*)_(.*)\\.(.*)</code>: the text between <code>BTU</code> and the underscore, the text between the underscore and the period, and the text after the period. These pieces are saved, in order, in the columns defined in <code>names_to</code>.</p>
-<div class="sourceCode" id="cb138"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb138-1"><a href="c05-descriptive-analysis.html#cb138-1" tabindex="-1"></a>consumption_ests_long <span class="ot">&lt;-</span> consumption_ests <span class="sc">%&gt;%</span></span>
-<span id="cb138-2"><a href="c05-descriptive-analysis.html#cb138-2" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(</span>
-<span id="cb138-3"><a href="c05-descriptive-analysis.html#cb138-3" tabindex="-1"></a>    <span class="at">cols =</span> <span class="fu">everything</span>(),</span>
-<span id="cb138-4"><a href="c05-descriptive-analysis.html#cb138-4" tabindex="-1"></a>    <span class="at">names_to =</span> <span class="fu">c</span>(<span class="st">&quot;FuelType&quot;</span>, <span class="st">&quot;Stat&quot;</span>, <span class="st">&quot;Type&quot;</span>),</span>
-<span id="cb138-5"><a href="c05-descriptive-analysis.html#cb138-5" tabindex="-1"></a>    <span class="at">names_pattern =</span> <span class="st">&quot;BTU(.*)_(.*)</span><span class="sc">\\</span><span class="st">.(.*)&quot;</span></span>
-<span id="cb138-6"><a href="c05-descriptive-analysis.html#cb138-6" tabindex="-1"></a>  )</span>
-<span id="cb138-7"><a href="c05-descriptive-analysis.html#cb138-7" tabindex="-1"></a></span>
-<span id="cb138-8"><a href="c05-descriptive-analysis.html#cb138-8" tabindex="-1"></a>consumption_ests_long</span></code></pre></div>
+<div class="sourceCode" id="cb136"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb136-1"><a href="c05-descriptive-analysis.html#cb136-1" tabindex="-1"></a>consumption_ests_long <span class="ot">&lt;-</span> consumption_ests <span class="sc">%&gt;%</span></span>
+<span id="cb136-2"><a href="c05-descriptive-analysis.html#cb136-2" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(</span>
+<span id="cb136-3"><a href="c05-descriptive-analysis.html#cb136-3" tabindex="-1"></a>    <span class="at">cols =</span> <span class="fu">everything</span>(),</span>
+<span id="cb136-4"><a href="c05-descriptive-analysis.html#cb136-4" tabindex="-1"></a>    <span class="at">names_to =</span> <span class="fu">c</span>(<span class="st">&quot;FuelType&quot;</span>, <span class="st">&quot;Stat&quot;</span>, <span class="st">&quot;Type&quot;</span>),</span>
+<span id="cb136-5"><a href="c05-descriptive-analysis.html#cb136-5" tabindex="-1"></a>    <span class="at">names_pattern =</span> <span class="st">&quot;BTU(.*)_(.*)</span><span class="sc">\\</span><span class="st">.(.*)&quot;</span></span>
+<span id="cb136-6"><a href="c05-descriptive-analysis.html#cb136-6" tabindex="-1"></a>  )</span>
+<span id="cb136-7"><a href="c05-descriptive-analysis.html#cb136-7" tabindex="-1"></a></span>
+<span id="cb136-8"><a href="c05-descriptive-analysis.html#cb136-8" tabindex="-1"></a>consumption_ests_long</span></code></pre></div>
 <pre><code>## # A tibble: 20 × 4
 ##    FuelType Stat  Type                value
 ##    &lt;chr&gt;    &lt;chr&gt; &lt;chr&gt;               &lt;dbl&gt;
@@ -1438,15 +1433,15 @@ <h4>Example 1: <code>across()</code><a href="c05-descriptive-analysis.html#examp
 ## 19 WOOD     Mean  coef           2794.     
 ## 20 WOOD     Mean  _cv               0.0454</code></pre>
 <p>Then, we use <code>pivot_wider()</code> to create a table that is nearly ready for publication. Within the function, we can make the names for each element more descriptive and informative by gluing the <code>Stat</code> and <code>Type</code> together with <code>names_glue</code>. Further details on creating publication-ready tables are covered in Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>.</p>
-<div class="sourceCode" id="cb140"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb140-1"><a href="c05-descriptive-analysis.html#cb140-1" tabindex="-1"></a>consumption_ests_long <span class="sc">%&gt;%</span></span>
-<span id="cb140-2"><a href="c05-descriptive-analysis.html#cb140-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="fu">case_when</span>(Type <span class="sc">==</span> <span class="st">&quot;coef&quot;</span> <span class="sc">~</span> <span class="st">&quot;&quot;</span>,</span>
-<span id="cb140-3"><a href="c05-descriptive-analysis.html#cb140-3" tabindex="-1"></a>                          Type <span class="sc">==</span> <span class="st">&quot;_cv&quot;</span> <span class="sc">~</span> <span class="st">&quot; (CV)&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb140-4"><a href="c05-descriptive-analysis.html#cb140-4" tabindex="-1"></a>  <span class="fu">pivot_wider</span>(</span>
-<span id="cb140-5"><a href="c05-descriptive-analysis.html#cb140-5" tabindex="-1"></a>    <span class="at">id_cols =</span> FuelType,</span>
-<span id="cb140-6"><a href="c05-descriptive-analysis.html#cb140-6" tabindex="-1"></a>    <span class="at">names_from =</span> <span class="fu">c</span>(Stat, Type),</span>
-<span id="cb140-7"><a href="c05-descriptive-analysis.html#cb140-7" tabindex="-1"></a>    <span class="at">names_glue =</span> <span class="st">&quot;{Stat}{Type}&quot;</span>,</span>
-<span id="cb140-8"><a href="c05-descriptive-analysis.html#cb140-8" tabindex="-1"></a>    <span class="at">values_from =</span> value</span>
-<span id="cb140-9"><a href="c05-descriptive-analysis.html#cb140-9" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb138"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb138-1"><a href="c05-descriptive-analysis.html#cb138-1" tabindex="-1"></a>consumption_ests_long <span class="sc">%&gt;%</span></span>
+<span id="cb138-2"><a href="c05-descriptive-analysis.html#cb138-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="fu">case_when</span>(Type <span class="sc">==</span> <span class="st">&quot;coef&quot;</span> <span class="sc">~</span> <span class="st">&quot;&quot;</span>,</span>
+<span id="cb138-3"><a href="c05-descriptive-analysis.html#cb138-3" tabindex="-1"></a>                          Type <span class="sc">==</span> <span class="st">&quot;_cv&quot;</span> <span class="sc">~</span> <span class="st">&quot; (CV)&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb138-4"><a href="c05-descriptive-analysis.html#cb138-4" tabindex="-1"></a>  <span class="fu">pivot_wider</span>(</span>
+<span id="cb138-5"><a href="c05-descriptive-analysis.html#cb138-5" tabindex="-1"></a>    <span class="at">id_cols =</span> FuelType,</span>
+<span id="cb138-6"><a href="c05-descriptive-analysis.html#cb138-6" tabindex="-1"></a>    <span class="at">names_from =</span> <span class="fu">c</span>(Stat, Type),</span>
+<span id="cb138-7"><a href="c05-descriptive-analysis.html#cb138-7" tabindex="-1"></a>    <span class="at">names_glue =</span> <span class="st">&quot;{Stat}{Type}&quot;</span>,</span>
+<span id="cb138-8"><a href="c05-descriptive-analysis.html#cb138-8" tabindex="-1"></a>    <span class="at">values_from =</span> value</span>
+<span id="cb138-9"><a href="c05-descriptive-analysis.html#cb138-9" tabindex="-1"></a>  )</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 5
 ##   FuelType          Total `Total (CV)`   Mean `Mean (CV)`
 ##   &lt;chr&gt;             &lt;dbl&gt;        &lt;dbl&gt;  &lt;dbl&gt;       &lt;dbl&gt;
@@ -1458,41 +1453,41 @@ <h4>Example 1: <code>across()</code><a href="c05-descriptive-analysis.html#examp
 </div>
 <div id="example-2-proportions-with-across" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Proportions with <code>across()</code><a href="c05-descriptive-analysis.html#example-2-proportions-with-across" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>As mentioned earlier, proportions do not work as well directly with the <code>across()</code> method. If we want the proportion of houses with air conditioning and the proportion of houses with heating, we require two separate <code>group_by()</code> statements as shown below:</p>
-<div class="sourceCode" id="cb142"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb142-1"><a href="c05-descriptive-analysis.html#cb142-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb142-2"><a href="c05-descriptive-analysis.html#cb142-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ACUsed) <span class="sc">%&gt;%</span></span>
-<span id="cb142-3"><a href="c05-descriptive-analysis.html#cb142-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
+<p>As mentioned earlier, proportions do not work as well directly with the <code>across()</code> method. If we want the proportion of houses with air conditioning (A/C) and the proportion of houses with heating, we require two separate <code>group_by()</code> statements as shown below:</p>
+<div class="sourceCode" id="cb140"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb140-1"><a href="c05-descriptive-analysis.html#cb140-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb140-2"><a href="c05-descriptive-analysis.html#cb140-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ACUsed) <span class="sc">%&gt;%</span></span>
+<span id="cb140-3"><a href="c05-descriptive-analysis.html#cb140-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 3
 ##   ACUsed     p    p_se
 ##   &lt;lgl&gt;  &lt;dbl&gt;   &lt;dbl&gt;
 ## 1 FALSE  0.113 0.00306
 ## 2 TRUE   0.887 0.00306</code></pre>
-<div class="sourceCode" id="cb144"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb144-1"><a href="c05-descriptive-analysis.html#cb144-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb144-2"><a href="c05-descriptive-analysis.html#cb144-2" tabindex="-1"></a>  <span class="fu">group_by</span>(SpaceHeatingUsed) <span class="sc">%&gt;%</span></span>
-<span id="cb144-3"><a href="c05-descriptive-analysis.html#cb144-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
+<div class="sourceCode" id="cb142"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb142-1"><a href="c05-descriptive-analysis.html#cb142-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb142-2"><a href="c05-descriptive-analysis.html#cb142-2" tabindex="-1"></a>  <span class="fu">group_by</span>(SpaceHeatingUsed) <span class="sc">%&gt;%</span></span>
+<span id="cb142-3"><a href="c05-descriptive-analysis.html#cb142-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 3
 ##   SpaceHeatingUsed      p    p_se
 ##   &lt;lgl&gt;             &lt;dbl&gt;   &lt;dbl&gt;
 ## 1 FALSE            0.0469 0.00207
 ## 2 TRUE             0.953  0.00207</code></pre>
-<p>We estimate 88.7% of households have air conditioning and 95.3% have heating.</p>
-<p>If we are <em>only</em> interested in the <code>TRUE</code> outcomes, that is, the proportion of households that have air conditioning and the proportion that have heating, we can simplify the code. Applying <code>survey_mean()</code> to a logical variable is the same as using <code>survey_prop()</code>, as shown below:</p>
-<div class="sourceCode" id="cb146"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb146-1"><a href="c05-descriptive-analysis.html#cb146-1" tabindex="-1"></a>cool_heat_tab <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb146-2"><a href="c05-descriptive-analysis.html#cb146-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">c</span>(ACUsed, SpaceHeatingUsed), <span class="sc">~</span> <span class="fu">survey_mean</span>(.x),</span>
-<span id="cb146-3"><a href="c05-descriptive-analysis.html#cb146-3" tabindex="-1"></a>                   <span class="at">.unpack =</span> <span class="st">&quot;{outer}.{inner}&quot;</span>))</span>
-<span id="cb146-4"><a href="c05-descriptive-analysis.html#cb146-4" tabindex="-1"></a></span>
-<span id="cb146-5"><a href="c05-descriptive-analysis.html#cb146-5" tabindex="-1"></a>cool_heat_tab</span></code></pre></div>
+<p>We estimate 88.7% of households have A/C and 95.3% have heating.</p>
+<p>If we are <em>only</em> interested in the <code>TRUE</code> outcomes, that is, the proportion of households that have A/C and the proportion that have heating, we can simplify the code. Applying <code>survey_mean()</code> to a logical variable is the same as using <code>survey_prop()</code>, as shown below:</p>
+<div class="sourceCode" id="cb144"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb144-1"><a href="c05-descriptive-analysis.html#cb144-1" tabindex="-1"></a>cool_heat_tab <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb144-2"><a href="c05-descriptive-analysis.html#cb144-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">c</span>(ACUsed, SpaceHeatingUsed), <span class="sc">~</span> <span class="fu">survey_mean</span>(.x),</span>
+<span id="cb144-3"><a href="c05-descriptive-analysis.html#cb144-3" tabindex="-1"></a>                   <span class="at">.unpack =</span> <span class="st">&quot;{outer}.{inner}&quot;</span>))</span>
+<span id="cb144-4"><a href="c05-descriptive-analysis.html#cb144-4" tabindex="-1"></a></span>
+<span id="cb144-5"><a href="c05-descriptive-analysis.html#cb144-5" tabindex="-1"></a>cool_heat_tab</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   ACUsed.coef ACUsed._se SpaceHeatingUsed.coef SpaceHeatingUsed._se
 ##         &lt;dbl&gt;      &lt;dbl&gt;                 &lt;dbl&gt;                &lt;dbl&gt;
 ## 1       0.887    0.00306                 0.953              0.00207</code></pre>
-<p>Note that the estimates are the same with those obtained using the separate <code>group_by()</code> statements. As before, we can use <code>pivot_longer()</code> to structure the table in a more suitable format for distribution.</p>
-<div class="sourceCode" id="cb148"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb148-1"><a href="c05-descriptive-analysis.html#cb148-1" tabindex="-1"></a>cool_heat_tab <span class="sc">%&gt;%</span></span>
-<span id="cb148-2"><a href="c05-descriptive-analysis.html#cb148-2" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(<span class="fu">everything</span>(),</span>
-<span id="cb148-3"><a href="c05-descriptive-analysis.html#cb148-3" tabindex="-1"></a>               <span class="at">names_to =</span> <span class="fu">c</span>(<span class="st">&quot;Comfort&quot;</span>, <span class="st">&quot;.value&quot;</span>),</span>
-<span id="cb148-4"><a href="c05-descriptive-analysis.html#cb148-4" tabindex="-1"></a>               <span class="at">names_pattern =</span> <span class="st">&quot;(.*)</span><span class="sc">\\</span><span class="st">.(.*)&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb148-5"><a href="c05-descriptive-analysis.html#cb148-5" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">p =</span> coef,</span>
-<span id="cb148-6"><a href="c05-descriptive-analysis.html#cb148-6" tabindex="-1"></a>         <span class="at">se =</span> <span class="st">`</span><span class="at">_se</span><span class="st">`</span>)</span></code></pre></div>
+<p>Note that the estimates are the same as those obtained using the separate <code>group_by()</code> statements. As before, we can use <code>pivot_longer()</code> to structure the table in a more suitable format for distribution.</p>
+<div class="sourceCode" id="cb146"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb146-1"><a href="c05-descriptive-analysis.html#cb146-1" tabindex="-1"></a>cool_heat_tab <span class="sc">%&gt;%</span></span>
+<span id="cb146-2"><a href="c05-descriptive-analysis.html#cb146-2" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(<span class="fu">everything</span>(),</span>
+<span id="cb146-3"><a href="c05-descriptive-analysis.html#cb146-3" tabindex="-1"></a>               <span class="at">names_to =</span> <span class="fu">c</span>(<span class="st">&quot;Comfort&quot;</span>, <span class="st">&quot;.value&quot;</span>),</span>
+<span id="cb146-4"><a href="c05-descriptive-analysis.html#cb146-4" tabindex="-1"></a>               <span class="at">names_pattern =</span> <span class="st">&quot;(.*)</span><span class="sc">\\</span><span class="st">.(.*)&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb146-5"><a href="c05-descriptive-analysis.html#cb146-5" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">p =</span> coef,</span>
+<span id="cb146-6"><a href="c05-descriptive-analysis.html#cb146-6" tabindex="-1"></a>         <span class="at">se =</span> <span class="st">`</span><span class="at">_se</span><span class="st">`</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 3
 ##   Comfort              p      se
 ##   &lt;chr&gt;            &lt;dbl&gt;   &lt;dbl&gt;
@@ -1501,16 +1496,16 @@ <h4>Example 2: Proportions with <code>across()</code><a href="c05-descriptive-an
 </div>
 <div id="example-3-purrrmap" class="section level4 unnumbered hasAnchor">
 <h4>Example 3: <code>purrr::map()</code><a href="c05-descriptive-analysis.html#example-3-purrrmap" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Loops are a common tool when dealing with repetitive calculations. The {purrr} package provides the <code>map()</code> functions which, like a loop, allow you to perform the same task across different elements <span class="citation">(<a href="#ref-R-purrr">Wickham and Henry 2023</a>)</span>. In our case, we may want to calculate proportions from the same design multiple times. A straightforward approach is to design the calculation for one variable, build a function based on that, and then apply it iteratively for the rest of the variables.</p>
+<p>Loops are a common tool when dealing with repetitive calculations. The {purrr} package provides the <code>map()</code> functions, which, like a loop, allow us to perform the same task across different elements <span class="citation">(<a href="#ref-R-purrr">Wickham and Henry 2023</a>)</span>. In our case, we may want to calculate proportions from the same design multiple times. A straightforward approach is to design the calculation for one variable, build a function based on that, and then apply it iteratively for the rest of the variables.</p>
 <p>Suppose we want to create a table that shows the proportion of people who express trust in their government (<code>TrustGovernment</code>)<a href="#fn10" class="footnote-ref" id="fnref10"><sup>10</sup></a> as well as the proportion who trust in people (<code>TrustPeople</code>)<a href="#fn11" class="footnote-ref" id="fnref11"><sup>11</sup></a>.</p>
 <p>First, we create a table for a single variable. The table includes the variable name as a column, the response, and the corresponding percentage with its standard error.</p>
-<div class="sourceCode" id="cb150"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb150-1"><a href="c05-descriptive-analysis.html#cb150-1" tabindex="-1"></a>anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb150-2"><a href="c05-descriptive-analysis.html#cb150-2" tabindex="-1"></a>  <span class="fu">drop_na</span>(TrustGovernment) <span class="sc">%&gt;%</span></span>
-<span id="cb150-3"><a href="c05-descriptive-analysis.html#cb150-3" tabindex="-1"></a>  <span class="fu">group_by</span>(TrustGovernment) <span class="sc">%&gt;%</span></span>
-<span id="cb150-4"><a href="c05-descriptive-analysis.html#cb150-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>() <span class="sc">*</span> <span class="dv">100</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb150-5"><a href="c05-descriptive-analysis.html#cb150-5" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Variable =</span> <span class="st">&quot;TrustGovernment&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb150-6"><a href="c05-descriptive-analysis.html#cb150-6" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">Answer =</span> TrustGovernment) <span class="sc">%&gt;%</span></span>
-<span id="cb150-7"><a href="c05-descriptive-analysis.html#cb150-7" tabindex="-1"></a>  <span class="fu">select</span>(Variable, <span class="fu">everything</span>())</span></code></pre></div>
+<div class="sourceCode" id="cb148"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb148-1"><a href="c05-descriptive-analysis.html#cb148-1" tabindex="-1"></a>anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb148-2"><a href="c05-descriptive-analysis.html#cb148-2" tabindex="-1"></a>  <span class="fu">drop_na</span>(TrustGovernment) <span class="sc">%&gt;%</span></span>
+<span id="cb148-3"><a href="c05-descriptive-analysis.html#cb148-3" tabindex="-1"></a>  <span class="fu">group_by</span>(TrustGovernment) <span class="sc">%&gt;%</span></span>
+<span id="cb148-4"><a href="c05-descriptive-analysis.html#cb148-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>() <span class="sc">*</span> <span class="dv">100</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb148-5"><a href="c05-descriptive-analysis.html#cb148-5" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Variable =</span> <span class="st">&quot;TrustGovernment&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb148-6"><a href="c05-descriptive-analysis.html#cb148-6" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">Answer =</span> TrustGovernment) <span class="sc">%&gt;%</span></span>
+<span id="cb148-7"><a href="c05-descriptive-analysis.html#cb148-7" tabindex="-1"></a>  <span class="fu">select</span>(Variable, <span class="fu">everything</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 4
 ##   Variable        Answer                  p  p_se
 ##   &lt;chr&gt;           &lt;fct&gt;               &lt;dbl&gt; &lt;dbl&gt;
@@ -1521,17 +1516,17 @@ <h4>Example 3: <code>purrr::map()</code><a href="c05-descriptive-analysis.html#e
 ## 5 TrustGovernment Never               11.0  0.566</code></pre>
 <p>We estimate that 1.55% of people always trust the government, 13.16% trust the government most of the time, and so on.</p>
 <p>Now, we want to use the original series of steps as a template to create a general function <code>calcps()</code> that can apply the same steps to other variables. We replace <code>TrustGovernment</code> with an argument for a generic variable, <code>var</code>. Referring to <code>var</code> involves a bit of tidy evaluation, an advanced skill. To learn more, we recommend <span class="citation">Wickham (<a href="#ref-wickham2019advanced">2019</a>)</span>.</p>
-<div class="sourceCode" id="cb152"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb152-1"><a href="c05-descriptive-analysis.html#cb152-1" tabindex="-1"></a>calcps <span class="ot">&lt;-</span> <span class="cf">function</span>(var) {</span>
-<span id="cb152-2"><a href="c05-descriptive-analysis.html#cb152-2" tabindex="-1"></a>  anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb152-3"><a href="c05-descriptive-analysis.html#cb152-3" tabindex="-1"></a>    <span class="fu">drop_na</span>(<span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
-<span id="cb152-4"><a href="c05-descriptive-analysis.html#cb152-4" tabindex="-1"></a>    <span class="fu">group_by</span>(<span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
-<span id="cb152-5"><a href="c05-descriptive-analysis.html#cb152-5" tabindex="-1"></a>    <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>() <span class="sc">*</span> <span class="dv">100</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb152-6"><a href="c05-descriptive-analysis.html#cb152-6" tabindex="-1"></a>    <span class="fu">mutate</span>(<span class="at">Variable =</span> var) <span class="sc">%&gt;%</span></span>
-<span id="cb152-7"><a href="c05-descriptive-analysis.html#cb152-7" tabindex="-1"></a>    <span class="fu">rename</span>(<span class="at">Answer :=</span> <span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
-<span id="cb152-8"><a href="c05-descriptive-analysis.html#cb152-8" tabindex="-1"></a>    <span class="fu">select</span>(Variable, <span class="fu">everything</span>())</span>
-<span id="cb152-9"><a href="c05-descriptive-analysis.html#cb152-9" tabindex="-1"></a>}</span></code></pre></div>
+<div class="sourceCode" id="cb150"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb150-1"><a href="c05-descriptive-analysis.html#cb150-1" tabindex="-1"></a>calcps <span class="ot">&lt;-</span> <span class="cf">function</span>(var) {</span>
+<span id="cb150-2"><a href="c05-descriptive-analysis.html#cb150-2" tabindex="-1"></a>  anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb150-3"><a href="c05-descriptive-analysis.html#cb150-3" tabindex="-1"></a>    <span class="fu">drop_na</span>(<span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
+<span id="cb150-4"><a href="c05-descriptive-analysis.html#cb150-4" tabindex="-1"></a>    <span class="fu">group_by</span>(<span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
+<span id="cb150-5"><a href="c05-descriptive-analysis.html#cb150-5" tabindex="-1"></a>    <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>() <span class="sc">*</span> <span class="dv">100</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb150-6"><a href="c05-descriptive-analysis.html#cb150-6" tabindex="-1"></a>    <span class="fu">mutate</span>(<span class="at">Variable =</span> var) <span class="sc">%&gt;%</span></span>
+<span id="cb150-7"><a href="c05-descriptive-analysis.html#cb150-7" tabindex="-1"></a>    <span class="fu">rename</span>(<span class="at">Answer :=</span> <span class="sc">!!</span><span class="fu">sym</span>(var)) <span class="sc">%&gt;%</span></span>
+<span id="cb150-8"><a href="c05-descriptive-analysis.html#cb150-8" tabindex="-1"></a>    <span class="fu">select</span>(Variable, <span class="fu">everything</span>())</span>
+<span id="cb150-9"><a href="c05-descriptive-analysis.html#cb150-9" tabindex="-1"></a>}</span></code></pre></div>
 <p>We then apply this function to the two variables of interest, <code>TrustGovernment</code> and <code>TrustPeople</code>:</p>
-<div class="sourceCode" id="cb153"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb153-1"><a href="c05-descriptive-analysis.html#cb153-1" tabindex="-1"></a><span class="fu">calcps</span>(<span class="st">&quot;TrustGovernment&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb151"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb151-1"><a href="c05-descriptive-analysis.html#cb151-1" tabindex="-1"></a><span class="fu">calcps</span>(<span class="st">&quot;TrustGovernment&quot;</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 4
 ##   Variable        Answer                  p  p_se
 ##   &lt;chr&gt;           &lt;fct&gt;               &lt;dbl&gt; &lt;dbl&gt;
@@ -1540,7 +1535,7 @@ <h4>Example 3: <code>purrr::map()</code><a href="c05-descriptive-analysis.html#e
 ## 3 TrustGovernment About half the time 30.9  0.829
 ## 4 TrustGovernment Some of the time    43.4  0.855
 ## 5 TrustGovernment Never               11.0  0.566</code></pre>
-<div class="sourceCode" id="cb155"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb155-1"><a href="c05-descriptive-analysis.html#cb155-1" tabindex="-1"></a><span class="fu">calcps</span>(<span class="st">&quot;TrustPeople&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb153"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb153-1"><a href="c05-descriptive-analysis.html#cb153-1" tabindex="-1"></a><span class="fu">calcps</span>(<span class="st">&quot;TrustPeople&quot;</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 4
 ##   Variable    Answer                   p  p_se
 ##   &lt;chr&gt;       &lt;fct&gt;                &lt;dbl&gt; &lt;dbl&gt;
@@ -1550,9 +1545,9 @@ <h4>Example 3: <code>purrr::map()</code><a href="c05-descriptive-analysis.html#e
 ## 4 TrustPeople Some of the time    24.5   0.670
 ## 5 TrustPeople Never                5.05  0.422</code></pre>
 <p>Finally, we use <code>map()</code> to iterate over as many variables as needed. We feed our desired variables into <code>map()</code> along with our custom function, <code>calcps</code>. The output is a tibble with the variable names in the “Variable” column, the responses in the “Answer” column, along with the percentage and standard error. The <code>list_rbind()</code> function combines the rows into a single tibble. This example extends nicely when dealing with numerous variables for which we want percentage estimates.</p>
-<div class="sourceCode" id="cb157"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb157-1"><a href="c05-descriptive-analysis.html#cb157-1" tabindex="-1"></a><span class="fu">c</span>(<span class="st">&quot;TrustGovernment&quot;</span>, <span class="st">&quot;TrustPeople&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb157-2"><a href="c05-descriptive-analysis.html#cb157-2" tabindex="-1"></a>  <span class="fu">map</span>(calcps) <span class="sc">%&gt;%</span></span>
-<span id="cb157-3"><a href="c05-descriptive-analysis.html#cb157-3" tabindex="-1"></a>  <span class="fu">list_rbind</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb155"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb155-1"><a href="c05-descriptive-analysis.html#cb155-1" tabindex="-1"></a><span class="fu">c</span>(<span class="st">&quot;TrustGovernment&quot;</span>, <span class="st">&quot;TrustPeople&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb155-2"><a href="c05-descriptive-analysis.html#cb155-2" tabindex="-1"></a>  <span class="fu">map</span>(calcps) <span class="sc">%&gt;%</span></span>
+<span id="cb155-3"><a href="c05-descriptive-analysis.html#cb155-3" tabindex="-1"></a>  <span class="fu">list_rbind</span>()</span></code></pre></div>
 <pre><code>## # A tibble: 10 × 4
 ##    Variable        Answer                   p  p_se
 ##    &lt;chr&gt;           &lt;fct&gt;                &lt;dbl&gt; &lt;dbl&gt;
@@ -1566,13 +1561,13 @@ <h4>Example 3: <code>purrr::map()</code><a href="c05-descriptive-analysis.html#e
 ##  8 TrustPeople     About half the time 28.2   0.776
 ##  9 TrustPeople     Some of the time    24.5   0.670
 ## 10 TrustPeople     Never                5.05  0.422</code></pre>
-<p>In addition to our results above, we can also see the output for <code>TrustPeople</code>. While we estimate 1.55% of people always trust the government, 0.81% always trust people.</p>
+<p>In addition to our results above, we can also see the output for <code>TrustPeople</code>. While we estimate that 1.55% of people always trust the government, 0.81% always trust people.</p>
 </div>
 </div>
 </div>
 <div id="exercises" class="section level2 hasAnchor" number="5.10">
 <h2><span class="header-section-number">5.10</span> Exercises<a href="c05-descriptive-analysis.html#exercises" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The exercises use the design objects <code>anes_des</code> and <code>recs_des</code> provided in the Prerequisites box in the beginning of the chapter.</p>
+<p>The exercises use the design objects <code>anes_des</code> and <code>recs_des</code> provided in the Prerequisites box at the beginning of the chapter.</p>
 <ol style="list-style-type: decimal">
 <li><p>How many females have a graduate degree? Hint: the variables <code>Gender</code> and <code>Education</code> will be useful.</p></li>
 <li><p>What percentage of people identify as “Strong Democrat”? Hint: The variable <code>PartyID</code> indicates someone’s party affiliation.</p></li>
@@ -1580,9 +1575,9 @@ <h2><span class="header-section-number">5.10</span> Exercises<a href="c05-descri
 <li><p>What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable <code>VotedPres2016</code> indicates whether someone voted in 2016.</p></li>
 <li><p>What is the design effect for the proportion of people who voted early? Hint: The variable <code>EarlyVote2020</code> indicates whether someone voted early in 2020.</p></li>
 <li><p>What is the median temperature people set their thermostats to at night during the winter? Hint: The variable <code>WinterTempNight</code> indicates the temperature that people set their thermostats to at night in the winter.</p></li>
-<li><p>People sometimes set their temperature differently over different seasons and during the day. What median temperatures do people set their thermostat to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables <code>WinterTempDay</code>, <code>WinterTempNight</code>, <code>SummerTempDay</code>, and <code>SummerTempNight</code>.</p></li>
+<li><p>People sometimes set their temperature differently over different seasons and during the day. What median temperatures do people set their thermostats to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables <code>WinterTempDay</code>, <code>WinterTempNight</code>, <code>SummerTempDay</code>, and <code>SummerTempNight</code>.</p></li>
 <li><p>What is the correlation between the temperatures that people set their thermostats to at night and during the day in the summer?</p></li>
-<li><p>What is the 1st, 2nd, and 3rd quartile of the amount of money spent on energy by Building America (BA) climate zone? Hint: <code>TOTALDOL</code> indicates the total amount spent on all fuel, and <code>ClimateRegion_BA</code> indicates the BA climate zones.</p></li>
+<li><p>What are the 1st, 2nd, and 3rd quartiles of money spent on energy by Building America (BA) climate zone? Hint: <code>TOTALDOL</code> indicates the total amount spent on all fuel, and <code>ClimateRegion_BA</code> indicates the BA climate zones.</p></li>
 </ol>
 
 </div>
@@ -1590,7 +1585,7 @@ <h2><span class="header-section-number">5.10</span> Exercises<a href="c05-descri
 <h3>References<a href="references.html#references" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="refs" class="references csl-bib-body hanging-indent" entry-spacing="0">
 <div id="ref-R-srvyr" class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div id="ref-lumley2010complex" class="csl-entry">
 Lumley, Thomas. 2010. <em>Complex Surveys: A Guide to Analysis Using <span>R</span></em>. John Wiley &amp; Sons.
@@ -1615,8 +1610,8 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <li id="fn5"><p>Question text: Is any air conditioning equipment used in your home?<a href="c05-descriptive-analysis.html#fnref5" class="footnote-back">↩︎</a></p></li>
 <li id="fn6"><p>The value of <code>DOLLARLP</code> reflects the annualized amount spent on liquid propane and <code>BTULP</code> reflects the annualized consumption in Btu of liquid propane.<a href="c05-descriptive-analysis.html#fnref6" class="footnote-back">↩︎</a></p></li>
 <li id="fn7"><p>Question text: What is the square footage of your home?<a href="c05-descriptive-analysis.html#fnref7" class="footnote-back">↩︎</a></p></li>
-<li id="fn8"><p>BTUEL is derived from the supplier side component of the survey where <code>BTUEL</code> represents the electricity consumption in British thermal units (Btus) converted from kilowatt hours (kWh) in a year<a href="c05-descriptive-analysis.html#fnref8" class="footnote-back">↩︎</a></p></li>
-<li id="fn9"><p><code>BTUNG</code> is derived from the supplier side component of the survey where <code>BTUNG</code> represents the natural gas consumption in British thermal units (Btus) in a year<a href="c05-descriptive-analysis.html#fnref9" class="footnote-back">↩︎</a></p></li>
+<li id="fn8"><p><code>BTUEL</code> is derived from the supplier-side component of the survey where <code>BTUEL</code> represents the electricity consumption in British thermal units (Btus) converted from kilowatt hours (kWh) in a year.<a href="c05-descriptive-analysis.html#fnref8" class="footnote-back">↩︎</a></p></li>
+<li id="fn9"><p><code>BTUNG</code> is derived from the supplier-side component of the survey where <code>BTUNG</code> represents the natural gas consumption in British thermal units (Btus) in a year.<a href="c05-descriptive-analysis.html#fnref9" class="footnote-back">↩︎</a></p></li>
 <li id="fn10"><p>Question: How often can you trust the federal government in Washington to do what is right? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)?<a href="c05-descriptive-analysis.html#fnref10" class="footnote-back">↩︎</a></p></li>
 <li id="fn11"><p>Question: Generally speaking, how often can you trust other people? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)? <a href="c05-descriptive-analysis.html#fnref11" class="footnote-back">↩︎</a></p></li>
 </ol>
diff --git a/c06-statistical-testing.html b/c06-statistical-testing.html
index 8f379aeb..844d62d9 100644
--- a/c06-statistical-testing.html
+++ b/c06-statistical-testing.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,104 +524,104 @@ <h3>Prerequisites<a href="c06-statistical-testing.html#prereq6" class="anchor-se
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb159"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb159-1"><a href="c06-statistical-testing.html#cb159-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb159-2"><a href="c06-statistical-testing.html#cb159-2" tabindex="-1"></a><span class="fu">library</span>(survey) </span>
-<span id="cb159-3"><a href="c06-statistical-testing.html#cb159-3" tabindex="-1"></a><span class="fu">library</span>(srvyr) </span>
-<span id="cb159-4"><a href="c06-statistical-testing.html#cb159-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
-<span id="cb159-5"><a href="c06-statistical-testing.html#cb159-5" tabindex="-1"></a><span class="fu">library</span>(broom)</span>
-<span id="cb159-6"><a href="c06-statistical-testing.html#cb159-6" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
-<span id="cb159-7"><a href="c06-statistical-testing.html#cb159-7" tabindex="-1"></a><span class="fu">library</span>(prettyunits)</span></code></pre></div>
-<p>We will be using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
-<div class="sourceCode" id="cb160"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb160-1"><a href="c06-statistical-testing.html#cb160-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
-<span id="cb160-2"><a href="c06-statistical-testing.html#cb160-2" tabindex="-1"></a></span>
-<span id="cb160-3"><a href="c06-statistical-testing.html#cb160-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb160-4"><a href="c06-statistical-testing.html#cb160-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
-<span id="cb160-5"><a href="c06-statistical-testing.html#cb160-5" tabindex="-1"></a></span>
-<span id="cb160-6"><a href="c06-statistical-testing.html#cb160-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
-<span id="cb160-7"><a href="c06-statistical-testing.html#cb160-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb160-8"><a href="c06-statistical-testing.html#cb160-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
-<span id="cb160-9"><a href="c06-statistical-testing.html#cb160-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
-<span id="cb160-10"><a href="c06-statistical-testing.html#cb160-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
-<span id="cb160-11"><a href="c06-statistical-testing.html#cb160-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb160-12"><a href="c06-statistical-testing.html#cb160-12" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb157"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb157-1"><a href="c06-statistical-testing.html#cb157-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb157-2"><a href="c06-statistical-testing.html#cb157-2" tabindex="-1"></a><span class="fu">library</span>(survey) </span>
+<span id="cb157-3"><a href="c06-statistical-testing.html#cb157-3" tabindex="-1"></a><span class="fu">library</span>(srvyr) </span>
+<span id="cb157-4"><a href="c06-statistical-testing.html#cb157-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
+<span id="cb157-5"><a href="c06-statistical-testing.html#cb157-5" tabindex="-1"></a><span class="fu">library</span>(broom)</span>
+<span id="cb157-6"><a href="c06-statistical-testing.html#cb157-6" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
+<span id="cb157-7"><a href="c06-statistical-testing.html#cb157-7" tabindex="-1"></a><span class="fu">library</span>(prettyunits)</span></code></pre></div>
+<p>We are using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
+<div class="sourceCode" id="cb158"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb158-1"><a href="c06-statistical-testing.html#cb158-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
+<span id="cb158-2"><a href="c06-statistical-testing.html#cb158-2" tabindex="-1"></a></span>
+<span id="cb158-3"><a href="c06-statistical-testing.html#cb158-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb158-4"><a href="c06-statistical-testing.html#cb158-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
+<span id="cb158-5"><a href="c06-statistical-testing.html#cb158-5" tabindex="-1"></a></span>
+<span id="cb158-6"><a href="c06-statistical-testing.html#cb158-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
+<span id="cb158-7"><a href="c06-statistical-testing.html#cb158-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb158-8"><a href="c06-statistical-testing.html#cb158-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
+<span id="cb158-9"><a href="c06-statistical-testing.html#cb158-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
+<span id="cb158-10"><a href="c06-statistical-testing.html#cb158-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
+<span id="cb158-11"><a href="c06-statistical-testing.html#cb158-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb158-12"><a href="c06-statistical-testing.html#cb158-12" tabindex="-1"></a>  )</span></code></pre></div>
 <p>For RECS, details are included in the RECS documentation and Chapters <a href="c04-getting-started.html#c04-getting-started">4</a> and <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
-<div class="sourceCode" id="cb161"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb161-1"><a href="c06-statistical-testing.html#cb161-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb161-2"><a href="c06-statistical-testing.html#cb161-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb161-3"><a href="c06-statistical-testing.html#cb161-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb161-4"><a href="c06-statistical-testing.html#cb161-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb161-5"><a href="c06-statistical-testing.html#cb161-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb161-6"><a href="c06-statistical-testing.html#cb161-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
-<span id="cb161-7"><a href="c06-statistical-testing.html#cb161-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
-<span id="cb161-8"><a href="c06-statistical-testing.html#cb161-8" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb159"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb159-1"><a href="c06-statistical-testing.html#cb159-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb159-2"><a href="c06-statistical-testing.html#cb159-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb159-3"><a href="c06-statistical-testing.html#cb159-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb159-4"><a href="c06-statistical-testing.html#cb159-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb159-5"><a href="c06-statistical-testing.html#cb159-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb159-6"><a href="c06-statistical-testing.html#cb159-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb159-7"><a href="c06-statistical-testing.html#cb159-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
+<span id="cb159-8"><a href="c06-statistical-testing.html#cb159-8" tabindex="-1"></a>  )</span></code></pre></div>
 </div>
 <div id="introduction-5" class="section level2 hasAnchor" number="6.1">
 <h2><span class="header-section-number">6.1</span> Introduction<a href="c06-statistical-testing.html#introduction-5" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>When analyzing results from a survey, the point estimates described in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a> help us understand the data at a high level. Still, researchers and the public often want to make comparisons between different groups. These comparisons are calculated through statistical testing.</p>
+<p>When analyzing survey results, the point estimates described in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a> help us understand the data at a high level. Still, we often want to make comparisons between different groups. These comparisons are calculated through statistical testing.</p>
 <p>The general idea of statistical testing is the same for data obtained through surveys and data obtained through other methods, where we compare the point estimates and variance estimates of each statistic to see if statistically significant differences exist. However, statistical testing for complex surveys involves additional considerations due to the need to account for the sampling design in order to obtain accurate variance estimates.</p>
-<p>Statistical testing, also called hypothesis testing, involves declaring a null and alternative hypothesis. A null hypothesis is denoted as <span class="math inline">\(H_0\)</span> and the alternative hypothesis is denoted as <span class="math inline">\(H_A\)</span>. The null hypothesis is the default assumption in that there are no differences in the data, or that the data is operating under “standard” behaviors. On the other hand, the alternative hypothesis is the break from the “standard” and what we are trying to determine if the data supports.</p>
-<p>Let’s review an example outside of survey data. If we are flipping a coin, a null hypothesis would be that the coin is fair and that each side has an equal chance of being flipped. In other words, the probability of the coin landing on each side is 1/2. Whereas an alternative hypothesis could be that the coin is unfair and that one side has a higher probability of being flipped (e.g., a probability of 1/4 to get heads, but a probability of 3/4 to get tails). We write this set of hypotheses as:</p>
+<p>Statistical testing, also called hypothesis testing, involves declaring a null and alternative hypothesis. A null hypothesis is denoted as <span class="math inline">\(H_0\)</span> and the alternative hypothesis is denoted as <span class="math inline">\(H_A\)</span>. The null hypothesis is the default assumption that there are no differences in the data, or that the data are operating under “standard” behaviors. On the other hand, the alternative hypothesis is the break from the “standard,” and we are trying to determine whether the data support it.</p>
+<p>Let’s review an example outside of survey data. If we are flipping a coin, a null hypothesis would be that the coin is fair and that each side has an equal chance of being flipped. In other words, the probability of the coin landing on each side is 1/2. An alternative hypothesis, by contrast, could be that the coin is unfair and that one side has a higher probability of being flipped (e.g., a probability of 1/4 to get heads but a probability of 3/4 to get tails). We write this set of hypotheses as:</p>
 <ul>
 <li><span class="math inline">\(H_0: \rho_{heads} = \rho_{tails}\)</span>, where <span class="math inline">\(\rho_{x}\)</span> is the probability of flipping the coin and having it land on heads (<span class="math inline">\(\rho_{heads}\)</span>) or tails (<span class="math inline">\(\rho_{tails}\)</span>)</li>
 <li><span class="math inline">\(H_A: \rho_{heads} \neq \rho_{tails}\)</span></li>
 </ul>
-<p>When we conduct hypothesis testing, the statistical models calculate a p-value, which shows how likely we are to observe the data if the null hypothesis is true. If the p-value (a probability between 0 and 1) is small, we have strong evidence to reject the null hypothesis as it is unlikely to see the data we are observing if the null hypothesis is true. However, if the p-value is large, we say we do not have evidence to reject the null hypothesis. The size of the p-value for this cut off is determined by type 1 error known as <span class="math inline">\(\alpha\)</span>. A common type 1 error value for statistical testing is to use <span class="math inline">\(\alpha = 0.05\)</span>.<a href="#fn12" class="footnote-ref" id="fnref12"><sup>12</sup></a> It is common for explanations of statistical testing to refer to confidence level. The confidence level is the inverse of the type 1 error. Thus, if <span class="math inline">\(\alpha = 0.05\)</span>, the confidence level would be 95%.</p>
-<p>The functions in the {survey} package allow for the correct estimation of the variances. This chapter will cover the following statistical tests with survey data and the following functions from the {survey} package<span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>:</p>
+<p>When we conduct hypothesis testing, the statistical models calculate a p-value, which shows how likely we are to observe data at least as extreme as ours if the null hypothesis is true. If the p-value (a probability between 0 and 1) is small, we have strong evidence to reject the null hypothesis as it is unlikely to see the data we observe if the null hypothesis is true. However, if the p-value is large, we say we do not have evidence to reject the null hypothesis. The cut-off for deciding whether a p-value is small is the Type 1 error rate, known as <span class="math inline">\(\alpha\)</span>. A common Type 1 error rate for statistical testing is <span class="math inline">\(\alpha = 0.05\)</span>.<a href="#fn12" class="footnote-ref" id="fnref12"><sup>12</sup></a> It is common for explanations of statistical testing to refer to confidence level. The confidence level is the complement of the Type 1 error rate, <span class="math inline">\(1 - \alpha\)</span>. Thus, if <span class="math inline">\(\alpha = 0.05\)</span>, the confidence level would be 95%.</p>
+<p>The functions in the {survey} package allow for the correct estimation of the variances. This chapter covers the following statistical tests with survey data and the following functions from the {survey} package <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>:</p>
 <ul>
-<li>Comparison of proportions <code>svyttest()</code></li>
-<li>Comparison of means <code>svyttest()</code></li>
-<li>Goodness of fit tests <code>svygofchisq()</code></li>
-<li>Tests of independence <code>svychisq()</code></li>
-<li>Tests of homogeneity <code>svychisq()</code></li>
+<li>Comparison of proportions (<code>svyttest()</code>)</li>
+<li>Comparison of means (<code>svyttest()</code>)</li>
+<li>Goodness of fit tests (<code>svygofchisq()</code>)</li>
+<li>Tests of independence (<code>svychisq()</code>)</li>
+<li>Tests of homogeneity (<code>svychisq()</code>)</li>
 </ul>
 </div>
 <div id="dot-notation" class="section level2 hasAnchor" number="6.2">
 <h2><span class="header-section-number">6.2</span> Dot notation<a href="c06-statistical-testing.html#dot-notation" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Up to this point, we have shown functions that use wrappers from the {srvyr} package. This means that the functions work with tidyverse syntax. However, the functions in this chapter do not have wrappers in the {srvyr} package and are instead used directly from the {survey} package. Therefore, the design object is <em>not</em> the first argument, and to use these functions with the magrittr pipe (<code>%&gt;%</code>) and tidyverse syntax, we will need to use dot (<code>.</code>) notation<a href="#fn13" class="footnote-ref" id="fnref13"><sup>13</sup></a></p>
-<p>Functions that work with the magrittr pipe (<code>%&gt;%</code>) have the data as the first argument. When we run a function with the pipe, it automatically places anything to the left of the pipe into the first argument of the function to the right of the pipe. For example, if we wanted to take the <code>mtcars</code> data and filter to cars with six cylinders, we can write the code in at least four different ways:</p>
+<p>Up to this point, we have shown functions that use wrappers from the {srvyr} package. This means that the functions work with tidyverse syntax. However, the functions in this chapter do not have wrappers in the {srvyr} package and are instead used directly from the {survey} package. Therefore, the design object is <em>not</em> the first argument, and to use these functions with the magrittr pipe (<code>%&gt;%</code>) and tidyverse syntax, we need to use dot (<code>.</code>) notation.<a href="#fn13" class="footnote-ref" id="fnref13"><sup>13</sup></a></p>
+<p>Functions that work with the magrittr pipe (<code>%&gt;%</code>) have the dataset as the first argument. When we run a function with the pipe, it automatically places anything to the left of the pipe into the first argument of the function to the right of the pipe. For example, if we want to take the <code>towny</code> data from the {gt} package and filter to municipalities with the Census Subdivision Type of “city”, we can write the code in at least four different ways:</p>
 <ol style="list-style-type: decimal">
-<li><code>filter(mtcars, cyl == 6)</code></li>
-<li><code>mtcars %&gt;% filter(cyl == 6)</code></li>
-<li><code>mtcars %&gt;% filter(., cyl == 6)</code></li>
-<li><code>mtcars %&gt;% filter(.data = ., cyl == 6)</code></li>
+<li><code>filter(towny, csd_type == "city")</code></li>
+<li><code>towny %&gt;% filter(csd_type == "city")</code></li>
+<li><code>towny %&gt;% filter(., csd_type == "city")</code></li>
+<li><code>towny %&gt;% filter(.data = ., csd_type == "city")</code></li>
 </ol>
-<p>Each of these lines of code will produce the same output since the argument that takes the data is in the first spot in <code>filter()</code>. The first two are probably familiar to those who have worked with the tidyverse. The third option functions the same way as the second one but is explicit that <code>mtcars</code> goes into the first argument, and the fourth option indicates that <code>mtcars</code> is going into the named argument of <code>.data</code>. Here, we are telling R to take what’s on the left side of the pipe (<code>mtcars</code>) and pipe it into the spot with the dot (<code>.</code>)—the first argument.</p>
+<p>Each of these lines of code produces the same output since the argument that takes the dataset is in the first spot in <code>filter()</code>. The first two are probably familiar to those who have worked with the tidyverse. The third option functions the same way as the second one but is explicit that <code>towny</code> goes into the first argument, and the fourth option indicates that <code>towny</code> is going into the named argument of <code>.data</code>. Here, we are telling R to take what is on the left side of the pipe (<code>towny</code>) and pipe it into the spot with the dot (<code>.</code>)—the first argument.</p>
 <p>In functions that are not part of the tidyverse, the data argument may not be in the first spot. For example, in <code>svyttest()</code>, the data argument is in the second spot, which means we need to place the dot (<code>.</code>) in the second spot and not the first. For example:</p>
-<div class="sourceCode" id="cb162"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb162-1"><a href="c06-statistical-testing.html#cb162-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
-<span id="cb162-2"><a href="c06-statistical-testing.html#cb162-2" tabindex="-1"></a> <span class="fu">svyttest</span>(x <span class="sc">~</span> y, .)</span></code></pre></div>
+<div class="sourceCode" id="cb160"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb160-1"><a href="c06-statistical-testing.html#cb160-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
+<span id="cb160-2"><a href="c06-statistical-testing.html#cb160-2" tabindex="-1"></a> <span class="fu">svyttest</span>(x <span class="sc">~</span> y, .)</span></code></pre></div>
 <p>By default, the pipe places the left-hand object in the first argument spot. Placing the dot (<code>.</code>) in the second argument spot indicates that the survey design object <code>svydata_des</code> should be used in the second argument and not the first.</p>
 <p>Alternatively, named arguments could be used to place the dot first as named arguments can appear at any location, as in the following:</p>
-<div class="sourceCode" id="cb163"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb163-1"><a href="c06-statistical-testing.html#cb163-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
-<span id="cb163-2"><a href="c06-statistical-testing.html#cb163-2" tabindex="-1"></a> <span class="fu">svyttest</span>(<span class="at">design =</span> ., x <span class="sc">~</span> y)</span></code></pre></div>
-<p>However, the following code will not work as the <code>svyttest()</code> function expects the formula as the first argument when arguments are not named:</p>
-<div class="sourceCode" id="cb164"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb164-1"><a href="c06-statistical-testing.html#cb164-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
-<span id="cb164-2"><a href="c06-statistical-testing.html#cb164-2" tabindex="-1"></a> <span class="fu">svyttest</span>(., x <span class="sc">~</span> y)</span></code></pre></div>
+<div class="sourceCode" id="cb161"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb161-1"><a href="c06-statistical-testing.html#cb161-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
+<span id="cb161-2"><a href="c06-statistical-testing.html#cb161-2" tabindex="-1"></a> <span class="fu">svyttest</span>(<span class="at">design =</span> ., x <span class="sc">~</span> y)</span></code></pre></div>
+<p>However, the following code does not work as the <code>svyttest()</code> function expects the formula as the first argument when arguments are not named:</p>
+<div class="sourceCode" id="cb162"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb162-1"><a href="c06-statistical-testing.html#cb162-1" tabindex="-1"></a>svydata_des <span class="sc">%&gt;%</span></span>
+<span id="cb162-2"><a href="c06-statistical-testing.html#cb162-2" tabindex="-1"></a> <span class="fu">svyttest</span>(., x <span class="sc">~</span> y)</span></code></pre></div>
 </div>
 <div id="stattest-ttest" class="section level2 hasAnchor" number="6.3">
 <h2><span class="header-section-number">6.3</span> Comparison of proportions and means<a href="c06-statistical-testing.html#stattest-ttest" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>We use t-tests to compare two proportions or means. T-tests allow us to determine if one proportion or mean is statistically different from another. They are commonly used to determine if a single estimate differs from a known value (e.g., 0 or 50%) or to compare two group means (e.g., North versus South). Comparing a single estimate to a known value is called a <em>one sample t-test</em>, and we can set up the hypothesis test as follows:</p>
+<p>We use t-tests to compare two proportions or means. T-tests allow us to determine if one proportion or mean is statistically different from another. They are commonly used to determine if a single estimate differs from a known value (e.g., 0 or 50%) or to compare two group means (e.g., North versus South). Comparing a single estimate to a known value is called a <em>one-sample t-test</em>, and we can set up the hypothesis test as follows:</p>
 <ul>
 <li><span class="math inline">\(H_0: \mu = 0\)</span> where <span class="math inline">\(\mu\)</span> is the mean outcome and <span class="math inline">\(0\)</span> is the value we are comparing it to</li>
 <li><span class="math inline">\(H_A: \mu \neq 0\)</span></li>
 </ul>
-<p>For comparing two estimates, this is called a <em>two-sample t-test</em> and we can set up the hypothesis test as follows:</p>
+<p>For comparing two estimates, this is called a <em>two-sample t-test</em>. We can set up the hypothesis test as follows:</p>
 <ul>
 <li><span class="math inline">\(H_0: \mu_1 = \mu_2\)</span> where <span class="math inline">\(\mu_i\)</span> is the mean outcome for group <span class="math inline">\(i\)</span></li>
 <li><span class="math inline">\(H_A: \mu_1 \neq \mu_2\)</span></li>
 </ul>
-<p>Two sample t-tests can also be <em>paired</em> or <em>unpaired</em>. If the data come from two different populations (e.g., North versus South), the t-test run will be an <em>unpaired</em> or <em>independent samples</em> t-test. <em>Paired</em> t-tests occur when the data come from the same population. This is commonly seen with data from the same population in two different time periods (e.g., before and after an intervention).</p>
-<p>The difference between t-tests with non-survey data and survey data is based on the underlying variance estimation difference. Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> provides a detailed overview of the math behind the mean and sampling error calculations for various sample designs. The functions in the {survey} package will account for these nuances, provided the design object is correctly defined.</p>
+<p>Two-sample t-tests can also be <em>paired</em> or <em>unpaired</em>. If the data come from two different populations (e.g., North versus South), the t-test run is an <em>unpaired</em> or <em>independent samples</em> t-test. <em>Paired</em> t-tests occur when the data come from the same population. This is commonly seen with data from the same population in two different time periods (e.g., before and after an intervention).</p>
+<p>The difference between t-tests with non-survey data and survey data is based on the underlying variance estimation difference. Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> provides a detailed overview of the math behind the mean and sampling error calculations for various sample designs. The functions in the {survey} package account for these nuances, provided the design object is correctly defined.</p>
 <div id="stattest-ttest-syntax" class="section level3 hasAnchor" number="6.3.1">
 <h3><span class="header-section-number">6.3.1</span> Syntax<a href="c06-statistical-testing.html#stattest-ttest-syntax" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>When we do not have survey data, we can use the <code>t.test()</code> function from the {stats} package. This function does not allow for weights or the variance structure that need to be accounted for with survey data. Therefore, we need to use the <code>svyttest()</code> function from {survey} when using survey data. Many of the arguments are the same between the two functions, but there are a few key differences:</p>
 <ul>
 <li>We need to use the survey design object instead of the original data frame</li>
 <li>We can only use a formula and not separate x and y data</li>
-<li>The confidence level cannot be specified and will always be set to 95%. However, we will show examples of how the confidence level can be changed after running the <code>svyttest()</code> function by using the <code>confint()</code> function.</li>
+<li>The confidence level cannot be specified and is always set to 95%. However, we show examples of how the confidence level can be changed after running the <code>svyttest()</code> function by using the <code>confint()</code> function.</li>
 </ul>
 <p>Here is the syntax for the <code>svyttest()</code> function:</p>
-<div class="sourceCode" id="cb165"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb165-1"><a href="c06-statistical-testing.html#cb165-1" tabindex="-1"></a><span class="fu">svyttest</span>(formula,</span>
-<span id="cb165-2"><a href="c06-statistical-testing.html#cb165-2" tabindex="-1"></a>         design,</span>
-<span id="cb165-3"><a href="c06-statistical-testing.html#cb165-3" tabindex="-1"></a>         ...)</span></code></pre></div>
+<div class="sourceCode" id="cb163"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb163-1"><a href="c06-statistical-testing.html#cb163-1" tabindex="-1"></a><span class="fu">svyttest</span>(formula,</span>
+<span id="cb163-2"><a href="c06-statistical-testing.html#cb163-2" tabindex="-1"></a>         design,</span>
+<span id="cb163-3"><a href="c06-statistical-testing.html#cb163-3" tabindex="-1"></a>         ...)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>formula</code>: Formula, <code>outcome~group</code> for two-sample, <code>outcome~0</code> or <code>outcome~1</code> for one-sample. The group variable must be a factor or character with two levels, or be coded 0/1 or 1/2. We give more details on formula set-up below for different types of tests.</li>
@@ -634,7 +634,7 @@ <h3><span class="header-section-number">6.3.1</span> Syntax<a href="c06-statisti
 <li><strong>One-sample t-test:</strong>
 <ol style="list-style-type: lower-alpha">
 <li><strong>Comparison to 0:</strong> <code>var ~ 0</code>, where <code>var</code> is the measure of interest, and we compare it to the value <code>0</code>. For example, we could test if the population mean of household debt is different from <code>0</code> given the sample data collected.</li>
-<li><strong>Comparison to a different value:</strong> <code>var - value ~ 0</code>, where <code>var</code> is the measure of interest and <code>value</code> is what we are comparing to. For example, we could test if the proportion of the population that has blue eyes is different from <code>25%</code> by using <code>var - 0.25 ~ 0</code>. Note that specifying the formula as <code>var ~ 0.25</code> is not equivalent and will result in a syntax error.</li>
+<li><strong>Comparison to a different value:</strong> <code>var - value ~ 0</code>, where <code>var</code> is the measure of interest and <code>value</code> is what we are comparing to. For example, we could test if the proportion of the population that has blue eyes is different from <code>25%</code> by using <code>var - 0.25 ~ 0</code>. Note that specifying the formula as <code>var ~ 0.25</code> is not equivalent and results in a syntax error.</li>
 </ol></li>
 <li><strong>Two-sample t-test:</strong>
 <ol style="list-style-type: lower-alpha">
@@ -643,10 +643,10 @@ <h3><span class="header-section-number">6.3.1</span> Syntax<a href="c06-statisti
 <li><strong>2 level grouping variable:</strong> <code>var ~ groupVar</code>, where <code>var</code> is the measure of interest and <code>groupVar</code> is a variable with two categories. For example, we could test if the average age of the population who voted for president in 2020 differed from the age of people who did not vote. In this case, age would be used for <code>var</code>, and a binary variable indicating voting activity would be the <code>groupVar</code>.</li>
 <li><strong>3+ level grouping variable:</strong> <code>var ~ groupVar == level</code>, where <code>var</code> is the measure of interest, <code>groupVar</code> is the categorical variable, and <code>level</code> is the category level to isolate. For example, we could test if the test scores in one classroom differed from all other classrooms where <code>groupVar</code> would be the variable holding the values for classroom IDs and <code>level</code> is the classroom ID we want to compare to the others.</li>
 </ul></li>
-<li><strong>Paired:</strong> <code>var_1 - var_2 ~ 0</code>, where <code>var_1</code> is the first variable of interest and <code>var_2</code> is the second variable of interest. For example, we could test if test scores on a subject differed between the start and the end of a course so <code>var_1</code> would be the test score at the beginning of the course and <code>var_2</code> would be the score at the end of the course.</li>
+<li><strong>Paired:</strong> <code>var_1 - var_2 ~ 0</code>, where <code>var_1</code> is the first variable of interest and <code>var_2</code> is the second variable of interest. For example, we could test if test scores on a subject differed between the start and the end of a course, so <code>var_1</code> would be the test score at the beginning of the course, and <code>var_2</code> would be the score at the end of the course.</li>
 </ol></li>
 </ol>
-<p>The <code>na.rm</code> argument defaults to <code>FALSE</code>, which means if any data is missing, the t-test will not compute. Throughout this chapter, we will always set <code>na.rm = TRUE</code>, but before analyzing the survey data, review the notes provided in Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> to better understand how to handle missing data.</p>
+<p>The <code>na.rm</code> argument defaults to <code>FALSE</code>, which means if any data values are missing, the t-test does not compute. Throughout this chapter, we always set <code>na.rm = TRUE</code>, but before analyzing the survey data, review the notes provided in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> to better understand how to handle missing data.</p>
 <p>Let’s walk through a few examples using the ANES and RECS data.</p>
 </div>
 <div id="stattest-ttest-examples" class="section level3 hasAnchor" number="6.3.2">
@@ -659,14 +659,14 @@ <h4>Example 1: One-sample t-test for mean<a href="c06-statistical-testing.html#s
 <li><span class="math inline">\(H_A: \mu \neq 68\)</span></li>
 </ul>
 <p>To conduct this in R, we use <code>svyttest()</code> and subtract the temperature on the left-hand side of the formula:</p>
-<div class="sourceCode" id="cb166"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb166-1"><a href="c06-statistical-testing.html#cb166-1" tabindex="-1"></a>ttest_ex1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb166-2"><a href="c06-statistical-testing.html#cb166-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
-<span id="cb166-3"><a href="c06-statistical-testing.html#cb166-3" tabindex="-1"></a>    <span class="at">formula =</span> SummerTempNight <span class="sc">-</span> <span class="dv">68</span> <span class="sc">~</span> <span class="dv">0</span>,</span>
-<span id="cb166-4"><a href="c06-statistical-testing.html#cb166-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb166-5"><a href="c06-statistical-testing.html#cb166-5" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb166-6"><a href="c06-statistical-testing.html#cb166-6" tabindex="-1"></a>  )</span>
-<span id="cb166-7"><a href="c06-statistical-testing.html#cb166-7" tabindex="-1"></a></span>
-<span id="cb166-8"><a href="c06-statistical-testing.html#cb166-8" tabindex="-1"></a>ttest_ex1</span></code></pre></div>
+<div class="sourceCode" id="cb164"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb164-1"><a href="c06-statistical-testing.html#cb164-1" tabindex="-1"></a>ttest_ex1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb164-2"><a href="c06-statistical-testing.html#cb164-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
+<span id="cb164-3"><a href="c06-statistical-testing.html#cb164-3" tabindex="-1"></a>    <span class="at">formula =</span> SummerTempNight <span class="sc">-</span> <span class="dv">68</span> <span class="sc">~</span> <span class="dv">0</span>,</span>
+<span id="cb164-4"><a href="c06-statistical-testing.html#cb164-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb164-5"><a href="c06-statistical-testing.html#cb164-5" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb164-6"><a href="c06-statistical-testing.html#cb164-6" tabindex="-1"></a>  )</span>
+<span id="cb164-7"><a href="c06-statistical-testing.html#cb164-7" tabindex="-1"></a></span>
+<span id="cb164-8"><a href="c06-statistical-testing.html#cb164-8" tabindex="-1"></a>ttest_ex1</span></code></pre></div>
 <pre><code>## 
 ##  Design-based one-sample t-test
 ## 
@@ -680,24 +680,29 @@ <h4>Example 1: One-sample t-test for mean<a href="c06-statistical-testing.html#s
 ## 3.367</code></pre>
 <p>To pull out specific output, we can use R’s built-in <code>$</code> operator. For instance, to obtain the estimate <span class="math inline">\(\mu - 68\)</span>, we run <code>ttest_ex1$estimate</code>.</p>
 <p>If we want the average, we take our t-test estimate and add it to 68:</p>
-<div class="sourceCode" id="cb168"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb168-1"><a href="c06-statistical-testing.html#cb168-1" tabindex="-1"></a>ttest_ex1<span class="sc">$</span>estimate <span class="sc">+</span> <span class="dv">68</span></span></code></pre></div>
+<div class="sourceCode" id="cb166"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb166-1"><a href="c06-statistical-testing.html#cb166-1" tabindex="-1"></a>ttest_ex1<span class="sc">$</span>estimate <span class="sc">+</span> <span class="dv">68</span></span></code></pre></div>
 <pre><code>##  mean 
 ## 71.37</code></pre>
 <p>Or, we can use the <code>survey_mean()</code> function described in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a>:</p>
-<div class="sourceCode" id="cb170"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb170-1"><a href="c06-statistical-testing.html#cb170-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb170-2"><a href="c06-statistical-testing.html#cb170-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">mu =</span> <span class="fu">survey_mean</span>(SummerTempNight, <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb168"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb168-1"><a href="c06-statistical-testing.html#cb168-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb168-2"><a href="c06-statistical-testing.html#cb168-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">mu =</span> <span class="fu">survey_mean</span>(SummerTempNight, <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##      mu  mu_se
 ##   &lt;dbl&gt;  &lt;dbl&gt;
 ## 1  71.4 0.0397</code></pre>
 <p>The result is the same in both methods, so we see that the average temperature U.S. households set their thermostat to in the summer at night is 71.4<span class="math inline">\(^\circ\)</span>F. Looking at the output from <code>svyttest()</code>, the t-statistic is 84.8, and the p-value is <span class="math inline">\(&lt;0.0001\)</span>, indicating that the average is statistically different from 68<span class="math inline">\(^\circ\)</span>F at an <span class="math inline">\(\alpha\)</span> level of <span class="math inline">\(0.05\)</span>.</p>
-<p>If we want an 80% confidence interval for the test statistic, we can use the function <code>confint()</code> to change the confidence level. Below, we print both the original 95% confidence interval and the 80% confidence interval:</p>
-<div class="sourceCode" id="cb172"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb172-1"><a href="c06-statistical-testing.html#cb172-1" tabindex="-1"></a><span class="fu">confint</span>(ttest_ex1, <span class="at">level =</span> <span class="fl">0.95</span>)</span></code></pre></div>
+<p>If we want an 80% confidence interval for the estimated difference, we can use the function <code>confint()</code> to change the confidence level. Below, we print the default confidence interval (95%), the confidence interval with the level explicitly specified as 95%, and the 80% confidence interval. When the level is 95%, whether by default or specified explicitly, R returns the interval with both row and column names. However, when we specify any other confidence level, an unnamed vector is returned, with the first element being the lower bound and the second element being the upper bound of the confidence interval.</p>
+<div class="sourceCode" id="cb170"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb170-1"><a href="c06-statistical-testing.html#cb170-1" tabindex="-1"></a><span class="fu">confint</span>(ttest_ex1)</span></code></pre></div>
+<pre><code>##                                  2.5 % 97.5 %
+## as.numeric(SummerTempNight - 68) 3.288  3.447
+## attr(,&quot;conf.level&quot;)
+## [1] 0.95</code></pre>
+<div class="sourceCode" id="cb172"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb172-1"><a href="c06-statistical-testing.html#cb172-1" tabindex="-1"></a><span class="fu">confint</span>(ttest_ex1, <span class="at">level =</span> <span class="fl">0.95</span>)</span></code></pre></div>
 <pre><code>##                                  2.5 % 97.5 %
 ## as.numeric(SummerTempNight - 68) 3.288  3.447
 ## attr(,&quot;conf.level&quot;)
 ## [1] 0.95</code></pre>
-<div class="sourceCode" id="cb174"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb174-1"><a href="c06-statistical-testing.html#cb174-1" tabindex="-1"></a><span class="fu">confint</span>(ttest_ex1, <span class="at">level =</span> <span class="fl">0.8</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb174"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb174-1"><a href="c06-statistical-testing.html#cb174-1" tabindex="-1"></a><span class="fu">confint</span>(ttest_ex1, <span class="at">level =</span> <span class="fl">0.8</span>)</span></code></pre></div>
 <pre><code>## [1] 3.316 3.419
 ## attr(,&quot;conf.level&quot;)
 ## [1] 0.8</code></pre>
@@ -705,7 +710,7 @@ <h4>Example 1: One-sample t-test for mean<a href="c06-statistical-testing.html#s
 </div>
 <div id="stattest-ttest-ex2" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.html#stattest-ttest-ex2" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>RECS asked respondents if they use any air conditioning (AC) in their home.<a href="#fn16" class="footnote-ref" id="fnref16"><sup>16</sup></a> In our data, we call this variable <code>ACUsed</code>. Let’s look at the proportion of U.S. households that use AC in their homes using the <code>survey_prop()</code> function we learned in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a>.</p>
+<p>RECS asked respondents if they use air conditioning (A/C) in their home.<a href="#fn16" class="footnote-ref" id="fnref16"><sup>16</sup></a> In our data, we call this variable <code>ACUsed</code>. Let’s look at the proportion of U.S. households that use A/C in their homes using the <code>survey_prop()</code> function we learned in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a>.</p>
 <div class="sourceCode" id="cb176"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb176-1"><a href="c06-statistical-testing.html#cb176-1" tabindex="-1"></a>acprop <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb176-2"><a href="c06-statistical-testing.html#cb176-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ACUsed) <span class="sc">%&gt;%</span></span>
 <span id="cb176-3"><a href="c06-statistical-testing.html#cb176-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>())</span>
@@ -716,9 +721,9 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
 ##   &lt;lgl&gt;  &lt;dbl&gt;   &lt;dbl&gt;
 ## 1 FALSE  0.113 0.00306
 ## 2 TRUE   0.887 0.00306</code></pre>
-<p>Based on this, 88.7% of U.S. households use AC in their homes. If we wanted to know if this differs from 90%, we could set up our hypothesis as follows:</p>
+<p>Based on this, 88.7% of U.S. households use A/C in their homes. If we wanted to know if this differs from 90%, we could set up our hypothesis as follows:</p>
 <ul>
-<li><span class="math inline">\(H_0: p = 0.90\)</span> where <span class="math inline">\(p\)</span> is the proportion of the U.S. households that use AC in their homes</li>
+<li><span class="math inline">\(H_0: p = 0.90\)</span> where <span class="math inline">\(p\)</span> is the proportion of U.S. households that use A/C in their homes</li>
 <li><span class="math inline">\(H_A: p \neq 0.90\)</span></li>
 </ul>
 <p>To conduct this in R, we use the <code>svyttest()</code> function as follows:</p>
@@ -748,29 +753,29 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
 ##      &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;       
 ## 1  -0.0135     -4.40 0.0000466        58  -0.0196  -0.00735 Design-base…
 ## # ℹ 1 more variable: alternative &lt;chr&gt;</code></pre>
-<p>The ‘tidied’ output can also be piped into the {gt} package to create a table ready for publication. We go over the {gt} package in Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>. The function <code>pretty_p_value()</code> comes from the {prettyunits} package and converts numeric p-values to characters and, by default prints four decimal places and displays any p-value less than 0.0001 as <code>"&lt;0.0001"</code> though another minimum display p-value can be specified <span class="citation">(<a href="#ref-R-prettyunits">Csardi 2023</a>)</span>.</p>
+<p>The ‘tidied’ output can also be piped into the {gt} package to create a table ready for publication. We go over the {gt} package in Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>. The function <code>pretty_p_value()</code> comes from the {prettyunits} package and converts numeric p-values to characters. By default, it prints four decimal places and displays any p-value less than 0.0001 as <code>"&lt;0.0001"</code>, though another minimum display p-value can be specified <span class="citation">(<a href="#ref-R-prettyunits">Csardi 2023</a>)</span>.</p>
 <div class="sourceCode" id="cb182"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb182-1"><a href="c06-statistical-testing.html#cb182-1" tabindex="-1"></a><span class="fu">tidy</span>(ttest_ex2) <span class="sc">%&gt;%</span></span>
 <span id="cb182-2"><a href="c06-statistical-testing.html#cb182-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value =</span> <span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
 <span id="cb182-3"><a href="c06-statistical-testing.html#cb182-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb182-4"><a href="c06-statistical-testing.html#cb182-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="lqgijczpsn" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#lqgijczpsn table {
+<div id="lapnrnbvkd" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#lapnrnbvkd table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#lqgijczpsn thead, #lqgijczpsn tbody, #lqgijczpsn tfoot, #lqgijczpsn tr, #lqgijczpsn td, #lqgijczpsn th {
+#lapnrnbvkd thead, #lapnrnbvkd tbody, #lapnrnbvkd tfoot, #lapnrnbvkd tr, #lapnrnbvkd td, #lapnrnbvkd th {
   border-style: none;
 }
 
-#lqgijczpsn p {
+#lapnrnbvkd p {
   margin: 0;
   padding: 0;
 }
 
-#lqgijczpsn .gt_table {
+#lapnrnbvkd .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -796,12 +801,12 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-left-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_caption {
+#lapnrnbvkd .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#lqgijczpsn .gt_title {
+#lapnrnbvkd .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -813,7 +818,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-bottom-width: 0;
 }
 
-#lqgijczpsn .gt_subtitle {
+#lapnrnbvkd .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -825,7 +830,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-top-width: 0;
 }
 
-#lqgijczpsn .gt_heading {
+#lapnrnbvkd .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -837,13 +842,13 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-right-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_bottom_border {
+#lapnrnbvkd .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_col_headings {
+#lapnrnbvkd .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -858,7 +863,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-right-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_col_heading {
+#lapnrnbvkd .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -878,7 +883,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   overflow-x: hidden;
 }
 
-#lqgijczpsn .gt_column_spanner_outer {
+#lapnrnbvkd .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -890,15 +895,15 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 4px;
 }
 
-#lqgijczpsn .gt_column_spanner_outer:first-child {
+#lapnrnbvkd .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#lqgijczpsn .gt_column_spanner_outer:last-child {
+#lapnrnbvkd .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#lqgijczpsn .gt_column_spanner {
+#lapnrnbvkd .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -910,11 +915,11 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   width: 100%;
 }
 
-#lqgijczpsn .gt_spanner_row {
+#lapnrnbvkd .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#lqgijczpsn .gt_group_heading {
+#lapnrnbvkd .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -940,7 +945,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   text-align: left;
 }
 
-#lqgijczpsn .gt_empty_group_heading {
+#lapnrnbvkd .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -955,15 +960,15 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   vertical-align: middle;
 }
 
-#lqgijczpsn .gt_from_md > :first-child {
+#lapnrnbvkd .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#lqgijczpsn .gt_from_md > :last-child {
+#lapnrnbvkd .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#lqgijczpsn .gt_row {
+#lapnrnbvkd .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -982,7 +987,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   overflow-x: hidden;
 }
 
-#lqgijczpsn .gt_stub {
+#lapnrnbvkd .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -995,7 +1000,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 5px;
 }
 
-#lqgijczpsn .gt_stub_row_group {
+#lapnrnbvkd .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1009,15 +1014,15 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   vertical-align: top;
 }
 
-#lqgijczpsn .gt_row_group_first td {
+#lapnrnbvkd .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#lqgijczpsn .gt_row_group_first th {
+#lapnrnbvkd .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#lqgijczpsn .gt_summary_row {
+#lapnrnbvkd .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1027,16 +1032,16 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 5px;
 }
 
-#lqgijczpsn .gt_first_summary_row {
+#lapnrnbvkd .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_first_summary_row.thick {
+#lapnrnbvkd .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#lqgijczpsn .gt_last_summary_row {
+#lapnrnbvkd .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1046,7 +1051,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-bottom-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_grand_summary_row {
+#lapnrnbvkd .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1056,7 +1061,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 5px;
 }
 
-#lqgijczpsn .gt_first_grand_summary_row {
+#lapnrnbvkd .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1066,7 +1071,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-top-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_last_grand_summary_row_top {
+#lapnrnbvkd .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1076,11 +1081,11 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-bottom-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_striped {
+#lapnrnbvkd .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#lqgijczpsn .gt_table_body {
+#lapnrnbvkd .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1089,7 +1094,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-bottom-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_footnotes {
+#lapnrnbvkd .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1103,7 +1108,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-right-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_footnote {
+#lapnrnbvkd .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1112,7 +1117,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 5px;
 }
 
-#lqgijczpsn .gt_sourcenotes {
+#lapnrnbvkd .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1126,7 +1131,7 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   border-right-color: #D3D3D3;
 }
 
-#lqgijczpsn .gt_sourcenote {
+#lapnrnbvkd .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1134,68 +1139,68 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   padding-right: 5px;
 }
 
-#lqgijczpsn .gt_left {
+#lapnrnbvkd .gt_left {
   text-align: left;
 }
 
-#lqgijczpsn .gt_center {
+#lapnrnbvkd .gt_center {
   text-align: center;
 }
 
-#lqgijczpsn .gt_right {
+#lapnrnbvkd .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#lqgijczpsn .gt_font_normal {
+#lapnrnbvkd .gt_font_normal {
   font-weight: normal;
 }
 
-#lqgijczpsn .gt_font_bold {
+#lapnrnbvkd .gt_font_bold {
   font-weight: bold;
 }
 
-#lqgijczpsn .gt_font_italic {
+#lapnrnbvkd .gt_font_italic {
   font-style: italic;
 }
 
-#lqgijczpsn .gt_super {
+#lapnrnbvkd .gt_super {
   font-size: 65%;
 }
 
-#lqgijczpsn .gt_footnote_marks {
+#lapnrnbvkd .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#lqgijczpsn .gt_asterisk {
+#lapnrnbvkd .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#lqgijczpsn .gt_indent_1 {
+#lapnrnbvkd .gt_indent_1 {
   text-indent: 5px;
 }
 
-#lqgijczpsn .gt_indent_2 {
+#lapnrnbvkd .gt_indent_2 {
   text-indent: 10px;
 }
 
-#lqgijczpsn .gt_indent_3 {
+#lapnrnbvkd .gt_indent_3 {
   text-indent: 15px;
 }
 
-#lqgijczpsn .gt_indent_4 {
+#lapnrnbvkd .gt_indent_4 {
   text-indent: 20px;
 }
 
-#lqgijczpsn .gt_indent_5 {
+#lapnrnbvkd .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:stattest-ttest-ex2-gt-tab">TABLE 6.1: </span>One-sample t-test output for estimates of U.S. households use AC in their homes differing from 90%, RECS 2020</caption>
+  <caption><span id="tab:stattest-ttest-ex2-gt-tab">TABLE 6.1: </span>One-sample t-test output for whether the proportion of U.S. households that use A/C in their homes differs from 90%, RECS 2020</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -1223,13 +1228,13 @@ <h4>Example 2: One-sample t-test for proportion<a href="c06-statistical-testing.
   
 </table>
 </div>
-<p>The estimate differs from Example 1 in that the estimate is not displaying <span class="math inline">\(\mu - 0.90\)</span> but rather <span class="math inline">\(\mu\)</span>, or the difference between the U.S. households that use AC and the proportion we are comparing to. We can see that there is a difference of -1.35 percentage points. Additionally, the t-statistic value in the <code>statistic</code> column is -4.4, and the p-value is &lt;0.0001. These results indicate that the fewer than 90% of U.S. households use AC in their homes.</p>
+<p>The estimate differs from Example 1 in that it does not display <span class="math inline">\(p - 0.90\)</span> but rather <span class="math inline">\(p\)</span>, or the difference between the proportion of U.S. households that use A/C and the proportion we are comparing to. We can see that there is a difference of -1.35 percentage points. Additionally, the t-statistic value in the <code>statistic</code> column is -4.4, and the p-value is &lt;0.0001. These results indicate that fewer than 90% of U.S. households use A/C in their homes.</p>
 </div>
 <div id="stattest-ttest-ex3" class="section level4 unnumbered hasAnchor">
 <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#stattest-ttest-ex3" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Two additional variables in the RECS data are the electric bill cost (<code>DOLLAREL</code>) and whether the house used AC or not (<code>ACUsed</code>).<a href="#fn17" class="footnote-ref" id="fnref17"><sup>17</sup></a> If we want to know if the U.S. households that used AC had higher electrical bills compared to those that did not, we could set up the hypothesis as follows:</p>
+<p>Two additional variables in the RECS data are the electric bill cost (<code>DOLLAREL</code>) and whether the house used A/C or not (<code>ACUsed</code>).<a href="#fn17" class="footnote-ref" id="fnref17"><sup>17</sup></a> If we want to know if the U.S. households that used A/C had higher electrical bills compared to those that did not, we could set up the hypothesis as follows:</p>
 <ul>
-<li><span class="math inline">\(H_0: \mu_{AC} = \mu_{noAC}\)</span> where <span class="math inline">\(\mu_{AC}\)</span> is the electrical bill cost for U.S. households that used AC and <span class="math inline">\(\mu_{noAC}\)</span> is the electrical bill cost for U.S. households that did not use AC</li>
+<li><span class="math inline">\(H_0: \mu_{AC} = \mu_{noAC}\)</span> where <span class="math inline">\(\mu_{AC}\)</span> is the electrical bill cost for U.S. households that used A/C and <span class="math inline">\(\mu_{noAC}\)</span> is the electrical bill cost for U.S. households that did not use A/C</li>
 <li><span class="math inline">\(H_A: \mu_{AC} \neq \mu_{noAC}\)</span></li>
 </ul>
 <p>Let’s take a quick look at the data to see the format the data are in:</p>
@@ -1251,23 +1256,23 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
 <span id="cb186-3"><a href="c06-statistical-testing.html#cb186-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb186-4"><a href="c06-statistical-testing.html#cb186-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="hpkosvtmzq" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#hpkosvtmzq table {
+<div id="tbdsbanjdr" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#tbdsbanjdr table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#hpkosvtmzq thead, #hpkosvtmzq tbody, #hpkosvtmzq tfoot, #hpkosvtmzq tr, #hpkosvtmzq td, #hpkosvtmzq th {
+#tbdsbanjdr thead, #tbdsbanjdr tbody, #tbdsbanjdr tfoot, #tbdsbanjdr tr, #tbdsbanjdr td, #tbdsbanjdr th {
   border-style: none;
 }
 
-#hpkosvtmzq p {
+#tbdsbanjdr p {
   margin: 0;
   padding: 0;
 }
 
-#hpkosvtmzq .gt_table {
+#tbdsbanjdr .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1293,12 +1298,12 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-left-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_caption {
+#tbdsbanjdr .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#hpkosvtmzq .gt_title {
+#tbdsbanjdr .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1310,7 +1315,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-bottom-width: 0;
 }
 
-#hpkosvtmzq .gt_subtitle {
+#tbdsbanjdr .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1322,7 +1327,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-top-width: 0;
 }
 
-#hpkosvtmzq .gt_heading {
+#tbdsbanjdr .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1334,13 +1339,13 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-right-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_bottom_border {
+#tbdsbanjdr .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_col_headings {
+#tbdsbanjdr .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1355,7 +1360,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-right-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_col_heading {
+#tbdsbanjdr .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1375,7 +1380,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   overflow-x: hidden;
 }
 
-#hpkosvtmzq .gt_column_spanner_outer {
+#tbdsbanjdr .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1387,15 +1392,15 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 4px;
 }
 
-#hpkosvtmzq .gt_column_spanner_outer:first-child {
+#tbdsbanjdr .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#hpkosvtmzq .gt_column_spanner_outer:last-child {
+#tbdsbanjdr .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#hpkosvtmzq .gt_column_spanner {
+#tbdsbanjdr .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1407,11 +1412,11 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   width: 100%;
 }
 
-#hpkosvtmzq .gt_spanner_row {
+#tbdsbanjdr .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#hpkosvtmzq .gt_group_heading {
+#tbdsbanjdr .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1437,7 +1442,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   text-align: left;
 }
 
-#hpkosvtmzq .gt_empty_group_heading {
+#tbdsbanjdr .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1452,15 +1457,15 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   vertical-align: middle;
 }
 
-#hpkosvtmzq .gt_from_md > :first-child {
+#tbdsbanjdr .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#hpkosvtmzq .gt_from_md > :last-child {
+#tbdsbanjdr .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#hpkosvtmzq .gt_row {
+#tbdsbanjdr .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1479,7 +1484,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   overflow-x: hidden;
 }
 
-#hpkosvtmzq .gt_stub {
+#tbdsbanjdr .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1492,7 +1497,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 5px;
 }
 
-#hpkosvtmzq .gt_stub_row_group {
+#tbdsbanjdr .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1506,15 +1511,15 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   vertical-align: top;
 }
 
-#hpkosvtmzq .gt_row_group_first td {
+#tbdsbanjdr .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#hpkosvtmzq .gt_row_group_first th {
+#tbdsbanjdr .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#hpkosvtmzq .gt_summary_row {
+#tbdsbanjdr .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1524,16 +1529,16 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 5px;
 }
 
-#hpkosvtmzq .gt_first_summary_row {
+#tbdsbanjdr .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_first_summary_row.thick {
+#tbdsbanjdr .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#hpkosvtmzq .gt_last_summary_row {
+#tbdsbanjdr .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1543,7 +1548,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-bottom-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_grand_summary_row {
+#tbdsbanjdr .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1553,7 +1558,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 5px;
 }
 
-#hpkosvtmzq .gt_first_grand_summary_row {
+#tbdsbanjdr .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1563,7 +1568,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-top-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_last_grand_summary_row_top {
+#tbdsbanjdr .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1573,11 +1578,11 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-bottom-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_striped {
+#tbdsbanjdr .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#hpkosvtmzq .gt_table_body {
+#tbdsbanjdr .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1586,7 +1591,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-bottom-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_footnotes {
+#tbdsbanjdr .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1600,7 +1605,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-right-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_footnote {
+#tbdsbanjdr .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1609,7 +1614,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 5px;
 }
 
-#hpkosvtmzq .gt_sourcenotes {
+#tbdsbanjdr .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1623,7 +1628,7 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   border-right-color: #D3D3D3;
 }
 
-#hpkosvtmzq .gt_sourcenote {
+#tbdsbanjdr .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1631,68 +1636,68 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   padding-right: 5px;
 }
 
-#hpkosvtmzq .gt_left {
+#tbdsbanjdr .gt_left {
   text-align: left;
 }
 
-#hpkosvtmzq .gt_center {
+#tbdsbanjdr .gt_center {
   text-align: center;
 }
 
-#hpkosvtmzq .gt_right {
+#tbdsbanjdr .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#hpkosvtmzq .gt_font_normal {
+#tbdsbanjdr .gt_font_normal {
   font-weight: normal;
 }
 
-#hpkosvtmzq .gt_font_bold {
+#tbdsbanjdr .gt_font_bold {
   font-weight: bold;
 }
 
-#hpkosvtmzq .gt_font_italic {
+#tbdsbanjdr .gt_font_italic {
   font-style: italic;
 }
 
-#hpkosvtmzq .gt_super {
+#tbdsbanjdr .gt_super {
   font-size: 65%;
 }
 
-#hpkosvtmzq .gt_footnote_marks {
+#tbdsbanjdr .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#hpkosvtmzq .gt_asterisk {
+#tbdsbanjdr .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#hpkosvtmzq .gt_indent_1 {
+#tbdsbanjdr .gt_indent_1 {
   text-indent: 5px;
 }
 
-#hpkosvtmzq .gt_indent_2 {
+#tbdsbanjdr .gt_indent_2 {
   text-indent: 10px;
 }
 
-#hpkosvtmzq .gt_indent_3 {
+#tbdsbanjdr .gt_indent_3 {
   text-indent: 15px;
 }
 
-#hpkosvtmzq .gt_indent_4 {
+#tbdsbanjdr .gt_indent_4 {
   text-indent: 20px;
 }
 
-#hpkosvtmzq .gt_indent_5 {
+#tbdsbanjdr .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:stattest-ttest-ex3-gt-tab">TABLE 6.2: </span>Unpaired two-sample t-test output for estimates of U.S. households electrical bills by AC use, RECS 2020</caption>
+  <caption><span id="tab:stattest-ttest-ex3-gt-tab">TABLE 6.2: </span>Unpaired two-sample t-test output for estimates of U.S. household electrical bills by A/C use, RECS 2020</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -1720,11 +1725,11 @@ <h4>Example 3: Unpaired two-sample t-test<a href="c06-statistical-testing.html#s
   
 </table>
 </div>
-<p>The results indicate that the difference in electrical bills for those that used AC and those that did not is, on average, $365.72. The difference appears to be statistically significant as the t-statistic is 21.3 and the p-value is <span class="math inline">\(&lt;0.0001\)</span>. Households that used AC spent, on average, $365.72 more in 2020 on electricity than households without AC.</p>
+<p>The results indicate that the difference in electrical bills for those who used A/C and those who did not is, on average, $365.72. The difference appears to be statistically significant as the t-statistic is 21.3 and the p-value is <span class="math inline">\(&lt;0.0001\)</span>. Households that used A/C spent, on average, $365.72 more in 2020 on electricity than households without A/C.</p>
 </div>
 <div id="stattest-ttest-ex4" class="section level4 unnumbered hasAnchor">
 <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#stattest-ttest-ex4" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let’s say we want to test whether the temperature that U.S. households set their thermostat at night differs depending on the season (comparing summer<a href="#fn18" class="footnote-ref" id="fnref18"><sup>18</sup></a> and winter<a href="#fn19" class="footnote-ref" id="fnref19"><sup>19</sup></a> temperatures). We could set up the hypothesis as follows:</p>
+<p>Let’s say we want to test whether the temperature at which U.S. households set their thermostat at night differs depending on the season (comparing summer<a href="#fn18" class="footnote-ref" id="fnref18"><sup>18</sup></a> and winter<a href="#fn19" class="footnote-ref" id="fnref19"><sup>19</sup></a> temperatures). We could set up the hypothesis as follows:</p>
 <ul>
 <li><span class="math inline">\(H_0: \mu_{summer} = \mu_{winter}\)</span> where <span class="math inline">\(\mu_{summer}\)</span> is the temperature that U.S. households set their thermostat to during summer nights, and <span class="math inline">\(\mu_{winter}\)</span> is the temperature that U.S. households set their thermostat to during winter nights</li>
 <li><span class="math inline">\(H_A: \mu_{summer} \neq \mu_{winter}\)</span></li>
@@ -1741,23 +1746,23 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
 <span id="cb188-3"><a href="c06-statistical-testing.html#cb188-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb188-4"><a href="c06-statistical-testing.html#cb188-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="rsccxkcyhx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#rsccxkcyhx table {
+<div id="ckowdxwsja" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#ckowdxwsja table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#rsccxkcyhx thead, #rsccxkcyhx tbody, #rsccxkcyhx tfoot, #rsccxkcyhx tr, #rsccxkcyhx td, #rsccxkcyhx th {
+#ckowdxwsja thead, #ckowdxwsja tbody, #ckowdxwsja tfoot, #ckowdxwsja tr, #ckowdxwsja td, #ckowdxwsja th {
   border-style: none;
 }
 
-#rsccxkcyhx p {
+#ckowdxwsja p {
   margin: 0;
   padding: 0;
 }
 
-#rsccxkcyhx .gt_table {
+#ckowdxwsja .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1783,12 +1788,12 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-left-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_caption {
+#ckowdxwsja .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#rsccxkcyhx .gt_title {
+#ckowdxwsja .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1800,7 +1805,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-bottom-width: 0;
 }
 
-#rsccxkcyhx .gt_subtitle {
+#ckowdxwsja .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1812,7 +1817,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-top-width: 0;
 }
 
-#rsccxkcyhx .gt_heading {
+#ckowdxwsja .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1824,13 +1829,13 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-right-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_bottom_border {
+#ckowdxwsja .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_col_headings {
+#ckowdxwsja .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1845,7 +1850,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-right-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_col_heading {
+#ckowdxwsja .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1865,7 +1870,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   overflow-x: hidden;
 }
 
-#rsccxkcyhx .gt_column_spanner_outer {
+#ckowdxwsja .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1877,15 +1882,15 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 4px;
 }
 
-#rsccxkcyhx .gt_column_spanner_outer:first-child {
+#ckowdxwsja .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#rsccxkcyhx .gt_column_spanner_outer:last-child {
+#ckowdxwsja .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#rsccxkcyhx .gt_column_spanner {
+#ckowdxwsja .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1897,11 +1902,11 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   width: 100%;
 }
 
-#rsccxkcyhx .gt_spanner_row {
+#ckowdxwsja .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#rsccxkcyhx .gt_group_heading {
+#ckowdxwsja .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1927,7 +1932,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   text-align: left;
 }
 
-#rsccxkcyhx .gt_empty_group_heading {
+#ckowdxwsja .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1942,15 +1947,15 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   vertical-align: middle;
 }
 
-#rsccxkcyhx .gt_from_md > :first-child {
+#ckowdxwsja .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#rsccxkcyhx .gt_from_md > :last-child {
+#ckowdxwsja .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#rsccxkcyhx .gt_row {
+#ckowdxwsja .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1969,7 +1974,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   overflow-x: hidden;
 }
 
-#rsccxkcyhx .gt_stub {
+#ckowdxwsja .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1982,7 +1987,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 5px;
 }
 
-#rsccxkcyhx .gt_stub_row_group {
+#ckowdxwsja .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1996,15 +2001,15 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   vertical-align: top;
 }
 
-#rsccxkcyhx .gt_row_group_first td {
+#ckowdxwsja .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#rsccxkcyhx .gt_row_group_first th {
+#ckowdxwsja .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#rsccxkcyhx .gt_summary_row {
+#ckowdxwsja .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2014,16 +2019,16 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 5px;
 }
 
-#rsccxkcyhx .gt_first_summary_row {
+#ckowdxwsja .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_first_summary_row.thick {
+#ckowdxwsja .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#rsccxkcyhx .gt_last_summary_row {
+#ckowdxwsja .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2033,7 +2038,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-bottom-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_grand_summary_row {
+#ckowdxwsja .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2043,7 +2048,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 5px;
 }
 
-#rsccxkcyhx .gt_first_grand_summary_row {
+#ckowdxwsja .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2053,7 +2058,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-top-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_last_grand_summary_row_top {
+#ckowdxwsja .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2063,11 +2068,11 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-bottom-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_striped {
+#ckowdxwsja .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#rsccxkcyhx .gt_table_body {
+#ckowdxwsja .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2076,7 +2081,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-bottom-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_footnotes {
+#ckowdxwsja .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2090,7 +2095,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-right-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_footnote {
+#ckowdxwsja .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2099,7 +2104,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 5px;
 }
 
-#rsccxkcyhx .gt_sourcenotes {
+#ckowdxwsja .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2113,7 +2118,7 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   border-right-color: #D3D3D3;
 }
 
-#rsccxkcyhx .gt_sourcenote {
+#ckowdxwsja .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2121,63 +2126,63 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   padding-right: 5px;
 }
 
-#rsccxkcyhx .gt_left {
+#ckowdxwsja .gt_left {
   text-align: left;
 }
 
-#rsccxkcyhx .gt_center {
+#ckowdxwsja .gt_center {
   text-align: center;
 }
 
-#rsccxkcyhx .gt_right {
+#ckowdxwsja .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#rsccxkcyhx .gt_font_normal {
+#ckowdxwsja .gt_font_normal {
   font-weight: normal;
 }
 
-#rsccxkcyhx .gt_font_bold {
+#ckowdxwsja .gt_font_bold {
   font-weight: bold;
 }
 
-#rsccxkcyhx .gt_font_italic {
+#ckowdxwsja .gt_font_italic {
   font-style: italic;
 }
 
-#rsccxkcyhx .gt_super {
+#ckowdxwsja .gt_super {
   font-size: 65%;
 }
 
-#rsccxkcyhx .gt_footnote_marks {
+#ckowdxwsja .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#rsccxkcyhx .gt_asterisk {
+#ckowdxwsja .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#rsccxkcyhx .gt_indent_1 {
+#ckowdxwsja .gt_indent_1 {
   text-indent: 5px;
 }
 
-#rsccxkcyhx .gt_indent_2 {
+#ckowdxwsja .gt_indent_2 {
   text-indent: 10px;
 }
 
-#rsccxkcyhx .gt_indent_3 {
+#ckowdxwsja .gt_indent_3 {
   text-indent: 15px;
 }
 
-#rsccxkcyhx .gt_indent_4 {
+#ckowdxwsja .gt_indent_4 {
   text-indent: 20px;
 }
 
-#rsccxkcyhx .gt_indent_5 {
+#ckowdxwsja .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2210,14 +2215,14 @@ <h4>Example 4: Paired two-sample t-test<a href="c06-statistical-testing.html#sta
   
 </table>
 </div>
-<p>U.S. households set their thermostat on average 2.9<span class="math inline">\(^\circ\)</span>F warmer in summer nights than winter nights, which is statistically significant (t = 50.8, p-value = <span class="math inline">\(&lt;0.0001\)</span>).</p>
+<p>U.S. households set their thermostat on average 2.9<span class="math inline">\(^\circ\)</span>F warmer in summer nights than winter nights, which is statistically significant (t = 50.8, p-value = <span class="math inline">\(&lt;0.0001\)</span>.)</p>
 </div>
 </div>
 </div>
 <div id="stattest-chi" class="section level2 hasAnchor" number="6.4">
 <h2><span class="header-section-number">6.4</span> Chi-square tests<a href="c06-statistical-testing.html#stattest-chi" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Chi-square tests (<span class="math inline">\(\chi^2\)</span>) allow us to examine multiple proportions using a goodness-of-fit test, a test of independence, or a test of homogeneity. These three tests have the same <span class="math inline">\(\chi^2\)</span> distributions but with slightly different underlying assumptions.</p>
-<p>First, <strong>goodness-of-fit</strong> tests are used when comparing <em>observed</em> data to <em>expected</em> data. For example, this could be used to determine if respondent demographics (the observed data in the sample) match known population information (the expected data). In this case, we can set up the hypothesis test as follows:</p>
+<p>First, <strong>goodness-of-fit</strong> tests are used when comparing <em>observed</em> data to <em>expected</em> data. For example, this could be used to determine if respondent demographics (the observed data in the sample) match known population information (the expected data.) In this case, we can set up the hypothesis test as follows:</p>
 <ul>
 <li><span class="math inline">\(H_0: p_1 = \pi_1, ~ p_2 = \pi_2, ~ ..., ~ p_k = \pi_k\)</span> where <span class="math inline">\(p_i\)</span> is the observed proportion for category <span class="math inline">\(i\)</span>, <span class="math inline">\(\pi_i\)</span> is the expected proportion for category <span class="math inline">\(i\)</span>, and <span class="math inline">\(k\)</span> is the number of categories</li>
 <li><span class="math inline">\(H_A:\)</span> at least one level of <span class="math inline">\(p_i\)</span> does not match <span class="math inline">\(\pi_i\)</span></li>
@@ -2232,7 +2237,7 @@ <h2><span class="header-section-number">6.4</span> Chi-square tests<a href="c06-
 <li><span class="math inline">\(H_0: p_{1a} = p_{1b}, ~ p_{2a} = p_{2b}, ~ ..., ~ p_{ka} = p_{kb}\)</span> where <span class="math inline">\(p_{ia}\)</span> is the observed proportion of category <span class="math inline">\(i\)</span> for subgroup <span class="math inline">\(a\)</span>, <span class="math inline">\(p_{ib}\)</span> is the observed proportion of category <span class="math inline">\(i\)</span> for subgroup <span class="math inline">\(b\)</span>, and <span class="math inline">\(k\)</span> is the number of categories</li>
 <li><span class="math inline">\(H_A:\)</span> at least one category of <span class="math inline">\(p_{ia}\)</span> does not match <span class="math inline">\(p_{ib}\)</span></li>
 </ul>
-<p>As with t-tests, the difference between using <span class="math inline">\(\chi^2\)</span> tests with non-survey data and survey data is based on the underlying variance estimation. The functions in the {survey} package will account for these nuances, provided the design object is correctly defined. For basic variance estimation formulas for different survey design types, refer to Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
+<p>As with t-tests, the difference between using <span class="math inline">\(\chi^2\)</span> tests with non-survey data and survey data is based on the underlying variance estimation. The functions in the {survey} package account for these nuances, provided the design object is correctly defined. For basic variance estimation formulas for different survey design types, refer to Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
 <div id="stattest-chi-syntax" class="section level3 hasAnchor" number="6.4.1">
 <h3><span class="header-section-number">6.4.1</span> Syntax<a href="c06-statistical-testing.html#stattest-chi-syntax" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>When we do not have survey data, we may be able to use the <code>chisq.test()</code> function from the {stats} package in base R <span class="citation">(<a href="#ref-R-base">R Core Team 2023</a>)</span>. However, this function does not allow for weights or the variance structure to be accounted for with survey data. Therefore, when using survey data, we need to use one of two functions:</p>
@@ -2250,11 +2255,11 @@ <h3><span class="header-section-number">6.4.1</span> Syntax<a href="c06-statisti
 <p>The arguments are:</p>
 <ul>
 <li><code>formula</code>: Formula specifying a single factor variable</li>
-<li><code>p</code>: Vector of probabilities for the categories of the factor in the correct order. If they probabilities do not sum to 1, they will be rescaled to sum to 1.</li>
+<li><code>p</code>: Vector of probabilities for the categories of the factor in the correct order. If the probabilities do not sum to 1, they are rescaled to sum to 1.</li>
 <li><code>design</code>: Survey design object</li>
 <li>…: Other arguments to pass on, such as <code>na.rm</code></li>
 </ul>
-<p>Based on the order of the arguments, we again must use the dot <code>(.)</code> notation if we pipe in the survey design object or explicitly name the arguments as described in Section <a href="c06-statistical-testing.html#dot-notation">6.2</a>. For the goodness of fit tests, the formula will be a single variable <code>formula = ~var</code> as we compare the observed data from this variable to the expected data. The expected probabilities are then entered in the <code>p</code> argument and need to be a vector of the same length as the number of categories in the variable. For example, if we want to know if the proportion of males and females matches a distribution of 30/70, then the sex variable (with two categories) would be used <code>formula = ~SEX</code>, and the proportions would be included as <code>p = c(.3, .7)</code>. It is important to note that the variable entered into the formula should be formatted as either a factor or a character. The examples below provide more detail and tips on how to make sure the levels match up correctly.</p>
+<p>Based on the order of the arguments, we again must use the dot <code>(.)</code> notation if we pipe in the survey design object or explicitly name the arguments as described in Section <a href="c06-statistical-testing.html#dot-notation">6.2</a>. For goodness-of-fit tests, the formula is a single variable (<code>formula = ~var</code>) as we compare the observed data from this variable to the expected data. The expected probabilities are then entered in the <code>p</code> argument and need to be a vector of the same length as the number of categories in the variable. For example, if we want to know if the proportion of males and females matches a distribution of 30/70, then the sex variable (with two categories) would be used as <code>formula = ~SEX</code>, and the proportions would be included as <code>p = c(.3, .7)</code>. The variable entered into the formula should be formatted as either a factor or a character. The examples below provide more detail and tips on how to make sure the levels match up correctly.</p>
 <p>For tests of homogeneity and independence, the <code>svychisq()</code> function should be used. The syntax is as follows:</p>
 <div class="sourceCode" id="cb190"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb190-1"><a href="c06-statistical-testing.html#cb190-1" tabindex="-1"></a><span class="fu">svychisq</span>(</span>
 <span id="cb190-2"><a href="c06-statistical-testing.html#cb190-2" tabindex="-1"></a>  formula,</span>
@@ -2271,9 +2276,9 @@ <h3><span class="header-section-number">6.4.1</span> Syntax<a href="c06-statisti
 <li><code>na.rm</code>: Remove missing values</li>
 </ul>
 <p>There are six statistics that are accepted in this formula. For tests of homogeneity (when comparing cross-tabulations), the <code>F</code> or <code>Chisq</code> statistics should be used.<a href="#fn20" class="footnote-ref" id="fnref20"><sup>20</sup></a> The <code>F</code> statistic is the default and uses the Rao-Scott second-order correction. This correction is designed to assist with complicated sampling designs (i.e., those other than a simple random sample) <span class="citation">(<a href="#ref-Scott2007">Scott 2007</a>)</span>. The <code>Chisq</code> statistic is an adjusted version of the Pearson <span class="math inline">\(\chi^2\)</span> statistic. The version of this statistic in the <code>svychisq()</code> function compares the design effect estimate from the provided survey data to what the <span class="math inline">\(\chi^2\)</span> distribution would have been if the data came from a simple random sample.</p>
-<p>For tests of independence, the <code>Wald</code> and <code>adjWald</code> are recommended as they provide a better adjustment for variable comparisons <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>. If the data has a small number of primary sampling units (PSUs) compared to the degrees of freedom, then the <code>adjWald</code> statistic should be used to account for this. The <code>lincom</code> and <code>saddlepoint</code> statistics are available for more complicated data structures.</p>
-<p>The formula argument will always be one-sided, unlike the <code>svyttest()</code> function. The two variables of interest should be included with a plus sign: <code>formula = ~ var_1 + var_2</code>. As with the <code>svygofchisq()</code> function, the variables entered into the formula should be formatted as either a factor or a character.</p>
-<p>Additionally, as with the t-test function, both <code>svygofchisq()</code> and <code>svychisq()</code> have the <code>na.rm</code> argument. If any data is missing, the <span class="math inline">\(\chi^2\)</span> tests will assume that <code>NA</code> is a category and include it in the calculation. Throughout this chapter, we will always set <code>na.rm = TRUE</code>, but before analyzing the survey data, review the notes provided in Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> to better understand how to handle missing data.</p>
+<p>For tests of independence, the <code>Wald</code> and <code>adjWald</code> are recommended as they provide a better adjustment for variable comparisons <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>. If the data have a small number of primary sampling units (PSUs) compared to the degrees of freedom, then the <code>adjWald</code> statistic should be used to account for this. The <code>lincom</code> and <code>saddlepoint</code> statistics are available for more complicated data structures.</p>
+<p>The formula argument is always one-sided, unlike the <code>svyttest()</code> function. The two variables of interest should be included with a plus sign: <code>formula = ~ var_1 + var_2</code>. As with the <code>svygofchisq()</code> function, the variables entered into the formula should be formatted as either a factor or a character.</p>
+<p>Additionally, as with the t-test function, both <code>svygofchisq()</code> and <code>svychisq()</code> have the <code>na.rm</code> argument. If any data values are missing, the <span class="math inline">\(\chi^2\)</span> tests assume that <code>NA</code> is a category and include it in the calculation. Throughout this chapter, we always set <code>na.rm = TRUE</code>, but before analyzing the survey data, review the notes provided in Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> to better understand how to handle missing data.</p>
 </div>
 <div id="stattest-chi-examples" class="section level3 hasAnchor" number="6.4.2">
 <h3><span class="header-section-number">6.4.2</span> Examples<a href="c06-statistical-testing.html#stattest-chi-examples" class="anchor-section" aria-label="Anchor link to header"></a></h3>
@@ -2282,7 +2287,7 @@ <h3><span class="header-section-number">6.4.2</span> Examples<a href="c06-statis
 <h4>Example 1: Goodness of fit test<a href="c06-statistical-testing.html#stattest-chi-ex1" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>ANES asked respondents about their highest education level.<a href="#fn21" class="footnote-ref" id="fnref21"><sup>21</sup></a> Based on the data from the 2020 American Community Survey (ACS) 5-year estimates<a href="#fn22" class="footnote-ref" id="fnref22"><sup>22</sup></a>, the education distribution of those aged 18+ in the United States (among the 50 states and District of Columbia) is as follows:</p>
 <ul>
-<li>11% had less than High School degree</li>
+<li>11% had less than a High School degree</li>
 <li>27% had a High School degree</li>
 <li>29% had some college or associate’s degree</li>
 <li>33% had a bachelor’s degree or higher</li>
@@ -2305,7 +2310,7 @@ <h4>Example 1: Goodness of fit test<a href="c06-statistical-testing.html#stattes
 ## 3 Post HS      0.290  0.00713
 ## 4 Bachelor&#39;s   0.226  0.00633
 ## 5 Graduate     0.126  0.00499</code></pre>
-<p>Based on this output, we can see that we have different levels than the ACS data provides. Specifically, the education data from ANES has two levels for Bachelor’s Degree or Higher (Bachelor’s and Graduate), so these two categories need to be collapsed into a single category to match the ACS data. For this, among other methods, we can use the {forcats} package from the tidyverse <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>. The package’s <code>fct_collapse()</code> function helps us create a new variable by collapsing categories into a single one. Then, we will use the <code>svygofchisq()</code> function to compare the ANES data to the ACS data where we specify the updated design object, the formula using the collapsed education variable, the ACS estimates for education levels as p, and removing NA values.</p>
+<p>Based on this output, we can see that we have different levels from the ACS data. Specifically, the education data from ANES include two levels for Bachelor’s Degree or Higher (Bachelor’s and Graduate), so these two categories need to be collapsed into a single category to match the ACS data. For this, among other methods, we can use the {forcats} package from the tidyverse <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>. The package’s <code>fct_collapse()</code> function helps us create a new variable by collapsing categories into a single one. Then, we use the <code>svygofchisq()</code> function to compare the ANES data to the ACS data, where we specify the updated design object, the formula using the collapsed education variable, the ACS estimates for education levels as p, and removing NA values.</p>
 <div class="sourceCode" id="cb193"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb193-1"><a href="c06-statistical-testing.html#cb193-1" tabindex="-1"></a>anes_des_educ <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb193-2"><a href="c06-statistical-testing.html#cb193-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Education2 =</span></span>
 <span id="cb193-3"><a href="c06-statistical-testing.html#cb193-3" tabindex="-1"></a>           <span class="fu">fct_collapse</span>(Education,</span>
@@ -2338,7 +2343,7 @@ <h4>Example 1: Goodness of fit test<a href="c06-statistical-testing.html#stattes
 ## data:  ~Education2
 ## X-squared = 2172220, scale = 1.1e+05, df = 2.3e+00, p-value =
 ## 9e-05</code></pre>
-<p>The output from the <code>svygofchisq()</code> indicates that at least one proportion from ANES does not match the ACS data (<span class="math inline">\(\chi^2 =\)</span> 2,172,220; p-value &lt;0.0001). To get a better idea of the differences, we can use the <code>expected</code> output along with <code>survey_mean()</code> to create a comparison table:</p>
+<p>The output from the <code>svygofchisq()</code> indicates that at least one proportion from ANES does not match the ACS data (<span class="math inline">\(\chi^2 =\)</span> 2,172,220; p-value &lt;0.0001.) To get a better idea of the differences, we can use the <code>expected</code> output along with <code>survey_mean()</code> to create a comparison table:</p>
 <div class="sourceCode" id="cb197"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb197-1"><a href="c06-statistical-testing.html#cb197-1" tabindex="-1"></a>ex1_table <span class="ot">&lt;-</span> anes_des_educ <span class="sc">%&gt;%</span></span>
 <span id="cb197-2"><a href="c06-statistical-testing.html#cb197-2" tabindex="-1"></a>  <span class="fu">drop_na</span>(Education2) <span class="sc">%&gt;%</span></span>
 <span id="cb197-3"><a href="c06-statistical-testing.html#cb197-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Education2) <span class="sc">%&gt;%</span></span>
@@ -2355,7 +2360,7 @@ <h4>Example 1: Goodness of fit test<a href="c06-statistical-testing.html#stattes
 ## 2 High school            0.27   0.277        0.257        0.298 
 ## 3 Post HS                0.29   0.290        0.276        0.305 
 ## 4 Bachelor or Higher     0.33   0.352        0.337        0.367</code></pre>
-<p>This output includes our expected proportions from the ACS that we provided the <code>svygofchisq()</code> function along with the output of the observed proportions and their confidence intervals. This table shows that the “High school” and “Post HS” categories have nearly identical proportions but that the other two categories are slightly different. Looking at the confidence intervals, we can see that the ANES data skews to include fewer people in the “Less than HS” category and more people in the “Bachelor or Higher” category. This may be easier to see if we plot this. The code below uses the tabular output to create Figure <a href="c06-statistical-testing.html#fig:stattest-chi-ex1-graph">6.1</a>.</p>
+<p>This output includes our expected proportions from the ACS that we provided to the <code>svygofchisq()</code> function along with the output of the observed proportions and their confidence intervals. This table shows that the “High school” and “Post HS” categories have nearly identical proportions but that the other two categories are slightly different. Looking at the confidence intervals, we can see that the ANES data skew to include fewer people in the “Less than HS” category and more people in the “Bachelor or Higher” category. This may be easier to see if we plot this. The code below uses the tabular output to create Figure <a href="c06-statistical-testing.html#fig:stattest-chi-ex1-graph">6.1</a>.</p>
 <div class="sourceCode" id="cb199"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb199-1"><a href="c06-statistical-testing.html#cb199-1" tabindex="-1"></a>ex1_table <span class="sc">%&gt;%</span></span>
 <span id="cb199-2"><a href="c06-statistical-testing.html#cb199-2" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(</span>
 <span id="cb199-3"><a href="c06-statistical-testing.html#cb199-3" tabindex="-1"></a>    <span class="at">cols =</span> <span class="fu">c</span>(<span class="st">&quot;Expected&quot;</span>, <span class="st">&quot;Observed&quot;</span>),</span>
@@ -2365,18 +2370,20 @@ <h4>Example 1: Goodness of fit test<a href="c06-statistical-testing.html#stattes
 <span id="cb199-7"><a href="c06-statistical-testing.html#cb199-7" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
 <span id="cb199-8"><a href="c06-statistical-testing.html#cb199-8" tabindex="-1"></a>    <span class="at">Observed_low =</span> <span class="fu">if_else</span>(Names <span class="sc">==</span> <span class="st">&quot;Observed&quot;</span>, Observed_low, <span class="cn">NA_real_</span>),</span>
 <span id="cb199-9"><a href="c06-statistical-testing.html#cb199-9" tabindex="-1"></a>    <span class="at">Observed_upp =</span> <span class="fu">if_else</span>(Names <span class="sc">==</span> <span class="st">&quot;Observed&quot;</span>, Observed_upp, <span class="cn">NA_real_</span>),</span>
-<span id="cb199-10"><a href="c06-statistical-testing.html#cb199-10" tabindex="-1"></a>    <span class="at">Names =</span> <span class="fu">if_else</span>(Names <span class="sc">==</span> <span class="st">&quot;Observed&quot;</span>, <span class="st">&quot;ANES (observed)&quot;</span>, <span class="st">&quot;ACS (expected)&quot;</span>)</span>
-<span id="cb199-11"><a href="c06-statistical-testing.html#cb199-11" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb199-12"><a href="c06-statistical-testing.html#cb199-12" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> Education, <span class="at">y =</span> Proportion, <span class="at">color =</span> Names)) <span class="sc">+</span></span>
-<span id="cb199-13"><a href="c06-statistical-testing.html#cb199-13" tabindex="-1"></a>  <span class="fu">geom_point</span>(<span class="at">alpha =</span> <span class="fl">0.75</span>, <span class="at">size =</span> <span class="dv">2</span>) <span class="sc">+</span></span>
-<span id="cb199-14"><a href="c06-statistical-testing.html#cb199-14" tabindex="-1"></a>  <span class="fu">geom_errorbar</span>(<span class="fu">aes</span>(<span class="at">ymin =</span> Observed_low, <span class="at">ymax =</span> Observed_upp), <span class="at">width =</span> <span class="fl">0.25</span>) <span class="sc">+</span></span>
-<span id="cb199-15"><a href="c06-statistical-testing.html#cb199-15" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
-<span id="cb199-16"><a href="c06-statistical-testing.html#cb199-16" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">name =</span> <span class="st">&quot;Type&quot;</span>, <span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">4</span>, <span class="dv">1</span>)]) <span class="sc">+</span></span>
-<span id="cb199-17"><a href="c06-statistical-testing.html#cb199-17" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;bottom&quot;</span>, <span class="at">legend.title =</span> <span class="fu">element_blank</span>())</span></code></pre></div>
+<span id="cb199-10"><a href="c06-statistical-testing.html#cb199-10" tabindex="-1"></a>    <span class="at">Names =</span> <span class="fu">if_else</span>(Names <span class="sc">==</span> <span class="st">&quot;Observed&quot;</span>,</span>
+<span id="cb199-11"><a href="c06-statistical-testing.html#cb199-11" tabindex="-1"></a>                    <span class="st">&quot;ANES (observed)&quot;</span>, <span class="st">&quot;ACS (expected)&quot;</span>)</span>
+<span id="cb199-12"><a href="c06-statistical-testing.html#cb199-12" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb199-13"><a href="c06-statistical-testing.html#cb199-13" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> Education, <span class="at">y =</span> Proportion, <span class="at">color =</span> Names)) <span class="sc">+</span></span>
+<span id="cb199-14"><a href="c06-statistical-testing.html#cb199-14" tabindex="-1"></a>  <span class="fu">geom_point</span>(<span class="at">alpha =</span> <span class="fl">0.75</span>, <span class="at">size =</span> <span class="dv">2</span>) <span class="sc">+</span></span>
+<span id="cb199-15"><a href="c06-statistical-testing.html#cb199-15" tabindex="-1"></a>  <span class="fu">geom_errorbar</span>(<span class="fu">aes</span>(<span class="at">ymin =</span> Observed_low, <span class="at">ymax =</span> Observed_upp), </span>
+<span id="cb199-16"><a href="c06-statistical-testing.html#cb199-16" tabindex="-1"></a>                <span class="at">width =</span> <span class="fl">0.25</span>) <span class="sc">+</span></span>
+<span id="cb199-17"><a href="c06-statistical-testing.html#cb199-17" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
+<span id="cb199-18"><a href="c06-statistical-testing.html#cb199-18" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">name =</span> <span class="st">&quot;Type&quot;</span>, <span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">4</span>, <span class="dv">1</span>)]) <span class="sc">+</span></span>
+<span id="cb199-19"><a href="c06-statistical-testing.html#cb199-19" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">&quot;bottom&quot;</span>, <span class="at">legend.title =</span> <span class="fu">element_blank</span>())</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:stattest-chi-ex1-graph"></span>
 <img src="bookdown_files/figure-html/stattest-chi-ex1-graph-1.png" alt="Expected and observed proportions of education, showing the confidence intervals for the observed proportions and whether the expected proportions lie within them. The x-axis has labels 'Less than HS', 'High school', 'Post HS', and 'Bachelor or Higher'. The only categories where the expected proportion is outside of the intervals are 'Less than HS' and 'Bachelor or Higher'." width="672" />
 <p class="caption">
-FIGURE 6.1: Expected and observed proportions of education, showing the confidence intervals for the expected proportions and whether the observed proportions lie within them.
+FIGURE 6.1: Expected and observed proportions of education with confidence intervals
 </p>
 </div>
 </div>
@@ -2423,7 +2430,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
 ##   About half the time          428.871   65.024
 ##   Some of the time             932.628   89.596
 ##   Never                        217.994  189.307</code></pre>
-<p>However, as researchers, we often want to know about the proportions and not just the respondent counts from the survey. There are a couple of different ways that we can do this. The first is using the counts from <code>chi_ex2$observed</code> to calculate the proportion. We can then pivot the table to create a cross-tabulation similar to the counts table above. Adding <code>group_by()</code> to the code means that we are obtaining the proportions within each level of that variable. In this case, we are looking at the distribution of <code>TrustGovernment</code> for each level of <code>TrustPeople</code>. The resulting table is shown in Table <a href="c06-statistical-testing.html#tab:stattest-chi-ex2-prop1-tab">6.4</a>.</p>
+<p>However, we often want to know about the proportions, not just the respondent counts from the survey. There are a couple of different ways that we can do this. The first is using the counts from <code>chi_ex2$observed</code> to calculate the proportion. We can then pivot the table to create a cross-tabulation similar to the counts table above. Adding <code>group_by()</code> to the code means that we obtain the proportions within each variable level. In this case, we are looking at the distribution of <code>TrustGovernment</code> for each level of <code>TrustPeople</code>. The resulting table is shown in Table <a href="c06-statistical-testing.html#tab:stattest-chi-ex2-prop1-tab">6.4</a>.</p>
 <div class="sourceCode" id="cb204"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb204-1"><a href="c06-statistical-testing.html#cb204-1" tabindex="-1"></a>chi_ex2_table<span class="ot">&lt;-</span>chi_ex2<span class="sc">$</span>observed <span class="sc">%&gt;%</span> </span>
 <span id="cb204-2"><a href="c06-statistical-testing.html#cb204-2" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb204-3"><a href="c06-statistical-testing.html#cb204-3" tabindex="-1"></a>  <span class="fu">group_by</span>(TrustPeople) <span class="sc">%&gt;%</span></span>
@@ -2439,23 +2446,23 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
 <span id="cb204-13"><a href="c06-statistical-testing.html#cb204-13" tabindex="-1"></a>             <span class="st">`</span><span class="at">Some of the time</span><span class="st">`</span> <span class="ot">=</span> <span class="fu">md</span>(<span class="st">&quot;Some of&lt;br /&gt;the time&quot;</span>))</span></code></pre></div>
 <div class="sourceCode" id="cb205"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb205-1"><a href="c06-statistical-testing.html#cb205-1" tabindex="-1"></a>chi_ex2_table</span></code></pre></div>
 
-<div id="tntohixzez" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#tntohixzez table {
+<div id="vxngdmvipb" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#vxngdmvipb table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#tntohixzez thead, #tntohixzez tbody, #tntohixzez tfoot, #tntohixzez tr, #tntohixzez td, #tntohixzez th {
+#vxngdmvipb thead, #vxngdmvipb tbody, #vxngdmvipb tfoot, #vxngdmvipb tr, #vxngdmvipb td, #vxngdmvipb th {
   border-style: none;
 }
 
-#tntohixzez p {
+#vxngdmvipb p {
   margin: 0;
   padding: 0;
 }
 
-#tntohixzez .gt_table {
+#vxngdmvipb .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2481,12 +2488,12 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-left-color: #D3D3D3;
 }
 
-#tntohixzez .gt_caption {
+#vxngdmvipb .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#tntohixzez .gt_title {
+#vxngdmvipb .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2498,7 +2505,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-width: 0;
 }
 
-#tntohixzez .gt_subtitle {
+#vxngdmvipb .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2510,7 +2517,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-top-width: 0;
 }
 
-#tntohixzez .gt_heading {
+#vxngdmvipb .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2522,13 +2529,13 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#tntohixzez .gt_bottom_border {
+#vxngdmvipb .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#tntohixzez .gt_col_headings {
+#vxngdmvipb .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2543,7 +2550,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#tntohixzez .gt_col_heading {
+#vxngdmvipb .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2563,7 +2570,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   overflow-x: hidden;
 }
 
-#tntohixzez .gt_column_spanner_outer {
+#vxngdmvipb .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2575,15 +2582,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 4px;
 }
 
-#tntohixzez .gt_column_spanner_outer:first-child {
+#vxngdmvipb .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#tntohixzez .gt_column_spanner_outer:last-child {
+#vxngdmvipb .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#tntohixzez .gt_column_spanner {
+#vxngdmvipb .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2595,11 +2602,11 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   width: 100%;
 }
 
-#tntohixzez .gt_spanner_row {
+#vxngdmvipb .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#tntohixzez .gt_group_heading {
+#vxngdmvipb .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2625,7 +2632,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   text-align: left;
 }
 
-#tntohixzez .gt_empty_group_heading {
+#vxngdmvipb .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2640,15 +2647,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   vertical-align: middle;
 }
 
-#tntohixzez .gt_from_md > :first-child {
+#vxngdmvipb .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#tntohixzez .gt_from_md > :last-child {
+#vxngdmvipb .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#tntohixzez .gt_row {
+#vxngdmvipb .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2667,7 +2674,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   overflow-x: hidden;
 }
 
-#tntohixzez .gt_stub {
+#vxngdmvipb .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2680,7 +2687,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#tntohixzez .gt_stub_row_group {
+#vxngdmvipb .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2694,15 +2701,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   vertical-align: top;
 }
 
-#tntohixzez .gt_row_group_first td {
+#vxngdmvipb .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#tntohixzez .gt_row_group_first th {
+#vxngdmvipb .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#tntohixzez .gt_summary_row {
+#vxngdmvipb .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2712,16 +2719,16 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#tntohixzez .gt_first_summary_row {
+#vxngdmvipb .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#tntohixzez .gt_first_summary_row.thick {
+#vxngdmvipb .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#tntohixzez .gt_last_summary_row {
+#vxngdmvipb .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2731,7 +2738,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#tntohixzez .gt_grand_summary_row {
+#vxngdmvipb .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2741,7 +2748,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#tntohixzez .gt_first_grand_summary_row {
+#vxngdmvipb .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2751,7 +2758,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-top-color: #D3D3D3;
 }
 
-#tntohixzez .gt_last_grand_summary_row_top {
+#vxngdmvipb .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2761,11 +2768,11 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#tntohixzez .gt_striped {
+#vxngdmvipb .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#tntohixzez .gt_table_body {
+#vxngdmvipb .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2774,7 +2781,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#tntohixzez .gt_footnotes {
+#vxngdmvipb .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2788,7 +2795,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#tntohixzez .gt_footnote {
+#vxngdmvipb .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2797,7 +2804,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#tntohixzez .gt_sourcenotes {
+#vxngdmvipb .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2811,7 +2818,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#tntohixzez .gt_sourcenote {
+#vxngdmvipb .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2819,63 +2826,63 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#tntohixzez .gt_left {
+#vxngdmvipb .gt_left {
   text-align: left;
 }
 
-#tntohixzez .gt_center {
+#vxngdmvipb .gt_center {
   text-align: center;
 }
 
-#tntohixzez .gt_right {
+#vxngdmvipb .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#tntohixzez .gt_font_normal {
+#vxngdmvipb .gt_font_normal {
   font-weight: normal;
 }
 
-#tntohixzez .gt_font_bold {
+#vxngdmvipb .gt_font_bold {
   font-weight: bold;
 }
 
-#tntohixzez .gt_font_italic {
+#vxngdmvipb .gt_font_italic {
   font-style: italic;
 }
 
-#tntohixzez .gt_super {
+#vxngdmvipb .gt_super {
   font-size: 65%;
 }
 
-#tntohixzez .gt_footnote_marks {
+#vxngdmvipb .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#tntohixzez .gt_asterisk {
+#vxngdmvipb .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#tntohixzez .gt_indent_1 {
+#vxngdmvipb .gt_indent_1 {
   text-indent: 5px;
 }
 
-#tntohixzez .gt_indent_2 {
+#vxngdmvipb .gt_indent_2 {
   text-indent: 10px;
 }
 
-#tntohixzez .gt_indent_3 {
+#vxngdmvipb .gt_indent_3 {
   text-indent: 15px;
 }
 
-#tntohixzez .gt_indent_4 {
+#vxngdmvipb .gt_indent_4 {
   text-indent: 20px;
 }
 
-#tntohixzez .gt_indent_5 {
+#vxngdmvipb .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2933,8 +2940,8 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   
 </table>
 </div>
-<p>In Table <a href="c06-statistical-testing.html#tab:stattest-chi-ex2-prop1-tab">6.4</a>, each column sums to 1. For example, we can say that it is estimated that of people who always trust in people, 27.7% also always trust in government based on the top-left cell but 5.3% never trust in government.</p>
-<p>The second option is to use <code>group_by()</code> and <code>survey_mean()</code> functions to calculate the proportions from the ANES design object. A reminder that with more than one variable listed in the <code>group_by()</code> statement, the proportions are within the first variable listed. As mentioned above, we are looking at the distribution of <code>TrustGovernment</code> for each level of <code>TrustPeople</code>.</p>
+<p>In Table <a href="c06-statistical-testing.html#tab:stattest-chi-ex2-prop1-tab">6.4</a>, each column sums to 1. For example, based on the top-left cell, an estimated 27.7% of people who always trust in people also always trust in the government, while 5.3% never trust in the government.</p>
+<p>The second option is to use the <code>group_by()</code> and <code>survey_mean()</code> functions to calculate the proportions from the ANES design object. Remember that with more than one variable listed in the <code>group_by()</code> statement, the proportions are within the first variable listed. As mentioned above, we are looking at the distribution of <code>TrustGovernment</code> for each level of <code>TrustPeople</code>.</p>
 <div class="sourceCode" id="cb206"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb206-1"><a href="c06-statistical-testing.html#cb206-1" tabindex="-1"></a>chi_ex2_obs <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb206-2"><a href="c06-statistical-testing.html#cb206-2" tabindex="-1"></a>  <span class="fu">drop_na</span>(TrustPeople, TrustGovernment) <span class="sc">%&gt;%</span></span>
 <span id="cb206-3"><a href="c06-statistical-testing.html#cb206-3" tabindex="-1"></a>  <span class="fu">group_by</span>(TrustPeople, TrustGovernment) <span class="sc">%&gt;%</span></span>
@@ -2953,23 +2960,23 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
 <span id="cb206-16"><a href="c06-statistical-testing.html#cb206-16" tabindex="-1"></a>  <span class="fu">tab_options</span>(<span class="at">page.orientation =</span> <span class="st">&quot;landscape&quot;</span>)</span></code></pre></div>
 <div class="sourceCode" id="cb207"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb207-1"><a href="c06-statistical-testing.html#cb207-1" tabindex="-1"></a>chi_ex2_obs_table</span></code></pre></div>
 
-<div id="vpfhdsqoxw" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#vpfhdsqoxw table {
+<div id="ddwtqmsbxc" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#ddwtqmsbxc table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#vpfhdsqoxw thead, #vpfhdsqoxw tbody, #vpfhdsqoxw tfoot, #vpfhdsqoxw tr, #vpfhdsqoxw td, #vpfhdsqoxw th {
+#ddwtqmsbxc thead, #ddwtqmsbxc tbody, #ddwtqmsbxc tfoot, #ddwtqmsbxc tr, #ddwtqmsbxc td, #ddwtqmsbxc th {
   border-style: none;
 }
 
-#vpfhdsqoxw p {
+#ddwtqmsbxc p {
   margin: 0;
   padding: 0;
 }
 
-#vpfhdsqoxw .gt_table {
+#ddwtqmsbxc .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2995,12 +3002,12 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-left-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_caption {
+#ddwtqmsbxc .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#vpfhdsqoxw .gt_title {
+#ddwtqmsbxc .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3012,7 +3019,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-width: 0;
 }
 
-#vpfhdsqoxw .gt_subtitle {
+#ddwtqmsbxc .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3024,7 +3031,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-top-width: 0;
 }
 
-#vpfhdsqoxw .gt_heading {
+#ddwtqmsbxc .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3036,13 +3043,13 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_bottom_border {
+#ddwtqmsbxc .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_col_headings {
+#ddwtqmsbxc .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3057,7 +3064,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_col_heading {
+#ddwtqmsbxc .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3077,7 +3084,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   overflow-x: hidden;
 }
 
-#vpfhdsqoxw .gt_column_spanner_outer {
+#ddwtqmsbxc .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3089,15 +3096,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 4px;
 }
 
-#vpfhdsqoxw .gt_column_spanner_outer:first-child {
+#ddwtqmsbxc .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#vpfhdsqoxw .gt_column_spanner_outer:last-child {
+#ddwtqmsbxc .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#vpfhdsqoxw .gt_column_spanner {
+#ddwtqmsbxc .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3109,11 +3116,11 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   width: 100%;
 }
 
-#vpfhdsqoxw .gt_spanner_row {
+#ddwtqmsbxc .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#vpfhdsqoxw .gt_group_heading {
+#ddwtqmsbxc .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3139,7 +3146,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   text-align: left;
 }
 
-#vpfhdsqoxw .gt_empty_group_heading {
+#ddwtqmsbxc .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3154,15 +3161,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   vertical-align: middle;
 }
 
-#vpfhdsqoxw .gt_from_md > :first-child {
+#ddwtqmsbxc .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#vpfhdsqoxw .gt_from_md > :last-child {
+#ddwtqmsbxc .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#vpfhdsqoxw .gt_row {
+#ddwtqmsbxc .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3181,7 +3188,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   overflow-x: hidden;
 }
 
-#vpfhdsqoxw .gt_stub {
+#ddwtqmsbxc .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3194,7 +3201,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#vpfhdsqoxw .gt_stub_row_group {
+#ddwtqmsbxc .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3208,15 +3215,15 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   vertical-align: top;
 }
 
-#vpfhdsqoxw .gt_row_group_first td {
+#ddwtqmsbxc .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#vpfhdsqoxw .gt_row_group_first th {
+#ddwtqmsbxc .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#vpfhdsqoxw .gt_summary_row {
+#ddwtqmsbxc .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3226,16 +3233,16 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#vpfhdsqoxw .gt_first_summary_row {
+#ddwtqmsbxc .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_first_summary_row.thick {
+#ddwtqmsbxc .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#vpfhdsqoxw .gt_last_summary_row {
+#ddwtqmsbxc .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3245,7 +3252,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_grand_summary_row {
+#ddwtqmsbxc .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3255,7 +3262,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#vpfhdsqoxw .gt_first_grand_summary_row {
+#ddwtqmsbxc .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3265,7 +3272,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-top-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_last_grand_summary_row_top {
+#ddwtqmsbxc .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3275,11 +3282,11 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_striped {
+#ddwtqmsbxc .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#vpfhdsqoxw .gt_table_body {
+#ddwtqmsbxc .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3288,7 +3295,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-bottom-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_footnotes {
+#ddwtqmsbxc .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3302,7 +3309,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_footnote {
+#ddwtqmsbxc .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3311,7 +3318,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#vpfhdsqoxw .gt_sourcenotes {
+#ddwtqmsbxc .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3325,7 +3332,7 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   border-right-color: #D3D3D3;
 }
 
-#vpfhdsqoxw .gt_sourcenote {
+#ddwtqmsbxc .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3333,63 +3340,63 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   padding-right: 5px;
 }
 
-#vpfhdsqoxw .gt_left {
+#ddwtqmsbxc .gt_left {
   text-align: left;
 }
 
-#vpfhdsqoxw .gt_center {
+#ddwtqmsbxc .gt_center {
   text-align: center;
 }
 
-#vpfhdsqoxw .gt_right {
+#ddwtqmsbxc .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#vpfhdsqoxw .gt_font_normal {
+#ddwtqmsbxc .gt_font_normal {
   font-weight: normal;
 }
 
-#vpfhdsqoxw .gt_font_bold {
+#ddwtqmsbxc .gt_font_bold {
   font-weight: bold;
 }
 
-#vpfhdsqoxw .gt_font_italic {
+#ddwtqmsbxc .gt_font_italic {
   font-style: italic;
 }
 
-#vpfhdsqoxw .gt_super {
+#ddwtqmsbxc .gt_super {
   font-size: 65%;
 }
 
-#vpfhdsqoxw .gt_footnote_marks {
+#ddwtqmsbxc .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#vpfhdsqoxw .gt_asterisk {
+#ddwtqmsbxc .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#vpfhdsqoxw .gt_indent_1 {
+#ddwtqmsbxc .gt_indent_1 {
   text-indent: 5px;
 }
 
-#vpfhdsqoxw .gt_indent_2 {
+#ddwtqmsbxc .gt_indent_2 {
   text-indent: 10px;
 }
 
-#vpfhdsqoxw .gt_indent_3 {
+#ddwtqmsbxc .gt_indent_3 {
   text-indent: 15px;
 }
 
-#vpfhdsqoxw .gt_indent_4 {
+#ddwtqmsbxc .gt_indent_4 {
   text-indent: 20px;
 }
 
-#vpfhdsqoxw .gt_indent_5 {
+#ddwtqmsbxc .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -3447,24 +3454,26 @@ <h4>Example 2: Test of independence<a href="c06-statistical-testing.html#stattes
   
 </table>
 </div>
-<p>Both methods produce the same output as the <code>svychisq()</code> function does account for the survey design. However, calculating the proportions directly from the design object means we can also obtain the variance information. In this case, the table output displays the survey estimate followed by the confidence intervals. Based on the output, we can see that of those who never trust people, 50.3% also never trust the government, while the proportions of never trusting the government are much lower for each of the other levels of trusting people.</p>
-<p>We may find it easier to look at these proportions graphically. We can use <code>ggplot()</code> and facets to provide an overview as shown below to create Figure <a href="c06-statistical-testing.html#fig:stattest-chi-ex2-graph">6.2</a>:</p>
+<p>Both methods produce the same proportions because the <code>svychisq()</code> function accounts for the survey design. However, calculating the proportions directly from the design object also gives us the variance information. In this case, the table output displays each survey estimate followed by its confidence interval. Based on the output, we can see that of those who never trust people, 50.3% also never trust the government, while the proportion who never trust the government is much lower for each of the other levels of trust in people.</p>
+<p>We may find it easier to look at these proportions graphically. We can use <code>ggplot()</code> with facets to create the overview shown in Figure <a href="c06-statistical-testing.html#fig:stattest-chi-ex2-graph">6.2</a> below:</p>
 <div class="sourceCode" id="cb208"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb208-1"><a href="c06-statistical-testing.html#cb208-1" tabindex="-1"></a>chi_ex2_obs <span class="sc">%&gt;%</span></span>
 <span id="cb208-2"><a href="c06-statistical-testing.html#cb208-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">TrustPeople=</span></span>
 <span id="cb208-3"><a href="c06-statistical-testing.html#cb208-3" tabindex="-1"></a>           <span class="fu">fct_reorder</span>(<span class="fu">str_c</span>(<span class="st">&quot;Trust in People:</span><span class="sc">\n</span><span class="st">&quot;</span>, TrustPeople), </span>
 <span id="cb208-4"><a href="c06-statistical-testing.html#cb208-4" tabindex="-1"></a>                       <span class="fu">order</span>(TrustPeople))) <span class="sc">%&gt;%</span></span>
-<span id="cb208-5"><a href="c06-statistical-testing.html#cb208-5" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> TrustGovernment, <span class="at">y =</span> Observed, <span class="at">color =</span> TrustGovernment)) <span class="sc">+</span></span>
-<span id="cb208-6"><a href="c06-statistical-testing.html#cb208-6" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> TrustPeople, <span class="at">ncol =</span> <span class="dv">5</span>) <span class="sc">+</span></span>
-<span id="cb208-7"><a href="c06-statistical-testing.html#cb208-7" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
-<span id="cb208-8"><a href="c06-statistical-testing.html#cb208-8" tabindex="-1"></a>  <span class="fu">geom_errorbar</span>(<span class="fu">aes</span>(<span class="at">ymin =</span> Observed_low, <span class="at">ymax =</span> Observed_upp)) <span class="sc">+</span></span>
-<span id="cb208-9"><a href="c06-statistical-testing.html#cb208-9" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Proportion&quot;</span>) <span class="sc">+</span></span>
-<span id="cb208-10"><a href="c06-statistical-testing.html#cb208-10" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
-<span id="cb208-11"><a href="c06-statistical-testing.html#cb208-11" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
-<span id="cb208-12"><a href="c06-statistical-testing.html#cb208-12" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">name=</span><span class="st">&quot;Trust in Government&quot;</span>, <span class="at">values=</span>book_colors) <span class="sc">+</span></span>
-<span id="cb208-13"><a href="c06-statistical-testing.html#cb208-13" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_blank</span>(), </span>
-<span id="cb208-14"><a href="c06-statistical-testing.html#cb208-14" tabindex="-1"></a>        <span class="at">axis.ticks.x =</span> <span class="fu">element_blank</span>(),</span>
-<span id="cb208-15"><a href="c06-statistical-testing.html#cb208-15" tabindex="-1"></a>        <span class="at">legend.position =</span> <span class="st">&quot;bottom&quot;</span>) <span class="sc">+</span></span>
-<span id="cb208-16"><a href="c06-statistical-testing.html#cb208-16" tabindex="-1"></a>  <span class="fu">guides</span>(<span class="at">col =</span> <span class="fu">guide_legend</span>(<span class="at">nrow=</span><span class="dv">2</span>))</span></code></pre></div>
+<span id="cb208-5"><a href="c06-statistical-testing.html#cb208-5" tabindex="-1"></a>  <span class="fu">ggplot</span>(</span>
+<span id="cb208-6"><a href="c06-statistical-testing.html#cb208-6" tabindex="-1"></a>    <span class="fu">aes</span>(<span class="at">x =</span> TrustGovernment, <span class="at">y =</span> Observed, <span class="at">color =</span> TrustGovernment)) <span class="sc">+</span></span>
+<span id="cb208-7"><a href="c06-statistical-testing.html#cb208-7" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> TrustPeople, <span class="at">ncol =</span> <span class="dv">5</span>) <span class="sc">+</span></span>
+<span id="cb208-8"><a href="c06-statistical-testing.html#cb208-8" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
+<span id="cb208-9"><a href="c06-statistical-testing.html#cb208-9" tabindex="-1"></a>  <span class="fu">geom_errorbar</span>(<span class="fu">aes</span>(<span class="at">ymin =</span> Observed_low, <span class="at">ymax =</span> Observed_upp)) <span class="sc">+</span></span>
+<span id="cb208-10"><a href="c06-statistical-testing.html#cb208-10" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Proportion&quot;</span>) <span class="sc">+</span></span>
+<span id="cb208-11"><a href="c06-statistical-testing.html#cb208-11" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;&quot;</span>) <span class="sc">+</span></span>
+<span id="cb208-12"><a href="c06-statistical-testing.html#cb208-12" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
+<span id="cb208-13"><a href="c06-statistical-testing.html#cb208-13" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">name=</span><span class="st">&quot;Trust in Government&quot;</span>, </span>
+<span id="cb208-14"><a href="c06-statistical-testing.html#cb208-14" tabindex="-1"></a>                     <span class="at">values=</span>book_colors) <span class="sc">+</span></span>
+<span id="cb208-15"><a href="c06-statistical-testing.html#cb208-15" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_blank</span>(), </span>
+<span id="cb208-16"><a href="c06-statistical-testing.html#cb208-16" tabindex="-1"></a>        <span class="at">axis.ticks.x =</span> <span class="fu">element_blank</span>(),</span>
+<span id="cb208-17"><a href="c06-statistical-testing.html#cb208-17" tabindex="-1"></a>        <span class="at">legend.position =</span> <span class="st">&quot;bottom&quot;</span>) <span class="sc">+</span></span>
+<span id="cb208-18"><a href="c06-statistical-testing.html#cb208-18" tabindex="-1"></a>  <span class="fu">guides</span>(<span class="at">col =</span> <span class="fu">guide_legend</span>(<span class="at">nrow=</span><span class="dv">2</span>))</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:stattest-chi-ex2-graph"></span>
 <img src="bookdown_files/figure-html/stattest-chi-ex2-graph-1.png" alt="Proportion of adults in the U.S. by levels of trust in people and government with confidence intervals, ANES 2020. This presents the same information as the previous table in graphical form." width="672" />
 <p class="caption">
@@ -3521,23 +3530,23 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
 <span id="cb211-14"><a href="c06-statistical-testing.html#cb211-14" tabindex="-1"></a>  <span class="fu">tab_stubhead</span>(<span class="at">label =</span> <span class="st">&quot;Age Group&quot;</span>)</span></code></pre></div>
 <div class="sourceCode" id="cb212"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb212-1"><a href="c06-statistical-testing.html#cb212-1" tabindex="-1"></a>chi_ex3_obs_table</span></code></pre></div>
 
-<div id="quoubnyrtk" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#quoubnyrtk table {
+<div id="zbxwhiitju" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#zbxwhiitju table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#quoubnyrtk thead, #quoubnyrtk tbody, #quoubnyrtk tfoot, #quoubnyrtk tr, #quoubnyrtk td, #quoubnyrtk th {
+#zbxwhiitju thead, #zbxwhiitju tbody, #zbxwhiitju tfoot, #zbxwhiitju tr, #zbxwhiitju td, #zbxwhiitju th {
   border-style: none;
 }
 
-#quoubnyrtk p {
+#zbxwhiitju p {
   margin: 0;
   padding: 0;
 }
 
-#quoubnyrtk .gt_table {
+#zbxwhiitju .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -3563,12 +3572,12 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-left-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_caption {
+#zbxwhiitju .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#quoubnyrtk .gt_title {
+#zbxwhiitju .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3580,7 +3589,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-bottom-width: 0;
 }
 
-#quoubnyrtk .gt_subtitle {
+#zbxwhiitju .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3592,7 +3601,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-top-width: 0;
 }
 
-#quoubnyrtk .gt_heading {
+#zbxwhiitju .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3604,13 +3613,13 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-right-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_bottom_border {
+#zbxwhiitju .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_col_headings {
+#zbxwhiitju .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3625,7 +3634,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-right-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_col_heading {
+#zbxwhiitju .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3645,7 +3654,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   overflow-x: hidden;
 }
 
-#quoubnyrtk .gt_column_spanner_outer {
+#zbxwhiitju .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3657,15 +3666,15 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 4px;
 }
 
-#quoubnyrtk .gt_column_spanner_outer:first-child {
+#zbxwhiitju .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#quoubnyrtk .gt_column_spanner_outer:last-child {
+#zbxwhiitju .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#quoubnyrtk .gt_column_spanner {
+#zbxwhiitju .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3677,11 +3686,11 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   width: 100%;
 }
 
-#quoubnyrtk .gt_spanner_row {
+#zbxwhiitju .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#quoubnyrtk .gt_group_heading {
+#zbxwhiitju .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3707,7 +3716,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   text-align: left;
 }
 
-#quoubnyrtk .gt_empty_group_heading {
+#zbxwhiitju .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3722,15 +3731,15 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   vertical-align: middle;
 }
 
-#quoubnyrtk .gt_from_md > :first-child {
+#zbxwhiitju .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#quoubnyrtk .gt_from_md > :last-child {
+#zbxwhiitju .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#quoubnyrtk .gt_row {
+#zbxwhiitju .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3749,7 +3758,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   overflow-x: hidden;
 }
 
-#quoubnyrtk .gt_stub {
+#zbxwhiitju .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3762,7 +3771,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 5px;
 }
 
-#quoubnyrtk .gt_stub_row_group {
+#zbxwhiitju .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3776,15 +3785,15 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   vertical-align: top;
 }
 
-#quoubnyrtk .gt_row_group_first td {
+#zbxwhiitju .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#quoubnyrtk .gt_row_group_first th {
+#zbxwhiitju .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#quoubnyrtk .gt_summary_row {
+#zbxwhiitju .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3794,16 +3803,16 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 5px;
 }
 
-#quoubnyrtk .gt_first_summary_row {
+#zbxwhiitju .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_first_summary_row.thick {
+#zbxwhiitju .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#quoubnyrtk .gt_last_summary_row {
+#zbxwhiitju .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3813,7 +3822,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-bottom-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_grand_summary_row {
+#zbxwhiitju .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3823,7 +3832,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 5px;
 }
 
-#quoubnyrtk .gt_first_grand_summary_row {
+#zbxwhiitju .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3833,7 +3842,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-top-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_last_grand_summary_row_top {
+#zbxwhiitju .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3843,11 +3852,11 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-bottom-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_striped {
+#zbxwhiitju .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#quoubnyrtk .gt_table_body {
+#zbxwhiitju .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3856,7 +3865,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-bottom-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_footnotes {
+#zbxwhiitju .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3870,7 +3879,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-right-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_footnote {
+#zbxwhiitju .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3879,7 +3888,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 5px;
 }
 
-#quoubnyrtk .gt_sourcenotes {
+#zbxwhiitju .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3893,7 +3902,7 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   border-right-color: #D3D3D3;
 }
 
-#quoubnyrtk .gt_sourcenote {
+#zbxwhiitju .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3901,63 +3910,63 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   padding-right: 5px;
 }
 
-#quoubnyrtk .gt_left {
+#zbxwhiitju .gt_left {
   text-align: left;
 }
 
-#quoubnyrtk .gt_center {
+#zbxwhiitju .gt_center {
   text-align: center;
 }
 
-#quoubnyrtk .gt_right {
+#zbxwhiitju .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#quoubnyrtk .gt_font_normal {
+#zbxwhiitju .gt_font_normal {
   font-weight: normal;
 }
 
-#quoubnyrtk .gt_font_bold {
+#zbxwhiitju .gt_font_bold {
   font-weight: bold;
 }
 
-#quoubnyrtk .gt_font_italic {
+#zbxwhiitju .gt_font_italic {
   font-style: italic;
 }
 
-#quoubnyrtk .gt_super {
+#zbxwhiitju .gt_super {
   font-size: 65%;
 }
 
-#quoubnyrtk .gt_footnote_marks {
+#zbxwhiitju .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#quoubnyrtk .gt_asterisk {
+#zbxwhiitju .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#quoubnyrtk .gt_indent_1 {
+#zbxwhiitju .gt_indent_1 {
   text-indent: 5px;
 }
 
-#quoubnyrtk .gt_indent_2 {
+#zbxwhiitju .gt_indent_2 {
   text-indent: 10px;
 }
 
-#quoubnyrtk .gt_indent_3 {
+#zbxwhiitju .gt_indent_3 {
   text-indent: 15px;
 }
 
-#quoubnyrtk .gt_indent_4 {
+#zbxwhiitju .gt_indent_4 {
   text-indent: 20px;
 }
 
-#quoubnyrtk .gt_indent_5 {
+#zbxwhiitju .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -4002,18 +4011,18 @@ <h4>Example 3: Test of homogeneity<a href="c06-statistical-testing.html#stattest
   
 </table>
 </div>
-<p>We can see that the age group distribution that voted for Biden and other candidates was younger than those that voted for Trump. For example, of those who voted for Biden, 20.4% were in the 18-29 age group, compared to only 11.4% of those who voted for Trump were in that age group. On the other side, 23.4% of those who voted for Trump were in the 50-59 age group compared to only 15.4% of those who voted for Biden.</p>
+<p>We can see that the age distribution of those who voted for Biden and other candidates was younger than that of those who voted for Trump. For example, of those who voted for Biden, 20.4% were in the 18-29 age group, compared to only 11.4% of those who voted for Trump. Conversely, 23.4% of those who voted for Trump were in the 50-59 age group, compared to only 15.4% of those who voted for Biden.</p>
 </div>
 </div>
 </div>
 <div id="stattest-exercises" class="section level2 hasAnchor" number="6.5">
 <h2><span class="header-section-number">6.5</span> Exercises<a href="c06-statistical-testing.html#stattest-exercises" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The exercises use the design objects <code>anes_des</code> and <code>recs_des</code> as provided in the Prerequisites box in the <a href="c06-statistical-testing.html#c06-statistical-testing">beginning of the chapter</a>. Here are some exercises for practicing conducting t-tests using <code>svyttest()</code>:</p>
+<p>The exercises use the design objects <code>anes_des</code> and <code>recs_des</code> as provided in the Prerequisites box at the <a href="c06-statistical-testing.html#c06-statistical-testing">beginning of the chapter</a>. Here are some exercises for practicing t-tests using <code>svyttest()</code>:</p>
 <ol style="list-style-type: decimal">
-<li><p>Using the RECS data, do more than 50% of U.S. households use AC (<code>ACUsed</code>)?</p></li>
-<li><p>Using the RECS data, does the average temperature that U.S. households set their thermostats to differ between the day and night in the winter (<code>WinterTempDay</code> and <code>WinterTempNight</code>)?</p></li>
+<li><p>Using the RECS data, do more than 50% of U.S. households use A/C (<code>ACUsed</code>)?</p></li>
+<li><p>Using the RECS data, does the average temperature at which U.S. households set their thermostats differ between the day and night in the winter (<code>WinterTempDay</code> and <code>WinterTempNight</code>)?</p></li>
 <li><p>Using the ANES data, does the average age (<code>Age</code>) of those who voted for Joseph Biden in 2020 (<code>VotedPres2020_selection</code>) differ from those who voted for another candidate?</p></li>
-<li><p>If you wanted to determine if the political party affiliation differed for males and females, what test would you use?</p></li>
+<li><p>If we wanted to determine if the political party affiliation differed for males and females, what test would we use?</p></li>
 </ol>
 <ol style="list-style-type: lower-alpha">
 <li>Goodness of fit test (<code>svygofchisq()</code>)</li>
@@ -4061,7 +4070,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <li id="fn19"><p>During the winter, what is your home’s typical indoor temperature inside your home at night?<a href="c06-statistical-testing.html#fnref19" class="footnote-back">↩︎</a></p></li>
 <li id="fn20"><p>These two statistics can also be used for goodness of fit tests if the <code>svygofchisq()</code> function is not used.<a href="c06-statistical-testing.html#fnref20" class="footnote-back">↩︎</a></p></li>
 <li id="fn21"><p>What is the highest level of school you have completed or the highest degree you have received?<a href="c06-statistical-testing.html#fnref21" class="footnote-back">↩︎</a></p></li>
-<li id="fn22"><p>Data was pulled from data.census.gov using the S1501 Education Attainment 2020: ACS 5-Year Estimates Subject Tables<a href="c06-statistical-testing.html#fnref22" class="footnote-back">↩︎</a></p></li>
+<li id="fn22"><p>Data was pulled from data.census.gov using the S1501 Education Attainment 2020: ACS 5-Year Estimates Subject Tables.<a href="c06-statistical-testing.html#fnref22" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
             </section>
diff --git a/c07-modeling.html b/c07-modeling.html
index 53881ab4..e51eae2e 100644
--- a/c07-modeling.html
+++ b/c07-modeling.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -531,7 +531,7 @@ <h3>Prerequisites<a href="c07-modeling.html#prereq7" class="anchor-section" aria
 <span id="cb213-5"><a href="c07-modeling.html#cb213-5" tabindex="-1"></a><span class="fu">library</span>(broom)</span>
 <span id="cb213-6"><a href="c07-modeling.html#cb213-6" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
 <span id="cb213-7"><a href="c07-modeling.html#cb213-7" tabindex="-1"></a><span class="fu">library</span>(prettyunits)</span></code></pre></div>
-<p>We will be using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
+<p>We are using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design object for each survey, which we use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
 <div class="sourceCode" id="cb214"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb214-1"><a href="c07-modeling.html#cb214-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
 <span id="cb214-2"><a href="c07-modeling.html#cb214-2" tabindex="-1"></a></span>
 <span id="cb214-3"><a href="c07-modeling.html#cb214-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
@@ -550,18 +550,18 @@ <h3>Prerequisites<a href="c07-modeling.html#prereq7" class="anchor-section" aria
 <span id="cb215-3"><a href="c07-modeling.html#cb215-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
 <span id="cb215-4"><a href="c07-modeling.html#cb215-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
 <span id="cb215-5"><a href="c07-modeling.html#cb215-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb215-6"><a href="c07-modeling.html#cb215-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb215-6"><a href="c07-modeling.html#cb215-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span> <span class="sc">/</span> <span class="dv">60</span>,</span>
 <span id="cb215-7"><a href="c07-modeling.html#cb215-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
 <span id="cb215-8"><a href="c07-modeling.html#cb215-8" tabindex="-1"></a>  )</span></code></pre></div>
 </div>
 <div id="model-intro" class="section level2 hasAnchor" number="7.1">
 <h2><span class="header-section-number">7.1</span> Introduction<a href="c07-modeling.html#model-intro" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Modeling data is a way for researchers to investigate the relationship between a single dependent variable and one or more independent variables. This builds upon the analyses conducted in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, which looked at the relationships between just two variables. For example, in Example 3 in Section <a href="c06-statistical-testing.html#stattest-ttest-examples">6.3.2</a>, we investigated if there is a relationship between the electrical bill cost and whether or not the household used air-conditioning. However, there are potentially other elements that could go into what the cost of electrical bills are in a household (e.g., outside temperature, desired internal temperature, types and number of appliances, etc.).</p>
-<p>T-tests only allow us to investigate the relationship of one independent variable at a time, but using models we can look into multiple variables and even explore interactions between these variables. There are several types of models, but in this chapter we will cover Analysis of Variance (ANOVA) and linear regression models following common normal (Gaussian) and logit models. Jonas Kristoffer Lindeløv has an interesting <a href="https://lindeloev.github.io/tests-as-linear/">discussion</a> of many statistical tests and models being equivalent to a linear model. For example, a one-way ANOVA is a linear model with one categorical independent variable, and a two-sample t-test is an ANOVA where the independent variable has exactly two levels.</p>
-<p>When modeling data, it is helpful to first create an equation that provides an overview as to what it is that we are modeling. The main structure of these models is as follows:</p>
+<p>Modeling data is a way for researchers to investigate the relationship between a single dependent variable and one or more independent variables. This builds upon the analyses conducted in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, which looked at the relationships between just two variables. For example, in Example 3 in Section <a href="c06-statistical-testing.html#stattest-ttest-examples">6.3.2</a>, we investigated whether there is a relationship between the electrical bill cost and whether or not the household used air-conditioning. However, other factors could also affect the cost of a household's electrical bills (e.g., outside temperature, desired internal temperature, types and number of appliances).</p>
+<p>T-tests only allow us to investigate the relationship of one independent variable at a time, but using models, we can look into multiple variables and even explore interactions between these variables. There are several types of models, but in this chapter, we cover Analysis of Variance (ANOVA) and linear regression models following common normal (Gaussian) and logit models. Jonas Kristoffer Lindeløv has an interesting <a href="https://lindeloev.github.io/tests-as-linear/">discussion</a> of many statistical tests and models being equivalent to a linear model. For example, a one-way ANOVA is a linear model with one categorical independent variable, and a two-sample t-test is an ANOVA where the independent variable has exactly two levels.</p>
+<p>When modeling data, it is helpful to first create an equation that provides an overview of what we are modeling. The main structure of these models is as follows:</p>
 <p><span class="math display">\[y_i=\beta_0 +\sum_{i=1}^p \beta_i x_i + \epsilon_i\]</span></p>
-<p>where <span class="math inline">\(y_i\)</span> is the outcome, <span class="math inline">\(\beta_0\)</span> is an intercept, <span class="math inline">\(x_1, \cdots, x_p\)</span> are the predictors with <span class="math inline">\(\beta_1, \cdots, \beta_p\)</span> as the associated coefficients, and <span class="math inline">\(\epsilon_i\)</span> is the error. Not all models will have all components. For example, some models may not include an intercept (<span class="math inline">\(\beta_0\)</span>), may have interactions between different independent variables (<span class="math inline">\(x_i\)</span>), or may have different underlying structures for the dependent variable (<span class="math inline">\(y_i\)</span>). However, all linear models have the independent variables related to the dependent variable in a linear form.</p>
-<p>To specify these models in R, the formulas are the same with both survey data and other data. The left side of the formula is the response/dependent variable, and the right side of the formula has the predictor/independent variable(s). There are many symbols used in R to specify the formula.</p>
+<p>where <span class="math inline">\(y_i\)</span> is the outcome, <span class="math inline">\(\beta_0\)</span> is an intercept, <span class="math inline">\(x_1, \cdots, x_p\)</span> are the predictors with <span class="math inline">\(\beta_1, \cdots, \beta_p\)</span> as the associated coefficients, and <span class="math inline">\(\epsilon_i\)</span> is the error. Not all models have all components. For example, some models may not include an intercept (<span class="math inline">\(\beta_0\)</span>), may have interactions between different independent variables (<span class="math inline">\(x_i\)</span>), or may have different underlying structures for the dependent variable (<span class="math inline">\(y_i\)</span>). However, all linear models have the independent variables related to the dependent variable in a linear form.</p>
+<p>To specify these models in R, the formulas are the same with both survey data and other data. The left side of the formula is the response/dependent variable, and the right side has the predictor/independent variable(s). There are many symbols used in R to specify the formula.</p>
 <p>For example, a linear formula mathematically notated as</p>
 <p><span class="math display">\[y_i=\beta_0+\beta_1 x_i+\epsilon_i\]</span> would be specified in R as <code>y~x</code> where the intercept is not explicitly included. To fit a model with no intercept, that is,</p>
 <p><span class="math display">\[y_i=\beta_1 x_i+\epsilon_i\]</span>
@@ -609,7 +609,7 @@ <h2><span class="header-section-number">7.1</span> Introduction<a href="c07-mode
 <tr class="even">
 <td align="center">I</td>
 <td align="center"><code>I(x-z)</code></td>
-<td>as-is: include a new variable which is calculated inside the parentheses (e.g., x-z, x*z, x/z are possible claculations that could be done)</td>
+<td>as-is: include a new variable that is calculated inside the parentheses (e.g., x-z, x*z, x/z are possible calculations that could be done)</td>
 </tr>
 </tbody>
 </table>
@@ -649,7 +649,7 @@ <h2><span class="header-section-number">7.1</span> Introduction<a href="c07-mode
 </tr>
 </tbody>
 </table>
-<p>When using non-survey data such as experimental or observational data, researchers will use the <code>glm()</code> function for linear models. With survey data, however, we use <code>svyglm()</code> from the {survey} package to ensure that we account for the survey design and weights in modeling<a href="#fn24" class="footnote-ref" id="fnref24"><sup>24</sup></a>. This allows us to generalize a model to the target population and accounts for the fact that the observations in the survey data may not be independent. As discussed in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, modeling survey data cannot be directly done in {srvyr}, but can be done in the {survey} package <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>. In this chapter, we will provide syntax and examples for linear models, including ANOVA, normal linear regression, and logistic regression. For details on other types of regression, including ordinal regression, log-linear models, and survival analysis, refer to <span class="citation">Lumley (<a href="#ref-lumley2010complex">2010</a>)</span>. <span class="citation">Lumley (<a href="#ref-lumley2010complex">2010</a>)</span> also discusses custom models such as a negative binomial or Poisson model in Appendix E of his book.</p>
+<p>When using non-survey data, such as experimental or observational data, researchers use the <code>glm()</code> function for linear models. With survey data, however, we use <code>svyglm()</code> from the {survey} package to ensure that we account for the survey design and weights in modeling<a href="#fn24" class="footnote-ref" id="fnref24"><sup>24</sup></a>. This allows us to generalize a model to the population of interest and accounts for the fact that the observations in the survey data may not be independent. As discussed in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, modeling survey data cannot be directly done in {srvyr} but can be done in the {survey} package <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>. In this chapter, we provide syntax and examples for linear models, including ANOVA, normal linear regression, and logistic regression. For details on other types of regression, including ordinal regression, log-linear models, and survival analysis, refer to <span class="citation">Lumley (<a href="#ref-lumley2010complex">2010</a>)</span>. <span class="citation">Lumley (<a href="#ref-lumley2010complex">2010</a>)</span> also discusses custom models such as a negative binomial or Poisson model in Appendix E of his book.</p>
 </div>
 <div id="analysis-of-variance-anova" class="section level2 hasAnchor" number="7.2">
 <h2><span class="header-section-number">7.2</span> Analysis of variance (ANOVA)<a href="c07-modeling.html#analysis-of-variance-anova" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -658,7 +658,7 @@ <h2><span class="header-section-number">7.2</span> Analysis of variance (ANOVA)<
 <li><span class="math inline">\(H_0: \mu_1 = \mu_2= \dots = \mu_k\)</span> where <span class="math inline">\(\mu_i\)</span> is the mean outcome for group <span class="math inline">\(i\)</span></li>
 <li><span class="math inline">\(H_A: \text{At least one mean is different}\)</span></li>
 </ul>
-<p>Using the framework, an ANOVA test is also a linear model, we can re-frame the problem as:</p>
+<p>Because an ANOVA test is also a linear model, we can re-frame the problem using the framework as:</p>
 <p><span class="math display">\[ y_i=\sum_{i=1}^k \mu_i x_i + \epsilon_i\]</span></p>
 <p>where <span class="math inline">\(x_i\)</span> is a group indicator for groups <span class="math inline">\(1, \cdots, k\)</span>.</p>
 <p>Some assumptions when using ANOVA on survey data include:</p>
@@ -684,11 +684,11 @@ <h3><span class="header-section-number">7.2.1</span> Syntax<a href="c07-modeling
 <li><code>na.action</code>: handling of missing data</li>
 <li><code>df.resid</code>: degrees of freedom for Wald tests (optional) - defaults to using <code>degf(design)-(g-1)</code> where <span class="math inline">\(g\)</span> is the number of groups</li>
 </ul>
-<p>The function <code>svyglm()</code> does not have the design as the first argument so the dot (<code>.</code>) notation is used to pass it with a pipe (see Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a> for more details). The default for missing data is <code>na.omit</code>, this means that we are removing all records with any missing data in either predictors or outcomes from analyses. There are other options for handling missing data and we recommend looking at the help documentation for <code>na.omit</code> (run <code>help(na.omit)</code> or <code>?na.omit</code>) for more information on options to use for <code>na.action</code>. For a discussion of how to handle missing data see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.</p>
+<p>The function <code>svyglm()</code> does not have the design as the first argument, so the dot (<code>.</code>) notation is used to pass it with a pipe (see Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a> for more details). The default for missing data is <code>na.omit</code>. This means that we remove all records with any missing data in either predictors or outcomes from analyses. There are other options for handling missing data, and we recommend looking at the help documentation for <code>na.omit</code> (run <code>help(na.omit)</code> or <code>?na.omit</code>) for more information on options to use for <code>na.action</code>. For a discussion on how to handle missing data, see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.</p>
 </div>
 <div id="example-1" class="section level3 hasAnchor" number="7.2.2">
 <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modeling.html#example-1" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Looking at an example will help us discuss the output and how to interpret the results. In RECS, respondents are asked what temperature they set their thermostat to during the day and evening when using the air-conditioning during the summer. To analyze this data, we filter the respondents to only those using AC (<code>ACUsed</code>). Then if we want to see if there are differences by region, we can use <code>group_by()</code>. A descriptive analysis of the temperature at night (<code>SummerTempNight</code>) set by region and the sample sizes is displayed below.</p>
+<p>Looking at an example helps us discuss the output and how to interpret the results. In RECS, respondents are asked what temperature they set their thermostat to during the day and evening when using the air-conditioning (A/C) during the summer. To analyze these data, we filter the respondents to only those using A/C (<code>ACUsed</code>). Then, if we want to see if there are regional differences, we can use <code>group_by()</code>. A descriptive analysis of the temperature at night (<code>SummerTempNight</code>) set by region and the sample sizes is displayed below.</p>
 <div class="sourceCode" id="cb217"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb217-1"><a href="c07-modeling.html#cb217-1" tabindex="-1"></a>recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb217-2"><a href="c07-modeling.html#cb217-2" tabindex="-1"></a>  <span class="fu">filter</span>(ACUsed) <span class="sc">%&gt;%</span></span>
 <span id="cb217-3"><a href="c07-modeling.html#cb217-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Region) <span class="sc">%&gt;%</span></span>
@@ -704,7 +704,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
 ## 2 Midwest    71.0 0.0897  3619     0
 ## 3 South      71.8 0.0536  6065     0
 ## 4 West       72.5 0.129   3283     0</code></pre>
-<p>In the following code, we test whether this temperature varies by region by first using <code>svyglm()</code> to run the test and then using <code>broom::tidy()</code> to display the output. Note that the temperature setting is set to NA when the household does not use air-conditioning, and since the default handling of NAs is <code>na.action=na.omit</code>, records that do not use air-conditioning will not be included in this regression.</p>
+<p>In the following code, we test whether this temperature varies by region by first using <code>svyglm()</code> to run the test and then using <code>broom::tidy()</code> to display the output. Note that the temperature setting is set to NA when the household does not use A/C, and since the default handling of NAs is <code>na.action=na.omit</code>, records that do not use A/C are not included in this regression.</p>
 <div class="sourceCode" id="cb219"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb219-1"><a href="c07-modeling.html#cb219-1" tabindex="-1"></a>anova_out <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb219-2"><a href="c07-modeling.html#cb219-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
 <span id="cb219-3"><a href="c07-modeling.html#cb219-3" tabindex="-1"></a>         <span class="at">formula =</span> SummerTempNight <span class="sc">~</span> Region)</span>
@@ -717,8 +717,8 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
 ## 2 RegionMidwest     1.34     0.138      9.68 1.46e- 13
 ## 3 RegionSouth       2.05     0.128     16.0  1.36e- 22
 ## 4 RegionWest        2.80     0.177     15.9  2.27e- 22</code></pre>
-<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. In this output, the intercept represents the reference value of the Northeast region. The other coefficients indicate the difference in temperature relative to the Northeast region. For example, in the Midwest, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees higher than in the Northeast during summer nights and each region sets their thermostats at significantly higher temperatures than the Northeast.</p>
-<p>If we wanted to change the reference value we would reorder the factor before modeling using the function <code>relevel()</code> from {stats} or using one of many factor ordering functions in {forcats} such as <code>fct_relevel()</code> or <code>fct_infreq()</code>. For example, if we wanted the reference level to be the Midwest region, we could use the following code. Note the usage of the <code>gt()</code> function on top of <code>tidy()</code> to print a nice looking output table <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>; <a href="#ref-R-broom">Robinson, Hayes, and Couch 2023</a>)</span> - we will go over more usage of the {gt} package in Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>.</p>
+<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. In this output, the intercept represents the reference value of the Northeast region. The other coefficients indicate the difference in temperature relative to the Northeast region. For example, in the Midwest, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees higher than in the Northeast during summer nights, and each region sets their thermostats at significantly higher temperatures than the Northeast.</p>
+<p>If we wanted to change the reference value, we would reorder the factor before modeling using the function <code>relevel()</code> from {stats} or using one of many factor ordering functions in {forcats} such as <code>fct_relevel()</code> or <code>fct_infreq()</code>. For example, if we wanted the reference level to be the Midwest region, we could use the following code. Note the usage of the <code>gt()</code> function on top of <code>tidy()</code> to print a nice-looking output table <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>; <a href="#ref-R-broom">Robinson, Hayes, and Couch 2023</a>)</span> (see Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a> for more information on the {gt} package).</p>
 <div class="sourceCode" id="cb221"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb221-1"><a href="c07-modeling.html#cb221-1" tabindex="-1"></a>anova_out_relevel <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb221-2"><a href="c07-modeling.html#cb221-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Region=</span><span class="fu">fct_relevel</span>(Region, <span class="st">&quot;Midwest&quot;</span>, <span class="at">after =</span> <span class="dv">0</span>)) <span class="sc">%&gt;%</span> </span>
 <span id="cb221-3"><a href="c07-modeling.html#cb221-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
@@ -728,23 +728,23 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
 <span id="cb222-3"><a href="c07-modeling.html#cb222-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb222-4"><a href="c07-modeling.html#cb222-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="jyqfcdcxup" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#jyqfcdcxup table {
+<div id="pyfzfbszwz" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#pyfzfbszwz table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#jyqfcdcxup thead, #jyqfcdcxup tbody, #jyqfcdcxup tfoot, #jyqfcdcxup tr, #jyqfcdcxup td, #jyqfcdcxup th {
+#pyfzfbszwz thead, #pyfzfbszwz tbody, #pyfzfbszwz tfoot, #pyfzfbszwz tr, #pyfzfbszwz td, #pyfzfbszwz th {
   border-style: none;
 }
 
-#jyqfcdcxup p {
+#pyfzfbszwz p {
   margin: 0;
   padding: 0;
 }
 
-#jyqfcdcxup .gt_table {
+#pyfzfbszwz .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -770,12 +770,12 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-left-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_caption {
+#pyfzfbszwz .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#jyqfcdcxup .gt_title {
+#pyfzfbszwz .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -787,7 +787,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-bottom-width: 0;
 }
 
-#jyqfcdcxup .gt_subtitle {
+#pyfzfbszwz .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -799,7 +799,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-top-width: 0;
 }
 
-#jyqfcdcxup .gt_heading {
+#pyfzfbszwz .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -811,13 +811,13 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-right-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_bottom_border {
+#pyfzfbszwz .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_col_headings {
+#pyfzfbszwz .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -832,7 +832,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-right-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_col_heading {
+#pyfzfbszwz .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -852,7 +852,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   overflow-x: hidden;
 }
 
-#jyqfcdcxup .gt_column_spanner_outer {
+#pyfzfbszwz .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -864,15 +864,15 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 4px;
 }
 
-#jyqfcdcxup .gt_column_spanner_outer:first-child {
+#pyfzfbszwz .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#jyqfcdcxup .gt_column_spanner_outer:last-child {
+#pyfzfbszwz .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#jyqfcdcxup .gt_column_spanner {
+#pyfzfbszwz .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -884,11 +884,11 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   width: 100%;
 }
 
-#jyqfcdcxup .gt_spanner_row {
+#pyfzfbszwz .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#jyqfcdcxup .gt_group_heading {
+#pyfzfbszwz .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -914,7 +914,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   text-align: left;
 }
 
-#jyqfcdcxup .gt_empty_group_heading {
+#pyfzfbszwz .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -929,15 +929,15 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   vertical-align: middle;
 }
 
-#jyqfcdcxup .gt_from_md > :first-child {
+#pyfzfbszwz .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#jyqfcdcxup .gt_from_md > :last-child {
+#pyfzfbszwz .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#jyqfcdcxup .gt_row {
+#pyfzfbszwz .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -956,7 +956,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   overflow-x: hidden;
 }
 
-#jyqfcdcxup .gt_stub {
+#pyfzfbszwz .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -969,7 +969,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 5px;
 }
 
-#jyqfcdcxup .gt_stub_row_group {
+#pyfzfbszwz .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -983,15 +983,15 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   vertical-align: top;
 }
 
-#jyqfcdcxup .gt_row_group_first td {
+#pyfzfbszwz .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#jyqfcdcxup .gt_row_group_first th {
+#pyfzfbszwz .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#jyqfcdcxup .gt_summary_row {
+#pyfzfbszwz .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1001,16 +1001,16 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 5px;
 }
 
-#jyqfcdcxup .gt_first_summary_row {
+#pyfzfbszwz .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_first_summary_row.thick {
+#pyfzfbszwz .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#jyqfcdcxup .gt_last_summary_row {
+#pyfzfbszwz .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1020,7 +1020,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-bottom-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_grand_summary_row {
+#pyfzfbszwz .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1030,7 +1030,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 5px;
 }
 
-#jyqfcdcxup .gt_first_grand_summary_row {
+#pyfzfbszwz .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1040,7 +1040,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-top-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_last_grand_summary_row_top {
+#pyfzfbszwz .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1050,11 +1050,11 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-bottom-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_striped {
+#pyfzfbszwz .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#jyqfcdcxup .gt_table_body {
+#pyfzfbszwz .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1063,7 +1063,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-bottom-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_footnotes {
+#pyfzfbszwz .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1077,7 +1077,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-right-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_footnote {
+#pyfzfbszwz .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1086,7 +1086,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 5px;
 }
 
-#jyqfcdcxup .gt_sourcenotes {
+#pyfzfbszwz .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1100,7 +1100,7 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   border-right-color: #D3D3D3;
 }
 
-#jyqfcdcxup .gt_sourcenote {
+#pyfzfbszwz .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1108,63 +1108,63 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   padding-right: 5px;
 }
 
-#jyqfcdcxup .gt_left {
+#pyfzfbszwz .gt_left {
   text-align: left;
 }
 
-#jyqfcdcxup .gt_center {
+#pyfzfbszwz .gt_center {
   text-align: center;
 }
 
-#jyqfcdcxup .gt_right {
+#pyfzfbszwz .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#jyqfcdcxup .gt_font_normal {
+#pyfzfbszwz .gt_font_normal {
   font-weight: normal;
 }
 
-#jyqfcdcxup .gt_font_bold {
+#pyfzfbszwz .gt_font_bold {
   font-weight: bold;
 }
 
-#jyqfcdcxup .gt_font_italic {
+#pyfzfbszwz .gt_font_italic {
   font-style: italic;
 }
 
-#jyqfcdcxup .gt_super {
+#pyfzfbszwz .gt_super {
   font-size: 65%;
 }
 
-#jyqfcdcxup .gt_footnote_marks {
+#pyfzfbszwz .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#jyqfcdcxup .gt_asterisk {
+#pyfzfbszwz .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#jyqfcdcxup .gt_indent_1 {
+#pyfzfbszwz .gt_indent_1 {
   text-indent: 5px;
 }
 
-#jyqfcdcxup .gt_indent_2 {
+#pyfzfbszwz .gt_indent_2 {
   text-indent: 10px;
 }
 
-#jyqfcdcxup .gt_indent_3 {
+#pyfzfbszwz .gt_indent_3 {
   text-indent: 15px;
 }
 
-#jyqfcdcxup .gt_indent_4 {
+#pyfzfbszwz .gt_indent_4 {
   text-indent: 20px;
 }
 
-#jyqfcdcxup .gt_indent_5 {
+#pyfzfbszwz .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1206,25 +1206,25 @@ <h3><span class="header-section-number">7.2.2</span> Example<a href="c07-modelin
   
 </table>
 </div>
-<p>This output now has the coefficients indicating the difference in temperature relative to the Midwest region. For example, in the Northeast, temperatures are set, on average, -1.34 (p-value&lt;0.0001) degrees lower than in the Midwest during summer nights and each region sets their thermostats at significantly lower temperatures than the Midwest. This is the reverse from what we saw in the prior model as we are still comparing the same two regions, just from different reference points.</p>
+<p>This output now has the coefficients indicating the difference in temperature relative to the Midwest region. For example, in the Northeast, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees lower than in the Midwest during summer nights. This is the reverse of what we saw in the prior model, as we are still comparing the same two regions, just from different reference points. The South and West, by contrast, set their thermostats higher than the Midwest, consistent with their larger coefficients in the prior model (2.05 and 2.80 versus 1.34).</p>
 </div>
 </div>
 <div id="normal-linear-regression" class="section level2 hasAnchor" number="7.3">
 <h2><span class="header-section-number">7.3</span> Normal linear regression<a href="c07-modeling.html#normal-linear-regression" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Normal linear regression is a more generalized method than ANOVA where we fit a model of a continuous outcome with any number of categorical or continuous predictors whereas ANOVA only has categorical predictors and is similarly specified as:</p>
+<p>Normal linear regression is a more general method than ANOVA: we fit a model of a continuous outcome with any number of categorical or continuous predictors (whereas ANOVA only has categorical predictors). It is similarly specified as:</p>
 <span class="math display">\[\begin{equation}
 y_i=\beta_0 +\sum_{i=1}^p \beta_i x_i + \epsilon_i
 \end{equation}\]</span>
-<p>where <span class="math inline">\(y_i\)</span> is the outcome, <span class="math inline">\(\beta_0\)</span> is an intercept, <span class="math inline">\(x_1, \cdots, x_n\)</span> are the predictors with <span class="math inline">\(\beta_1, \cdots, \beta_p\)</span> as the associated coefficients, and <span class="math inline">\(\epsilon_i\)</span> is the error.</p>
+<p>where <span class="math inline">\(y_i\)</span> is the outcome, <span class="math inline">\(\beta_0\)</span> is an intercept, <span class="math inline">\(x_1, \cdots, x_p\)</span> are the predictors with <span class="math inline">\(\beta_1, \cdots, \beta_p\)</span> as the associated coefficients, and <span class="math inline">\(\epsilon_i\)</span> is the error.</p>
 <p>Assumptions in normal linear regression using survey data include:</p>
 <ul>
 <li>The residuals (<span class="math inline">\(\epsilon_i\)</span>) are normally distributed, but there is not an assumption of independence, and the correlation structure is captured in the survey design object</li>
 <li>There is a linear relationship between the outcome variable and the independent variables</li>
-<li>The residuals are homoscedastic, that is, the error term is the same across all values of independent variables</li>
+<li>The residuals are homoscedastic; that is, the error term is the same across all values of independent variables</li>
 </ul>
 <div id="syntax-7" class="section level3 hasAnchor" number="7.3.1">
 <h3><span class="header-section-number">7.3.1</span> Syntax<a href="c07-modeling.html#syntax-7" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>The syntax for this regression uses the same function as ANOVA, but can have more than one variable listed on the right-hand side of the formula:</p>
+<p>The syntax for this regression uses the same function as ANOVA but can have more than one variable listed on the right-hand side of the formula:</p>
 <div class="sourceCode" id="cb223"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb223-1"><a href="c07-modeling.html#cb223-1" tabindex="-1"></a>des_obj <span class="sc">%&gt;%</span></span>
 <span id="cb223-2"><a href="c07-modeling.html#cb223-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
 <span id="cb223-3"><a href="c07-modeling.html#cb223-3" tabindex="-1"></a>    <span class="at">formula =</span> outcomevar <span class="sc">~</span> x1 <span class="sc">+</span> x2 <span class="sc">+</span> x3,</span>
@@ -1243,9 +1243,9 @@ <h3><span class="header-section-number">7.3.1</span> Syntax<a href="c07-modeling
 </div>
 <div id="examples-7" class="section level3 hasAnchor" number="7.3.2">
 <h3><span class="header-section-number">7.3.2</span> Examples<a href="c07-modeling.html#examples-7" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<div id="example-1-linear-regression-with-single-variable" class="section level4 unnumbered hasAnchor">
-<h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#example-1-linear-regression-with-single-variable" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>On RECS, we can obtain information on the square footage of homes and the electric bills. We assume that square footage is related to the amount of money spent on electricity and examine a model for this. Before any modeling, we first plot the data to determine whether it is reasonable to assume a linear relationship. In Figure <a href="c07-modeling.html#fig:model-plot-sf-elbill">7.1</a>, each hexagon represents the weighted count of households in the bin, and we can see a general positive linear trend (as the square footage increases so does the amount of money spent on electricity).</p>
+<div id="example-1-linear-regression-with-a-single-variable" class="section level4 unnumbered hasAnchor">
+<h4>Example 1: Linear regression with a single variable<a href="c07-modeling.html#example-1-linear-regression-with-a-single-variable" class="anchor-section" aria-label="Anchor link to header"></a></h4>
+<p>On RECS, we can obtain information on the square footage of homes and the electric bills. We assume that square footage is related to the amount of money spent on electricity and examine a model for this. Before any modeling, we first plot the data to determine whether it is reasonable to assume a linear relationship. In Figure <a href="c07-modeling.html#fig:model-plot-sf-elbill">7.1</a>, each hexagon represents the weighted count of households in the bin, and we can see a general positive linear trend (as the square footage increases, so does the amount of money spent on electricity).</p>
 <div class="sourceCode" id="cb224"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb224-1"><a href="c07-modeling.html#cb224-1" tabindex="-1"></a>recs_2020 <span class="sc">%&gt;%</span></span>
 <span id="cb224-2"><a href="c07-modeling.html#cb224-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(</span>
 <span id="cb224-3"><a href="c07-modeling.html#cb224-3" tabindex="-1"></a>    <span class="at">x =</span> TOTSQFT_EN,</span>
@@ -1264,12 +1264,12 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
 <span id="cb224-16"><a href="c07-modeling.html#cb224-16" tabindex="-1"></a>  <span class="fu">scale_x_continuous</span>(<span class="at">labels =</span> scales<span class="sc">::</span><span class="fu">comma_format</span>()) <span class="sc">+</span></span>
 <span id="cb224-17"><a href="c07-modeling.html#cb224-17" tabindex="-1"></a>  <span class="fu">theme_minimal</span>() </span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:model-plot-sf-elbill"></span>
-<img src="bookdown_files/figure-html/model-plot-sf-elbill-1.png" alt="Hex chart where each hexagon represents a number of housing units at a point. x-axis is 'Total square footage' ranging from 0 to 7,500 and y-axis is 'Amount spent on electricity' ranging from $0 to 8,000. The trend is relatively linear and positve. A high concentration of points have square footage between 0 and 2,500 square feet as well as between electricity expenditure between $0 and 2,000" width="672" />
+<img src="bookdown_files/figure-html/model-plot-sf-elbill-1.png" alt="Hex chart where each hexagon represents a number of housing units at a point. x-axis is 'Total square footage' ranging from 0 to 7,500 and y-axis is 'Amount spent on electricity' ranging from $0 to $8,000. The trend is relatively linear and positive. A high concentration of points have square footage between 0 and 2,500 square feet as well as electricity expenditure between $0 and $2,000" width="672" />
 <p class="caption">
 FIGURE 7.1: Relationship between square footage and dollars spent on electricity, RECS 2020
 </p>
 </div>
-<p>Given that the plot shows a potential increasing relationship between square footage and electricity expenditure, fitting a model will allow us to determine if the relationship is statistically significant. The model is fit below with electricity expenditure as the outcome.</p>
+<p>Given that the plot shows a potentially increasing relationship between square footage and electricity expenditure, fitting a model allows us to determine if the relationship is statistically significant. The model is fit below with electricity expenditure as the outcome.</p>
 <div class="sourceCode" id="cb225"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb225-1"><a href="c07-modeling.html#cb225-1" tabindex="-1"></a>m_electric_sqft <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb225-2"><a href="c07-modeling.html#cb225-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
 <span id="cb225-3"><a href="c07-modeling.html#cb225-3" tabindex="-1"></a>         <span class="at">formula =</span> DOLLAREL <span class="sc">~</span> TOTSQFT_EN,</span>
@@ -1279,23 +1279,23 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
 <span id="cb226-3"><a href="c07-modeling.html#cb226-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb226-4"><a href="c07-modeling.html#cb226-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="tymamautoy" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#tymamautoy table {
+<div id="zulyvxvvtg" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#zulyvxvvtg table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#tymamautoy thead, #tymamautoy tbody, #tymamautoy tfoot, #tymamautoy tr, #tymamautoy td, #tymamautoy th {
+#zulyvxvvtg thead, #zulyvxvvtg tbody, #zulyvxvvtg tfoot, #zulyvxvvtg tr, #zulyvxvvtg td, #zulyvxvvtg th {
   border-style: none;
 }
 
-#tymamautoy p {
+#zulyvxvvtg p {
   margin: 0;
   padding: 0;
 }
 
-#tymamautoy .gt_table {
+#zulyvxvvtg .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1321,12 +1321,12 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-left-color: #D3D3D3;
 }
 
-#tymamautoy .gt_caption {
+#zulyvxvvtg .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#tymamautoy .gt_title {
+#zulyvxvvtg .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1338,7 +1338,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-bottom-width: 0;
 }
 
-#tymamautoy .gt_subtitle {
+#zulyvxvvtg .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1350,7 +1350,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-top-width: 0;
 }
 
-#tymamautoy .gt_heading {
+#zulyvxvvtg .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1362,13 +1362,13 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-right-color: #D3D3D3;
 }
 
-#tymamautoy .gt_bottom_border {
+#zulyvxvvtg .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#tymamautoy .gt_col_headings {
+#zulyvxvvtg .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1383,7 +1383,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-right-color: #D3D3D3;
 }
 
-#tymamautoy .gt_col_heading {
+#zulyvxvvtg .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1403,7 +1403,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   overflow-x: hidden;
 }
 
-#tymamautoy .gt_column_spanner_outer {
+#zulyvxvvtg .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1415,15 +1415,15 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 4px;
 }
 
-#tymamautoy .gt_column_spanner_outer:first-child {
+#zulyvxvvtg .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#tymamautoy .gt_column_spanner_outer:last-child {
+#zulyvxvvtg .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#tymamautoy .gt_column_spanner {
+#zulyvxvvtg .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1435,11 +1435,11 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   width: 100%;
 }
 
-#tymamautoy .gt_spanner_row {
+#zulyvxvvtg .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#tymamautoy .gt_group_heading {
+#zulyvxvvtg .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1465,7 +1465,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   text-align: left;
 }
 
-#tymamautoy .gt_empty_group_heading {
+#zulyvxvvtg .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1480,15 +1480,15 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   vertical-align: middle;
 }
 
-#tymamautoy .gt_from_md > :first-child {
+#zulyvxvvtg .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#tymamautoy .gt_from_md > :last-child {
+#zulyvxvvtg .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#tymamautoy .gt_row {
+#zulyvxvvtg .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1507,7 +1507,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   overflow-x: hidden;
 }
 
-#tymamautoy .gt_stub {
+#zulyvxvvtg .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1520,7 +1520,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 5px;
 }
 
-#tymamautoy .gt_stub_row_group {
+#zulyvxvvtg .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1534,15 +1534,15 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   vertical-align: top;
 }
 
-#tymamautoy .gt_row_group_first td {
+#zulyvxvvtg .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#tymamautoy .gt_row_group_first th {
+#zulyvxvvtg .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#tymamautoy .gt_summary_row {
+#zulyvxvvtg .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1552,16 +1552,16 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 5px;
 }
 
-#tymamautoy .gt_first_summary_row {
+#zulyvxvvtg .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#tymamautoy .gt_first_summary_row.thick {
+#zulyvxvvtg .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#tymamautoy .gt_last_summary_row {
+#zulyvxvvtg .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1571,7 +1571,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-bottom-color: #D3D3D3;
 }
 
-#tymamautoy .gt_grand_summary_row {
+#zulyvxvvtg .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1581,7 +1581,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 5px;
 }
 
-#tymamautoy .gt_first_grand_summary_row {
+#zulyvxvvtg .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1591,7 +1591,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-top-color: #D3D3D3;
 }
 
-#tymamautoy .gt_last_grand_summary_row_top {
+#zulyvxvvtg .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1601,11 +1601,11 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-bottom-color: #D3D3D3;
 }
 
-#tymamautoy .gt_striped {
+#zulyvxvvtg .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#tymamautoy .gt_table_body {
+#zulyvxvvtg .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1614,7 +1614,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-bottom-color: #D3D3D3;
 }
 
-#tymamautoy .gt_footnotes {
+#zulyvxvvtg .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1628,7 +1628,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-right-color: #D3D3D3;
 }
 
-#tymamautoy .gt_footnote {
+#zulyvxvvtg .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1637,7 +1637,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 5px;
 }
 
-#tymamautoy .gt_sourcenotes {
+#zulyvxvvtg .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1651,7 +1651,7 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   border-right-color: #D3D3D3;
 }
 
-#tymamautoy .gt_sourcenote {
+#zulyvxvvtg .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1659,63 +1659,63 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   padding-right: 5px;
 }
 
-#tymamautoy .gt_left {
+#zulyvxvvtg .gt_left {
   text-align: left;
 }
 
-#tymamautoy .gt_center {
+#zulyvxvvtg .gt_center {
   text-align: center;
 }
 
-#tymamautoy .gt_right {
+#zulyvxvvtg .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#tymamautoy .gt_font_normal {
+#zulyvxvvtg .gt_font_normal {
   font-weight: normal;
 }
 
-#tymamautoy .gt_font_bold {
+#zulyvxvvtg .gt_font_bold {
   font-weight: bold;
 }
 
-#tymamautoy .gt_font_italic {
+#zulyvxvvtg .gt_font_italic {
   font-style: italic;
 }
 
-#tymamautoy .gt_super {
+#zulyvxvvtg .gt_super {
   font-size: 65%;
 }
 
-#tymamautoy .gt_footnote_marks {
+#zulyvxvvtg .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#tymamautoy .gt_asterisk {
+#zulyvxvvtg .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#tymamautoy .gt_indent_1 {
+#zulyvxvvtg .gt_indent_1 {
   text-indent: 5px;
 }
 
-#tymamautoy .gt_indent_2 {
+#zulyvxvvtg .gt_indent_2 {
   text-indent: 10px;
 }
 
-#tymamautoy .gt_indent_3 {
+#zulyvxvvtg .gt_indent_3 {
   text-indent: 15px;
 }
 
-#tymamautoy .gt_indent_4 {
+#zulyvxvvtg .gt_indent_4 {
   text-indent: 20px;
 }
 
-#tymamautoy .gt_indent_5 {
+#zulyvxvvtg .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1747,40 +1747,41 @@ <h4>Example 1: Linear regression with single variable<a href="c07-modeling.html#
   
 </table>
 </div>
-<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. In these results, we can say that, on average, for every additional square foot of house size, the electricity bill increases by 29.9 cents and that square footage is significantly associated with electricity expenditure (p-value&lt;0.0001).</p>
-<p>This is a very simple model, and there are likely many more factors related to electricity expenditure, including the type of cooling, number of appliances, location, and more. However, starting with one variable models can help researchers understand what potential relationships there are between variables before fitting more complex models. Often researchers start with known relationships before building models to determine what impact additional variables have on the model.</p>
+<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. In these results, we can say that, on average, for every additional square foot of house size, the electricity bill increases by 30 cents, and that square footage is significantly associated with electricity expenditure (p-value&lt;0.0001).</p>
+<p>This is a straightforward model, and there are likely many more factors related to electricity expenditure, including the type of cooling, number of appliances, location, and more. However, starting with one-variable models can help researchers understand what potential relationships there are between variables before fitting more complex models. Often, researchers start with known relationships before building models to determine what impact additional variables have on the model.</p>
 </div>
 <div id="example-2-linear-regression-with-multiple-variables-and-interactions" class="section level4 unnumbered hasAnchor">
 <h4>Example 2: Linear regression with multiple variables and interactions<a href="c07-modeling.html#example-2-linear-regression-with-multiple-variables-and-interactions" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>In the following example, a model is fit to predict electricity expenditure, including Census region (factor/categorical), urbanicity (factor/categorical), square footage (double/numeric), and whether air-conditioning is used (logical/categorical) with all two-way interactions also included. In this example, we are choosing to fit this model without an intercept (using <code>-1</code> in the formula). This will result in an intercept estimate for each region instead of a single intercept for all data.</p>
+<p>In the following example, a model is fit to predict electricity expenditure, including Census region (factor/categorical), urbanicity (factor/categorical), square footage (double/numeric), and whether air-conditioning (A/C) is used (logical/categorical), with all two-way interactions also included. In this example, we are choosing to fit this model without an intercept (using <code>-1</code> in the formula). This results in an intercept estimate for each region instead of a single intercept for all data.</p>
 <div class="sourceCode" id="cb227"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb227-1"><a href="c07-modeling.html#cb227-1" tabindex="-1"></a>m_electric_multi <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
 <span id="cb227-2"><a href="c07-modeling.html#cb227-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
 <span id="cb227-3"><a href="c07-modeling.html#cb227-3" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb227-4"><a href="c07-modeling.html#cb227-4" tabindex="-1"></a>    <span class="at">formula =</span> DOLLAREL <span class="sc">~</span> (Region <span class="sc">+</span> Urbanicity <span class="sc">+</span> TOTSQFT_EN <span class="sc">+</span> ACUsed)<span class="sc">^</span><span class="dv">2</span> <span class="sc">-</span> <span class="dv">1</span>, </span>
-<span id="cb227-5"><a href="c07-modeling.html#cb227-5" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
-<span id="cb227-6"><a href="c07-modeling.html#cb227-6" tabindex="-1"></a>  )</span></code></pre></div>
+<span id="cb227-4"><a href="c07-modeling.html#cb227-4" tabindex="-1"></a>    <span class="at">formula =</span> </span>
+<span id="cb227-5"><a href="c07-modeling.html#cb227-5" tabindex="-1"></a>      DOLLAREL <span class="sc">~</span> (Region <span class="sc">+</span> Urbanicity <span class="sc">+</span> TOTSQFT_EN <span class="sc">+</span> ACUsed)<span class="sc">^</span><span class="dv">2</span> <span class="sc">-</span> <span class="dv">1</span>, </span>
+<span id="cb227-6"><a href="c07-modeling.html#cb227-6" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
+<span id="cb227-7"><a href="c07-modeling.html#cb227-7" tabindex="-1"></a>  )</span></code></pre></div>
 <div class="sourceCode" id="cb228"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb228-1"><a href="c07-modeling.html#cb228-1" tabindex="-1"></a><span class="fu">tidy</span>(m_electric_multi) <span class="sc">%&gt;%</span></span>
 <span id="cb228-2"><a href="c07-modeling.html#cb228-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value=</span><span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
 <span id="cb228-3"><a href="c07-modeling.html#cb228-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb228-4"><a href="c07-modeling.html#cb228-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="jhtioslzwx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#jhtioslzwx table {
+<div id="wqomfjloar" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#wqomfjloar table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#jhtioslzwx thead, #jhtioslzwx tbody, #jhtioslzwx tfoot, #jhtioslzwx tr, #jhtioslzwx td, #jhtioslzwx th {
+#wqomfjloar thead, #wqomfjloar tbody, #wqomfjloar tfoot, #wqomfjloar tr, #wqomfjloar td, #wqomfjloar th {
   border-style: none;
 }
 
-#jhtioslzwx p {
+#wqomfjloar p {
   margin: 0;
   padding: 0;
 }
 
-#jhtioslzwx .gt_table {
+#wqomfjloar .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1806,12 +1807,12 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-left-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_caption {
+#wqomfjloar .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#jhtioslzwx .gt_title {
+#wqomfjloar .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1823,7 +1824,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-bottom-width: 0;
 }
 
-#jhtioslzwx .gt_subtitle {
+#wqomfjloar .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1835,7 +1836,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-top-width: 0;
 }
 
-#jhtioslzwx .gt_heading {
+#wqomfjloar .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1847,13 +1848,13 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-right-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_bottom_border {
+#wqomfjloar .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_col_headings {
+#wqomfjloar .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1868,7 +1869,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-right-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_col_heading {
+#wqomfjloar .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1888,7 +1889,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   overflow-x: hidden;
 }
 
-#jhtioslzwx .gt_column_spanner_outer {
+#wqomfjloar .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1900,15 +1901,15 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 4px;
 }
 
-#jhtioslzwx .gt_column_spanner_outer:first-child {
+#wqomfjloar .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#jhtioslzwx .gt_column_spanner_outer:last-child {
+#wqomfjloar .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#jhtioslzwx .gt_column_spanner {
+#wqomfjloar .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1920,11 +1921,11 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   width: 100%;
 }
 
-#jhtioslzwx .gt_spanner_row {
+#wqomfjloar .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#jhtioslzwx .gt_group_heading {
+#wqomfjloar .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1950,7 +1951,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   text-align: left;
 }
 
-#jhtioslzwx .gt_empty_group_heading {
+#wqomfjloar .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1965,15 +1966,15 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   vertical-align: middle;
 }
 
-#jhtioslzwx .gt_from_md > :first-child {
+#wqomfjloar .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#jhtioslzwx .gt_from_md > :last-child {
+#wqomfjloar .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#jhtioslzwx .gt_row {
+#wqomfjloar .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1992,7 +1993,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   overflow-x: hidden;
 }
 
-#jhtioslzwx .gt_stub {
+#wqomfjloar .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2005,7 +2006,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 5px;
 }
 
-#jhtioslzwx .gt_stub_row_group {
+#wqomfjloar .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2019,15 +2020,15 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   vertical-align: top;
 }
 
-#jhtioslzwx .gt_row_group_first td {
+#wqomfjloar .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#jhtioslzwx .gt_row_group_first th {
+#wqomfjloar .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#jhtioslzwx .gt_summary_row {
+#wqomfjloar .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2037,16 +2038,16 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 5px;
 }
 
-#jhtioslzwx .gt_first_summary_row {
+#wqomfjloar .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_first_summary_row.thick {
+#wqomfjloar .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#jhtioslzwx .gt_last_summary_row {
+#wqomfjloar .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2056,7 +2057,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-bottom-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_grand_summary_row {
+#wqomfjloar .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2066,7 +2067,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 5px;
 }
 
-#jhtioslzwx .gt_first_grand_summary_row {
+#wqomfjloar .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2076,7 +2077,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-top-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_last_grand_summary_row_top {
+#wqomfjloar .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2086,11 +2087,11 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-bottom-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_striped {
+#wqomfjloar .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#jhtioslzwx .gt_table_body {
+#wqomfjloar .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2099,7 +2100,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-bottom-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_footnotes {
+#wqomfjloar .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2113,7 +2114,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-right-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_footnote {
+#wqomfjloar .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2122,7 +2123,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 5px;
 }
 
-#jhtioslzwx .gt_sourcenotes {
+#wqomfjloar .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2136,7 +2137,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   border-right-color: #D3D3D3;
 }
 
-#jhtioslzwx .gt_sourcenote {
+#wqomfjloar .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2144,63 +2145,63 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
   padding-right: 5px;
 }
 
-#jhtioslzwx .gt_left {
+#wqomfjloar .gt_left {
   text-align: left;
 }
 
-#jhtioslzwx .gt_center {
+#wqomfjloar .gt_center {
   text-align: center;
 }
 
-#jhtioslzwx .gt_right {
+#wqomfjloar .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#jhtioslzwx .gt_font_normal {
+#wqomfjloar .gt_font_normal {
   font-weight: normal;
 }
 
-#jhtioslzwx .gt_font_bold {
+#wqomfjloar .gt_font_bold {
   font-weight: bold;
 }
 
-#jhtioslzwx .gt_font_italic {
+#wqomfjloar .gt_font_italic {
   font-style: italic;
 }
 
-#jhtioslzwx .gt_super {
+#wqomfjloar .gt_super {
   font-size: 65%;
 }
 
-#jhtioslzwx .gt_footnote_marks {
+#wqomfjloar .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#jhtioslzwx .gt_asterisk {
+#wqomfjloar .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#jhtioslzwx .gt_indent_1 {
+#wqomfjloar .gt_indent_1 {
   text-indent: 5px;
 }
 
-#jhtioslzwx .gt_indent_2 {
+#wqomfjloar .gt_indent_2 {
   text-indent: 10px;
 }
 
-#jhtioslzwx .gt_indent_3 {
+#wqomfjloar .gt_indent_3 {
   text-indent: 15px;
 }
 
-#jhtioslzwx .gt_indent_4 {
+#wqomfjloar .gt_indent_4 {
   text-indent: 20px;
 }
 
-#jhtioslzwx .gt_indent_5 {
+#wqomfjloar .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2354,8 +2355,8 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
 ##  in svyglm(design = ., formula = DOLLAREL ~ (Region + Urbanicity + 
 ##     TOTSQFT_EN + ACUsed)^2 - 1, na.action = na.omit)
 ## F =  6.851  on  6  and  35  df: p= 7.2e-05</code></pre>
-<p>This output indicates there is a significant interaction between urbanicity and region (p-value=&lt;0.0001).</p>
-<p>To examine the predictions, residuals, and more from the model, the function <code>augment()</code> from {broom} can be used. The <code>augment()</code> function will return a tibble with the independent and dependent variables and other fit statistics. The <code>augment()</code> function has not been specifically written for objects of class <code>svyglm</code>, and as such, a warning will be displayed indicating this at this time. As it was not written exactly for this class of objects, a little tweaking needs to be done after using <code>augment()</code>. To obtain the standard error of the predicted values (<code>.se.fit</code>) we need to use the <code>attr()</code> function on the predicted values (<code>.fitted</code>) created by <code>augment()</code>. Additionally, the predicted values created are outputted as a <code>svrep</code> type of data. If we want to plot the predicted values, we need to use <code>as.numeric()</code> to get the predicted values into a numeric format to work with. However, it is important to note that this adjustment must be completed <strong>after</strong> the standard error adjustment.</p>
+<p>This output indicates there is a significant interaction between urbanicity and region (p-value&lt;0.0001).</p>
+<p>To examine the predictions, residuals, and more from the model, the function <code>augment()</code> from {broom} can be used. The <code>augment()</code> function returns a tibble with the independent and dependent variables and other fit statistics. The <code>augment()</code> function has not been specifically written for objects of class <code>svyglm</code>, so it currently displays a warning to that effect. Because it was not written for this class of objects, a little tweaking is needed after using <code>augment()</code>. To obtain the standard error of the predicted values (<code>.se.fit</code>), we need to use the <code>attr()</code> function on the predicted values (<code>.fitted</code>) created by <code>augment()</code>. Additionally, the predicted values are returned with a type of <code>svrep</code>. If we want to plot the predicted values, we need to use <code>as.numeric()</code> to convert them to a numeric format. However, it is important to note that this conversion must be completed <strong>after</strong> the standard error adjustment, because <code>as.numeric()</code> drops the attributes that <code>attr()</code> reads.</p>
 <div class="sourceCode" id="cb231"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb231-1"><a href="c07-modeling.html#cb231-1" tabindex="-1"></a>fitstats <span class="ot">&lt;-</span></span>
 <span id="cb231-2"><a href="c07-modeling.html#cb231-2" tabindex="-1"></a>  <span class="fu">augment</span>(m_electric_multi) <span class="sc">%&gt;%</span></span>
 <span id="cb231-3"><a href="c07-modeling.html#cb231-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)), </span>
@@ -2381,7 +2382,7 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
 <p>These results can then be used in a variety of ways, including examining residual plots as illustrated in the code below and Figure <a href="c07-modeling.html#fig:model-aug-examp-plot">7.2</a>. In the residual plot, we look for any patterns in the data. If we do see patterns, this may indicate a violation of the homoscedasticity assumption, and the standard errors of the coefficients may be incorrect. In Figure <a href="c07-modeling.html#fig:model-aug-examp-plot">7.2</a>, we do not see a strong pattern, indicating that our assumption of homoscedasticity may hold.</p>
 <div class="sourceCode" id="cb233"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb233-1"><a href="c07-modeling.html#cb233-1" tabindex="-1"></a>fitstats <span class="sc">%&gt;%</span></span>
 <span id="cb233-2"><a href="c07-modeling.html#cb233-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> .fitted, .resid)) <span class="sc">+</span></span>
-<span id="cb233-3"><a href="c07-modeling.html#cb233-3" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
+<span id="cb233-3"><a href="c07-modeling.html#cb233-3" tabindex="-1"></a>  <span class="fu">geom_point</span>(<span class="at">alpha=</span>.<span class="dv">1</span>) <span class="sc">+</span> </span>
 <span id="cb233-4"><a href="c07-modeling.html#cb233-4" tabindex="-1"></a>  <span class="fu">geom_hline</span>(<span class="at">yintercept =</span> <span class="dv">0</span>, <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
 <span id="cb233-5"><a href="c07-modeling.html#cb233-5" tabindex="-1"></a>  <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
 <span id="cb233-6"><a href="c07-modeling.html#cb233-6" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Fitted value of electricity cost&quot;</span>) <span class="sc">+</span></span>
@@ -2389,12 +2390,12 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
 <span id="cb233-8"><a href="c07-modeling.html#cb233-8" tabindex="-1"></a>  <span class="fu">scale_y_continuous</span>(<span class="at">labels =</span> scales<span class="sc">::</span><span class="fu">dollar_format</span>()) <span class="sc">+</span></span>
 <span id="cb233-9"><a href="c07-modeling.html#cb233-9" tabindex="-1"></a>  <span class="fu">scale_x_continuous</span>(<span class="at">labels =</span> scales<span class="sc">::</span><span class="fu">dollar_format</span>()) </span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:model-aug-examp-plot"></span>
-<img src="bookdown_files/figure-html/model-aug-examp-plot-1.png" alt="Residual scatter plot with a x-axis of 'Fitted value of electricity cost' ranging between approximately $0 and $4,000 and a y-axis with the 'Residual of model' ranging from approximatley -$3,000 to $5,000. The points create a slight megaphone shape with largest residuals in the middle of the x-range. A red line is drawn horizontally at y=0." width="672" />
+<img src="bookdown_files/figure-html/model-aug-examp-plot-1.png" alt="Residual scatter plot with a x-axis of 'Fitted value of electricity cost' ranging between approximately $0 and $4,000 and a y-axis with the 'Residual of model' ranging from approximately -$3,000 to $5,000. The points create a slight megaphone shape with largest residuals in the middle of the x-range. A red line is drawn horizontally at y=0." width="672" />
 <p class="caption">
 FIGURE 7.2: Residual plot of electric cost model with covariates Region, Urbanicity, TOTSQFT_EN, and ACUsed
 </p>
 </div>
-<p>Additionally, <code>augment()</code> can be used to predict outcomes for data not used in modeling. Perhaps, we would like to predict the energy expenditure for a home in an urban area in the south that uses air-conditioning and is 2,500 square feet. To do this, we first make a tibble including that additional data and then use the <code>newdata</code> argument in the <code>augment()</code> function. As before, to obtain the standard error of the predicted values we need to use the <code>attr()</code> function.</p>
+<p>Additionally, <code>augment()</code> can be used to predict outcomes for data not used in modeling. Perhaps we would like to predict the energy expenditure for a home in an urban area in the south that uses air-conditioning and is 2,500 square feet. To do this, we first make a tibble including that additional data and then use the <code>newdata</code> argument in the <code>augment()</code> function. As before, to obtain the standard error of the predicted values, we need to use the <code>attr()</code> function.</p>
 <div class="sourceCode" id="cb234"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb234-1"><a href="c07-modeling.html#cb234-1" tabindex="-1"></a>add_data <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span> </span>
 <span id="cb234-2"><a href="c07-modeling.html#cb234-2" tabindex="-1"></a>  <span class="fu">select</span>(DOEID, Region, Urbanicity,</span>
 <span id="cb234-3"><a href="c07-modeling.html#cb234-3" tabindex="-1"></a>         TOTSQFT_EN, ACUsed,</span>
@@ -2426,8 +2427,8 @@ <h4>Example 2: Linear regression with multiple variables and interactions<a href
 </div>
 <div id="logistic-regression" class="section level2 hasAnchor" number="7.4">
 <h2><span class="header-section-number">7.4</span> Logistic regression<a href="c07-modeling.html#logistic-regression" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Logistic regression is used to model binary outcomes such as whether or not someone voted. There are several instances where an outcome may not be originally binary but is collapsed into being binary. For example, given that gender is often asked in surveys with multiple response options and not a binary scale, many researchers now code gender in logistic modeling as cis-male compared to not cis-male. We could also convert a 4-point likert scale that has levels of “Strongly Agree”, “Agree”, “Disagree”, and “Strongly Disagree” to group the agreement levels into one group and disagreement levels into a second group.</p>
-<p>Logistic regression is a specific case of the generalized linear model (GLM). A GLM uses a link function to link the response variable to the linear model. If we tried to use a normal linear regression with a binary outcome, many assumptions are not held - namely the response is not continuous. Logistic regression allows us to link a linear model between the covariates and a propensity of an outcome. In logistic regression, the link model is the logit function. Specifically, the model is specified as follows:</p>
+<p>Logistic regression is used to model binary outcomes, such as whether or not someone voted. There are several instances where an outcome may not be originally binary but is collapsed into being binary. For example, given that gender is often asked in surveys with multiple response options and not a binary scale, many researchers now code gender in logistic modeling as cis-male compared to not cis-male. We could also convert a 4-point Likert scale that has levels of “Strongly Agree”, “Agree”, “Disagree”, and “Strongly Disagree” to group the agreement levels into one group and disagreement levels into a second group.</p>
+<p>Logistic regression is a specific case of the generalized linear model (GLM). A GLM uses a link function to link the response variable to the linear model. If we tried to use a normal linear regression with a binary outcome, many assumptions would not hold; namely, the response would not be continuous. Logistic regression allows us to link a linear model between the covariates and the propensity of an outcome. In logistic regression, the link function is the logit function. Specifically, the model is specified as follows:</p>
 <p><span class="math display">\[ y_i \sim \text{Bernoulli}(\pi_i)\]</span></p>
 <span class="math display">\[\begin{equation}
 \log \left(\frac{\pi_i}{1-\pi_i} \right)=\beta_0 +\sum_{i=1}^n \beta_i x_i
@@ -2438,8 +2439,8 @@ <h2><span class="header-section-number">7.4</span> Logistic regression<a href="c
 <p>Assumptions in logistic regression using survey data include:</p>
 <ul>
 <li>The outcome variable has two levels</li>
-<li>There is a linear relationship between the independent variables and the log odds (<span class="math inline">\(\log \left(\frac{\pi_i}{1-\pi_i} \right)\)</span>)</li>
-<li>The residuals are homoscedastic, that is, the error term is the same across all values of independent variables</li>
+<li>There is a linear relationship between the independent variables and the log odds (the equation for the logit function)</li>
+<li>The residuals are homoscedastic; that is, the error term is the same across all values of independent variables</li>
 </ul>
 <div id="syntax-8" class="section level3 hasAnchor" number="7.4.1">
 <h3><span class="header-section-number">7.4.1</span> Syntax<a href="c07-modeling.html#syntax-8" class="anchor-section" aria-label="Anchor link to header"></a></h3>
@@ -2460,18 +2461,18 @@ <h3><span class="header-section-number">7.4.1</span> Syntax<a href="c07-modeling
 <li><code>df.resid</code>: degrees of freedom for Wald tests (optional) - defaults to using <code>degf(design)-p</code> where <span class="math inline">\(p\)</span> is the rank of the design matrix</li>
 <li><code>family</code>: the error distribution/link function to be used in the model</li>
 </ul>
-<p>Note <code>svyglm()</code> is the same function used in both ANOVA and normal linear regression. However, we’ve added the link function quasibinomial. While we can use the binomial link function, it is recommended to use the quasibinomial as our weights may not be integers, and the quasibinomial also allows for overdispersion <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>; <a href="#ref-mccullagh1989binary">McCullagh and Nelder 1989</a>; <a href="#ref-R-base">R Core Team 2023</a>)</span>. The quasibinomial family has a default logit link which is what is specified in the equations above. When specifying the outcome variable, it will likely be specified in one of three ways with survey data:</p>
+<p>Note that <code>svyglm()</code> is the same function used in both ANOVA and normal linear regression. However, we’ve specified the quasibinomial family. While we can use the binomial family, it is recommended to use the quasibinomial as our weights may not be integers, and the quasibinomial also allows for overdispersion <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>; <a href="#ref-mccullagh1989binary">McCullagh and Nelder 1989</a>; <a href="#ref-R-base">R Core Team 2023</a>)</span>. The quasibinomial family has a default logit link, which is specified in the equations above. With survey data, the outcome variable is likely specified in one of three ways:</p>
 <ul>
-<li>A two level factor variable where the first level of the factor indicates a “failure” and the second level indicates a “success”</li>
+<li>A two-level factor variable where the first level of the factor indicates a “failure” and the second level indicates a “success”</li>
 <li>A numeric variable which is 1 or 0 where 1 indicates a success</li>
 <li>A logical variable where TRUE indicates a success</li>
 </ul>
 </div>
 <div id="examples-8" class="section level3 hasAnchor" number="7.4.2">
 <h3><span class="header-section-number">7.4.2</span> Examples<a href="c07-modeling.html#examples-8" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<div id="example-1-logistic-regression-with-single-variable" class="section level4 unnumbered hasAnchor">
-<h4>Example 1: Logistic regression with single variable<a href="c07-modeling.html#example-1-logistic-regression-with-single-variable" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>In the following example, the ANES data is used, and we are modeling whether someone usually has trust in the government<a href="#fn25" class="footnote-ref" id="fnref25"><sup>25</sup></a> by who someone voted for president in 2020. As a reminder, the leading candidates were Biden and Trump though people could vote for someone else not in the Democratic or Republican parties. Those votes are all grouped into an “Other” category. We first create a binary outcome for trusting in the government by collapsing “Always” and “Most of the time” into a single factor level, and the other response options (“About half the time”, “Some of the time”, and “Never”) into a second factor level. Next, a scatter plot of the raw data is not useful as it is all 0 and 1 outcomes, so instead, we plot a summary of the data.</p>
+<div id="example-1-logistic-regression-with-a-single-variable" class="section level4 unnumbered hasAnchor">
+<h4>Example 1: Logistic regression with a single variable<a href="c07-modeling.html#example-1-logistic-regression-with-a-single-variable" class="anchor-section" aria-label="Anchor link to header"></a></h4>
+<p>In the following example, the ANES data are used, and we are modeling whether someone usually has trust in the government<a href="#fn25" class="footnote-ref" id="fnref25"><sup>25</sup></a> by their vote for president in 2020. As a reminder, the leading candidates were Biden and Trump, though people could vote for someone else not in the Democratic or Republican parties. Those votes are all grouped into an “Other” category. We first create a binary outcome for trusting in the government by collapsing “Always” and “Most of the time” into a single factor level, and the other response options (“About half the time”, “Some of the time”, and “Never”) into a second factor level. A scatter plot of the raw data is not useful, as the outcomes are all 0 or 1, so we instead plot a summary of the data.</p>
 <div class="sourceCode" id="cb237"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb237-1"><a href="c07-modeling.html#cb237-1" tabindex="-1"></a>anes_des_der <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb237-2"><a href="c07-modeling.html#cb237-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">TrustGovernmentUsually =</span> <span class="fu">case_when</span>(</span>
 <span id="cb237-3"><a href="c07-modeling.html#cb237-3" tabindex="-1"></a>    <span class="fu">is.na</span>(TrustGovernment) <span class="sc">~</span> <span class="cn">NA</span>,</span>
@@ -2503,7 +2504,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
 FIGURE 7.3: Relationship between candidate selection and trust in government, ANES 2020
 </p>
 </div>
-<p>By looking at Figure <a href="c07-modeling.html#fig:model-logisticexamp-plot">7.3</a> it appears that people who voted for Trump are more likely to say that they usually have trust in the government compared to those who voted for Biden and Other candidates. To determine if this insight is accurate, we next we fit the model.</p>
+<p>Looking at Figure <a href="c07-modeling.html#fig:model-logisticexamp-plot">7.3</a>, it appears that people who voted for Trump are more likely to say that they usually have trust in the government compared to those who voted for Biden and Other candidates. To determine if this insight is accurate, we next fit the model.</p>
 <div class="sourceCode" id="cb238"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb238-1"><a href="c07-modeling.html#cb238-1" tabindex="-1"></a>logistic_trust_vote <span class="ot">&lt;-</span> anes_des_der <span class="sc">%&gt;%</span></span>
 <span id="cb238-2"><a href="c07-modeling.html#cb238-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
 <span id="cb238-3"><a href="c07-modeling.html#cb238-3" tabindex="-1"></a>         <span class="at">formula =</span> TrustGovernmentUsually <span class="sc">~</span> VotedPres2020_selection,</span>
@@ -2513,23 +2514,23 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
 <span id="cb239-3"><a href="c07-modeling.html#cb239-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb239-4"><a href="c07-modeling.html#cb239-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="njegcpogfu" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#njegcpogfu table {
+<div id="rwppmogzow" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#rwppmogzow table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#njegcpogfu thead, #njegcpogfu tbody, #njegcpogfu tfoot, #njegcpogfu tr, #njegcpogfu td, #njegcpogfu th {
+#rwppmogzow thead, #rwppmogzow tbody, #rwppmogzow tfoot, #rwppmogzow tr, #rwppmogzow td, #rwppmogzow th {
   border-style: none;
 }
 
-#njegcpogfu p {
+#rwppmogzow p {
   margin: 0;
   padding: 0;
 }
 
-#njegcpogfu .gt_table {
+#rwppmogzow .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2555,12 +2556,12 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-left-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_caption {
+#rwppmogzow .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#njegcpogfu .gt_title {
+#rwppmogzow .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2572,7 +2573,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-width: 0;
 }
 
-#njegcpogfu .gt_subtitle {
+#rwppmogzow .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2584,7 +2585,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-top-width: 0;
 }
 
-#njegcpogfu .gt_heading {
+#rwppmogzow .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2596,13 +2597,13 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_bottom_border {
+#rwppmogzow .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_col_headings {
+#rwppmogzow .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2617,7 +2618,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_col_heading {
+#rwppmogzow .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2637,7 +2638,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   overflow-x: hidden;
 }
 
-#njegcpogfu .gt_column_spanner_outer {
+#rwppmogzow .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2649,15 +2650,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 4px;
 }
 
-#njegcpogfu .gt_column_spanner_outer:first-child {
+#rwppmogzow .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#njegcpogfu .gt_column_spanner_outer:last-child {
+#rwppmogzow .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#njegcpogfu .gt_column_spanner {
+#rwppmogzow .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2669,11 +2670,11 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   width: 100%;
 }
 
-#njegcpogfu .gt_spanner_row {
+#rwppmogzow .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#njegcpogfu .gt_group_heading {
+#rwppmogzow .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2699,7 +2700,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   text-align: left;
 }
 
-#njegcpogfu .gt_empty_group_heading {
+#rwppmogzow .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2714,15 +2715,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   vertical-align: middle;
 }
 
-#njegcpogfu .gt_from_md > :first-child {
+#rwppmogzow .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#njegcpogfu .gt_from_md > :last-child {
+#rwppmogzow .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#njegcpogfu .gt_row {
+#rwppmogzow .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2741,7 +2742,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   overflow-x: hidden;
 }
 
-#njegcpogfu .gt_stub {
+#rwppmogzow .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2754,7 +2755,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#njegcpogfu .gt_stub_row_group {
+#rwppmogzow .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2768,15 +2769,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   vertical-align: top;
 }
 
-#njegcpogfu .gt_row_group_first td {
+#rwppmogzow .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#njegcpogfu .gt_row_group_first th {
+#rwppmogzow .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#njegcpogfu .gt_summary_row {
+#rwppmogzow .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2786,16 +2787,16 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#njegcpogfu .gt_first_summary_row {
+#rwppmogzow .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_first_summary_row.thick {
+#rwppmogzow .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#njegcpogfu .gt_last_summary_row {
+#rwppmogzow .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2805,7 +2806,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_grand_summary_row {
+#rwppmogzow .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2815,7 +2816,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#njegcpogfu .gt_first_grand_summary_row {
+#rwppmogzow .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2825,7 +2826,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-top-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_last_grand_summary_row_top {
+#rwppmogzow .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2835,11 +2836,11 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_striped {
+#rwppmogzow .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#njegcpogfu .gt_table_body {
+#rwppmogzow .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2848,7 +2849,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_footnotes {
+#rwppmogzow .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2862,7 +2863,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_footnote {
+#rwppmogzow .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2871,7 +2872,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#njegcpogfu .gt_sourcenotes {
+#rwppmogzow .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2885,7 +2886,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#njegcpogfu .gt_sourcenote {
+#rwppmogzow .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2893,63 +2894,63 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#njegcpogfu .gt_left {
+#rwppmogzow .gt_left {
   text-align: left;
 }
 
-#njegcpogfu .gt_center {
+#rwppmogzow .gt_center {
   text-align: center;
 }
 
-#njegcpogfu .gt_right {
+#rwppmogzow .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#njegcpogfu .gt_font_normal {
+#rwppmogzow .gt_font_normal {
   font-weight: normal;
 }
 
-#njegcpogfu .gt_font_bold {
+#rwppmogzow .gt_font_bold {
   font-weight: bold;
 }
 
-#njegcpogfu .gt_font_italic {
+#rwppmogzow .gt_font_italic {
   font-style: italic;
 }
 
-#njegcpogfu .gt_super {
+#rwppmogzow .gt_super {
   font-size: 65%;
 }
 
-#njegcpogfu .gt_footnote_marks {
+#rwppmogzow .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#njegcpogfu .gt_asterisk {
+#rwppmogzow .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#njegcpogfu .gt_indent_1 {
+#rwppmogzow .gt_indent_1 {
   text-indent: 5px;
 }
 
-#njegcpogfu .gt_indent_2 {
+#rwppmogzow .gt_indent_2 {
   text-indent: 10px;
 }
 
-#njegcpogfu .gt_indent_3 {
+#rwppmogzow .gt_indent_3 {
   text-indent: 15px;
 }
 
-#njegcpogfu .gt_indent_4 {
+#rwppmogzow .gt_indent_4 {
   text-indent: 20px;
 }
 
-#njegcpogfu .gt_indent_5 {
+#rwppmogzow .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2986,30 +2987,30 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   
 </table>
 </div>
-<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. This output indicates that respondents who voted for Trump are 0.435 times more likely to usually have trust in the government compared to those who voted for Biden (the reference level).</p>
-<p>Sometimes it is easier to talk about the odds instead of the likelihood. To do this, we need to exponentiate the coefficients. We can use the same <code>tidy()</code> function, but include the argument <code>exponentiate = TRUE</code> to see the odds.</p>
+<p>In the output above, we can see the estimated coefficients (<code>estimate</code>), estimated standard errors of the coefficients (<code>std.error</code>), the t-statistic (<code>statistic</code>), and the p-value for each coefficient. This output indicates that respondents who voted for Trump are more likely to usually have trust in the government compared to those who voted for Biden (the reference level). The coefficient of 0.435 represents the increase in the log odds of usually trusting the government.</p>
+<p>In most cases, it is easier to talk about the odds instead of the log odds. To do this, we need to exponentiate the coefficients. We can use the same <code>tidy()</code> function but include the argument <code>exponentiate = TRUE</code> to see the odds.</p>
 <div class="sourceCode" id="cb240"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb240-1"><a href="c07-modeling.html#cb240-1" tabindex="-1"></a><span class="fu">tidy</span>(logistic_trust_vote, <span class="at">exponentiate =</span> <span class="cn">TRUE</span>) <span class="sc">%&gt;%</span> </span>
 <span id="cb240-2"><a href="c07-modeling.html#cb240-2" tabindex="-1"></a>  <span class="fu">select</span>(term, estimate) <span class="sc">%&gt;%</span></span>
 <span id="cb240-3"><a href="c07-modeling.html#cb240-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb240-4"><a href="c07-modeling.html#cb240-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="wtkmsuxfxd" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#wtkmsuxfxd table {
+<div id="sqbfoxcwgf" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#sqbfoxcwgf table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#wtkmsuxfxd thead, #wtkmsuxfxd tbody, #wtkmsuxfxd tfoot, #wtkmsuxfxd tr, #wtkmsuxfxd td, #wtkmsuxfxd th {
+#sqbfoxcwgf thead, #sqbfoxcwgf tbody, #sqbfoxcwgf tfoot, #sqbfoxcwgf tr, #sqbfoxcwgf td, #sqbfoxcwgf th {
   border-style: none;
 }
 
-#wtkmsuxfxd p {
+#sqbfoxcwgf p {
   margin: 0;
   padding: 0;
 }
 
-#wtkmsuxfxd .gt_table {
+#sqbfoxcwgf .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -3035,12 +3036,12 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-left-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_caption {
+#sqbfoxcwgf .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#wtkmsuxfxd .gt_title {
+#sqbfoxcwgf .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3052,7 +3053,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-width: 0;
 }
 
-#wtkmsuxfxd .gt_subtitle {
+#sqbfoxcwgf .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3064,7 +3065,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-top-width: 0;
 }
 
-#wtkmsuxfxd .gt_heading {
+#sqbfoxcwgf .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3076,13 +3077,13 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_bottom_border {
+#sqbfoxcwgf .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_col_headings {
+#sqbfoxcwgf .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3097,7 +3098,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_col_heading {
+#sqbfoxcwgf .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3117,7 +3118,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   overflow-x: hidden;
 }
 
-#wtkmsuxfxd .gt_column_spanner_outer {
+#sqbfoxcwgf .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3129,15 +3130,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 4px;
 }
 
-#wtkmsuxfxd .gt_column_spanner_outer:first-child {
+#sqbfoxcwgf .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#wtkmsuxfxd .gt_column_spanner_outer:last-child {
+#sqbfoxcwgf .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#wtkmsuxfxd .gt_column_spanner {
+#sqbfoxcwgf .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3149,11 +3150,11 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   width: 100%;
 }
 
-#wtkmsuxfxd .gt_spanner_row {
+#sqbfoxcwgf .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#wtkmsuxfxd .gt_group_heading {
+#sqbfoxcwgf .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3179,7 +3180,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   text-align: left;
 }
 
-#wtkmsuxfxd .gt_empty_group_heading {
+#sqbfoxcwgf .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3194,15 +3195,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   vertical-align: middle;
 }
 
-#wtkmsuxfxd .gt_from_md > :first-child {
+#sqbfoxcwgf .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#wtkmsuxfxd .gt_from_md > :last-child {
+#sqbfoxcwgf .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#wtkmsuxfxd .gt_row {
+#sqbfoxcwgf .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3221,7 +3222,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   overflow-x: hidden;
 }
 
-#wtkmsuxfxd .gt_stub {
+#sqbfoxcwgf .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3234,7 +3235,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#wtkmsuxfxd .gt_stub_row_group {
+#sqbfoxcwgf .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3248,15 +3249,15 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   vertical-align: top;
 }
 
-#wtkmsuxfxd .gt_row_group_first td {
+#sqbfoxcwgf .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#wtkmsuxfxd .gt_row_group_first th {
+#sqbfoxcwgf .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#wtkmsuxfxd .gt_summary_row {
+#sqbfoxcwgf .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3266,16 +3267,16 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#wtkmsuxfxd .gt_first_summary_row {
+#sqbfoxcwgf .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_first_summary_row.thick {
+#sqbfoxcwgf .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#wtkmsuxfxd .gt_last_summary_row {
+#sqbfoxcwgf .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3285,7 +3286,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_grand_summary_row {
+#sqbfoxcwgf .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3295,7 +3296,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#wtkmsuxfxd .gt_first_grand_summary_row {
+#sqbfoxcwgf .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3305,7 +3306,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-top-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_last_grand_summary_row_top {
+#sqbfoxcwgf .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3315,11 +3316,11 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_striped {
+#sqbfoxcwgf .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#wtkmsuxfxd .gt_table_body {
+#sqbfoxcwgf .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3328,7 +3329,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-bottom-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_footnotes {
+#sqbfoxcwgf .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3342,7 +3343,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_footnote {
+#sqbfoxcwgf .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3351,7 +3352,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#wtkmsuxfxd .gt_sourcenotes {
+#sqbfoxcwgf .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3365,7 +3366,7 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   border-right-color: #D3D3D3;
 }
 
-#wtkmsuxfxd .gt_sourcenote {
+#sqbfoxcwgf .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3373,63 +3374,63 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   padding-right: 5px;
 }
 
-#wtkmsuxfxd .gt_left {
+#sqbfoxcwgf .gt_left {
   text-align: left;
 }
 
-#wtkmsuxfxd .gt_center {
+#sqbfoxcwgf .gt_center {
   text-align: center;
 }
 
-#wtkmsuxfxd .gt_right {
+#sqbfoxcwgf .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#wtkmsuxfxd .gt_font_normal {
+#sqbfoxcwgf .gt_font_normal {
   font-weight: normal;
 }
 
-#wtkmsuxfxd .gt_font_bold {
+#sqbfoxcwgf .gt_font_bold {
   font-weight: bold;
 }
 
-#wtkmsuxfxd .gt_font_italic {
+#sqbfoxcwgf .gt_font_italic {
   font-style: italic;
 }
 
-#wtkmsuxfxd .gt_super {
+#sqbfoxcwgf .gt_super {
   font-size: 65%;
 }
 
-#wtkmsuxfxd .gt_footnote_marks {
+#sqbfoxcwgf .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#wtkmsuxfxd .gt_asterisk {
+#sqbfoxcwgf .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#wtkmsuxfxd .gt_indent_1 {
+#sqbfoxcwgf .gt_indent_1 {
   text-indent: 5px;
 }
 
-#wtkmsuxfxd .gt_indent_2 {
+#sqbfoxcwgf .gt_indent_2 {
   text-indent: 10px;
 }
 
-#wtkmsuxfxd .gt_indent_3 {
+#sqbfoxcwgf .gt_indent_3 {
   text-indent: 15px;
 }
 
-#wtkmsuxfxd .gt_indent_4 {
+#sqbfoxcwgf .gt_indent_4 {
   text-indent: 20px;
 }
 
-#wtkmsuxfxd .gt_indent_5 {
+#sqbfoxcwgf .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -3454,8 +3455,8 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
   
 </table>
 </div>
-<p>We can interpret this as saying that the odds of usually trusting the government for someone who voted for Trump is 154% as likely to trust the government compared to a person who voted for Biden (the reference level). In comparison, a person who voted for neither Biden nor Trump is 52% as likely to trust the government as someone who voted for Biden.</p>
-<p>As with linear regression, the <code>augment()</code> can be used to predict values. By default, the prediction is the link function (logit function in this instance) and not the probability. To predict the probability, add an argument of <code>type.predict="response"</code> as demonstrated below:</p>
+<p>We can interpret this as saying that the odds of usually trusting the government for someone who voted for Trump are 154% of the odds for a person who voted for Biden (the reference level). In comparison, the odds for a person who voted for neither Biden nor Trump are 52% of the odds for someone who voted for Biden.</p>
+<p>As with linear regression, the <code>augment()</code> function can be used to predict values. By default, the prediction is on the scale of the link function (the logit function in this instance), not the probability. To predict the probability, add an argument of <code>type.predict="response"</code> as demonstrated below:</p>
 <div class="sourceCode" id="cb241"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb241-1"><a href="c07-modeling.html#cb241-1" tabindex="-1"></a>logistic_trust_vote <span class="sc">%&gt;%</span></span>
 <span id="cb241-2"><a href="c07-modeling.html#cb241-2" tabindex="-1"></a>  <span class="fu">augment</span>(<span class="at">type.predict =</span> <span class="st">&quot;response&quot;</span>) <span class="sc">%&gt;%</span></span>
 <span id="cb241-3"><a href="c07-modeling.html#cb241-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)), </span>
@@ -3480,41 +3481,41 @@ <h4>Example 1: Logistic regression with single variable<a href="c07-modeling.htm
 ## # ℹ 6,202 more rows</code></pre>
 </div>
 <div id="example-2-interaction-effects" class="section level4 unnumbered hasAnchor">
-<h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interaction-effects" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let’s look at another example with interaction effects. If we’re interested in understanding the demographics of people who voted for Biden among all voters in 2020, we could include <code>EarlyVote2020</code> and <code>Gender</code> in our model.</p>
-<p>First we need to subset the data to 2020 voters and then create an indicator for voted for Biden.</p>
-<div class="sourceCode" id="cb243"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb243-1"><a href="c07-modeling.html#cb243-1" tabindex="-1"></a>anes_des_ind <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span> </span>
+<h4>Example 2: Interaction Effects<a href="c07-modeling.html#example-2-interaction-effects" class="anchor-section" aria-label="Anchor link to header"></a></h4>
+<p>Let’s look at another example with interaction effects. If we’re interested in understanding the demographics of people who voted for Biden among all voters in 2020, we could include an indicator of whether respondents voted early (<code>EarlyVote2020</code>) and their income group (<code>Income7</code>) in our model.</p>
+<p>First, we need to subset the data to 2020 voters and then create an indicator for whether the respondent voted for Biden.</p>
+<div class="sourceCode" id="cb243"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb243-1"><a href="c07-modeling.html#cb243-1" tabindex="-1"></a>anes_des_ind <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb243-2"><a href="c07-modeling.html#cb243-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2020_selection)) <span class="sc">%&gt;%</span></span>
-<span id="cb243-3"><a href="c07-modeling.html#cb243-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">VoteBiden =</span> <span class="fu">case_when</span>(VotedPres2020_selection <span class="sc">==</span> <span class="st">&quot;Biden&quot;</span><span class="sc">~</span><span class="dv">1</span>,</span>
+<span id="cb243-3"><a href="c07-modeling.html#cb243-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">VoteBiden =</span> <span class="fu">case_when</span>(VotedPres2020_selection <span class="sc">==</span> <span class="st">&quot;Biden&quot;</span> <span class="sc">~</span> <span class="dv">1</span>,</span>
 <span id="cb243-4"><a href="c07-modeling.html#cb243-4" tabindex="-1"></a>                               <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">0</span>))</span></code></pre></div>
-<p>Let’s first look at the main effects of gender and early voting behavior.</p>
+<p>Let’s first look at the main effects of income grouping and early voting behavior.</p>
 <div class="sourceCode" id="cb244"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb244-1"><a href="c07-modeling.html#cb244-1" tabindex="-1"></a>log_biden_main <span class="ot">&lt;-</span> anes_des_ind <span class="sc">%&gt;%</span></span>
 <span id="cb244-2"><a href="c07-modeling.html#cb244-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">EarlyVote2020 =</span> <span class="fu">fct_relevel</span>(EarlyVote2020, <span class="st">&quot;No&quot;</span>, <span class="at">after =</span> <span class="dv">0</span>)) <span class="sc">%&gt;%</span> </span>
 <span id="cb244-3"><a href="c07-modeling.html#cb244-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
-<span id="cb244-4"><a href="c07-modeling.html#cb244-4" tabindex="-1"></a>         <span class="at">formula =</span> VoteBiden <span class="sc">~</span> EarlyVote2020 <span class="sc">+</span> Gender,</span>
+<span id="cb244-4"><a href="c07-modeling.html#cb244-4" tabindex="-1"></a>         <span class="at">formula =</span> VoteBiden <span class="sc">~</span> EarlyVote2020 <span class="sc">+</span> Income7,</span>
 <span id="cb244-5"><a href="c07-modeling.html#cb244-5" tabindex="-1"></a>         <span class="at">family =</span> quasibinomial) </span></code></pre></div>
 <div class="sourceCode" id="cb245"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb245-1"><a href="c07-modeling.html#cb245-1" tabindex="-1"></a><span class="fu">tidy</span>(log_biden_main) <span class="sc">%&gt;%</span></span>
 <span id="cb245-2"><a href="c07-modeling.html#cb245-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value=</span><span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
 <span id="cb245-3"><a href="c07-modeling.html#cb245-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb245-4"><a href="c07-modeling.html#cb245-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="jbboitmhxl" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#jbboitmhxl table {
+<div id="txvpomzvxj" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#txvpomzvxj table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#jbboitmhxl thead, #jbboitmhxl tbody, #jbboitmhxl tfoot, #jbboitmhxl tr, #jbboitmhxl td, #jbboitmhxl th {
+#txvpomzvxj thead, #txvpomzvxj tbody, #txvpomzvxj tfoot, #txvpomzvxj tr, #txvpomzvxj td, #txvpomzvxj th {
   border-style: none;
 }
 
-#jbboitmhxl p {
+#txvpomzvxj p {
   margin: 0;
   padding: 0;
 }
 
-#jbboitmhxl .gt_table {
+#txvpomzvxj .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -3540,12 +3541,12 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-left-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_caption {
+#txvpomzvxj .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#jbboitmhxl .gt_title {
+#txvpomzvxj .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3557,7 +3558,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-width: 0;
 }
 
-#jbboitmhxl .gt_subtitle {
+#txvpomzvxj .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3569,7 +3570,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-top-width: 0;
 }
 
-#jbboitmhxl .gt_heading {
+#txvpomzvxj .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3581,13 +3582,13 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_bottom_border {
+#txvpomzvxj .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_col_headings {
+#txvpomzvxj .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3602,7 +3603,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_col_heading {
+#txvpomzvxj .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3622,7 +3623,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   overflow-x: hidden;
 }
 
-#jbboitmhxl .gt_column_spanner_outer {
+#txvpomzvxj .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3634,15 +3635,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 4px;
 }
 
-#jbboitmhxl .gt_column_spanner_outer:first-child {
+#txvpomzvxj .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#jbboitmhxl .gt_column_spanner_outer:last-child {
+#txvpomzvxj .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#jbboitmhxl .gt_column_spanner {
+#txvpomzvxj .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3654,11 +3655,11 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   width: 100%;
 }
 
-#jbboitmhxl .gt_spanner_row {
+#txvpomzvxj .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#jbboitmhxl .gt_group_heading {
+#txvpomzvxj .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3684,7 +3685,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   text-align: left;
 }
 
-#jbboitmhxl .gt_empty_group_heading {
+#txvpomzvxj .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3699,15 +3700,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   vertical-align: middle;
 }
 
-#jbboitmhxl .gt_from_md > :first-child {
+#txvpomzvxj .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#jbboitmhxl .gt_from_md > :last-child {
+#txvpomzvxj .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#jbboitmhxl .gt_row {
+#txvpomzvxj .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3726,7 +3727,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   overflow-x: hidden;
 }
 
-#jbboitmhxl .gt_stub {
+#txvpomzvxj .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3739,7 +3740,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#jbboitmhxl .gt_stub_row_group {
+#txvpomzvxj .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3753,15 +3754,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   vertical-align: top;
 }
 
-#jbboitmhxl .gt_row_group_first td {
+#txvpomzvxj .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#jbboitmhxl .gt_row_group_first th {
+#txvpomzvxj .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#jbboitmhxl .gt_summary_row {
+#txvpomzvxj .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3771,16 +3772,16 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#jbboitmhxl .gt_first_summary_row {
+#txvpomzvxj .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_first_summary_row.thick {
+#txvpomzvxj .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#jbboitmhxl .gt_last_summary_row {
+#txvpomzvxj .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3790,7 +3791,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_grand_summary_row {
+#txvpomzvxj .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3800,7 +3801,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#jbboitmhxl .gt_first_grand_summary_row {
+#txvpomzvxj .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3810,7 +3811,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-top-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_last_grand_summary_row_top {
+#txvpomzvxj .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3820,11 +3821,11 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_striped {
+#txvpomzvxj .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#jbboitmhxl .gt_table_body {
+#txvpomzvxj .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3833,7 +3834,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_footnotes {
+#txvpomzvxj .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3847,7 +3848,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_footnote {
+#txvpomzvxj .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3856,7 +3857,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#jbboitmhxl .gt_sourcenotes {
+#txvpomzvxj .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3870,7 +3871,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#jbboitmhxl .gt_sourcenote {
+#txvpomzvxj .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3878,68 +3879,68 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#jbboitmhxl .gt_left {
+#txvpomzvxj .gt_left {
   text-align: left;
 }
 
-#jbboitmhxl .gt_center {
+#txvpomzvxj .gt_center {
   text-align: center;
 }
 
-#jbboitmhxl .gt_right {
+#txvpomzvxj .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#jbboitmhxl .gt_font_normal {
+#txvpomzvxj .gt_font_normal {
   font-weight: normal;
 }
 
-#jbboitmhxl .gt_font_bold {
+#txvpomzvxj .gt_font_bold {
   font-weight: bold;
 }
 
-#jbboitmhxl .gt_font_italic {
+#txvpomzvxj .gt_font_italic {
   font-style: italic;
 }
 
-#jbboitmhxl .gt_super {
+#txvpomzvxj .gt_super {
   font-size: 65%;
 }
 
-#jbboitmhxl .gt_footnote_marks {
+#txvpomzvxj .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#jbboitmhxl .gt_asterisk {
+#txvpomzvxj .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#jbboitmhxl .gt_indent_1 {
+#txvpomzvxj .gt_indent_1 {
   text-indent: 5px;
 }
 
-#jbboitmhxl .gt_indent_2 {
+#txvpomzvxj .gt_indent_2 {
   text-indent: 10px;
 }
 
-#jbboitmhxl .gt_indent_3 {
+#txvpomzvxj .gt_indent_3 {
   text-indent: 15px;
 }
 
-#jbboitmhxl .gt_indent_4 {
+#txvpomzvxj .gt_indent_4 {
   text-indent: 20px;
 }
 
-#jbboitmhxl .gt_indent_5 {
+#txvpomzvxj .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:model-logisticexamp-biden-main-tab">TABLE 7.8: </span>Logistic regression output for predicting voting for Biden given early voting behavior and gender - main effects only, RECS 2020</caption>
+  <caption><span id="tab:model-logisticexamp-biden-main-tab">TABLE 7.8: </span>Logistic regression output for predicting voting for Biden given early voting behavior and income - main effects only, ANES 2020</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -3952,54 +3953,79 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   </thead>
   <tbody class="gt_table_body">
     <tr><td headers="term" class="gt_row gt_left">(Intercept)</td>
-<td headers="estimate" class="gt_row gt_right">−0.31</td>
-<td headers="std.error" class="gt_row gt_right">0.27</td>
-<td headers="statistic" class="gt_row gt_right">−1.15</td>
-<td headers="p.value" class="gt_row gt_right">0.2553</td></tr>
+<td headers="estimate" class="gt_row gt_right">1.28</td>
+<td headers="std.error" class="gt_row gt_right">0.43</td>
+<td headers="statistic" class="gt_row gt_right">2.99</td>
+<td headers="p.value" class="gt_row gt_right">0.0047</td></tr>
     <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes</td>
-<td headers="estimate" class="gt_row gt_right">0.53</td>
-<td headers="std.error" class="gt_row gt_right">0.35</td>
-<td headers="statistic" class="gt_row gt_right">1.53</td>
-<td headers="p.value" class="gt_row gt_right">0.1338</td></tr>
-    <tr><td headers="term" class="gt_row gt_left">GenderFemale</td>
-<td headers="estimate" class="gt_row gt_right">0.96</td>
-<td headers="std.error" class="gt_row gt_right">0.26</td>
-<td headers="statistic" class="gt_row gt_right">3.73</td>
-<td headers="p.value" class="gt_row gt_right">0.0005</td></tr>
+<td headers="estimate" class="gt_row gt_right">0.44</td>
+<td headers="std.error" class="gt_row gt_right">0.34</td>
+<td headers="statistic" class="gt_row gt_right">1.29</td>
+<td headers="p.value" class="gt_row gt_right">0.2039</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$20k to &lt; 40k</td>
+<td headers="estimate" class="gt_row gt_right">−1.06</td>
+<td headers="std.error" class="gt_row gt_right">0.49</td>
+<td headers="statistic" class="gt_row gt_right">−2.18</td>
+<td headers="p.value" class="gt_row gt_right">0.0352</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$40k to &lt; 60k</td>
+<td headers="estimate" class="gt_row gt_right">−0.78</td>
+<td headers="std.error" class="gt_row gt_right">0.42</td>
+<td headers="statistic" class="gt_row gt_right">−1.86</td>
+<td headers="p.value" class="gt_row gt_right">0.0705</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$60k to &lt; 80k</td>
+<td headers="estimate" class="gt_row gt_right">−1.24</td>
+<td headers="std.error" class="gt_row gt_right">0.70</td>
+<td headers="statistic" class="gt_row gt_right">−1.77</td>
+<td headers="p.value" class="gt_row gt_right">0.0842</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$80k to &lt; 100k</td>
+<td headers="estimate" class="gt_row gt_right">−0.66</td>
+<td headers="std.error" class="gt_row gt_right">0.64</td>
+<td headers="statistic" class="gt_row gt_right">−1.02</td>
+<td headers="p.value" class="gt_row gt_right">0.3137</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$100k to &lt; 125k</td>
+<td headers="estimate" class="gt_row gt_right">−1.02</td>
+<td headers="std.error" class="gt_row gt_right">0.54</td>
+<td headers="statistic" class="gt_row gt_right">−1.89</td>
+<td headers="p.value" class="gt_row gt_right">0.0662</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$125k or more</td>
+<td headers="estimate" class="gt_row gt_right">−1.25</td>
+<td headers="std.error" class="gt_row gt_right">0.44</td>
+<td headers="statistic" class="gt_row gt_right">−2.87</td>
+<td headers="p.value" class="gt_row gt_right">0.0065</td></tr>
   </tbody>
   
   
 </table>
 </div>
-<p>This main effect model indicates that respondents with who early voted in 2020 are 0.528 (p-value=0.1338) times more likely to vote for Biden compared to respondents who did not early vote in the 2020 election (the reference level). We see that gender is also significant with females more likely to vote for Biden compared to males (p-value=0.0005).</p>
-<p>It is possible that there is an interaction between gender and early voting behavior. To determine this we can create a model that includes the interaction effects:</p>
+<p>This main effect model (see Table <a href="c07-modeling.html#tab:model-logisticexamp-biden-main-tab">7.8</a>) indicates that people with incomes of $125,000 or more have a significant negative coefficient of −1.25 (p-value=0.0065). That is, people with incomes of $125,000 or more were less likely to vote for Biden in the 2020 election than people with incomes of less than $20,000 (the reference level).</p>
+<p>Although early voting behavior was not significant, there may be an interaction between income and early voting behavior. To determine this, we can create a model that includes the interaction effects:</p>
 <div class="sourceCode" id="cb246"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb246-1"><a href="c07-modeling.html#cb246-1" tabindex="-1"></a>log_biden_int <span class="ot">&lt;-</span> anes_des_ind <span class="sc">%&gt;%</span></span>
 <span id="cb246-2"><a href="c07-modeling.html#cb246-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">EarlyVote2020 =</span> <span class="fu">fct_relevel</span>(EarlyVote2020, <span class="st">&quot;No&quot;</span>, <span class="at">after =</span> <span class="dv">0</span>)) <span class="sc">%&gt;%</span> </span>
 <span id="cb246-3"><a href="c07-modeling.html#cb246-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(<span class="at">design =</span> .,</span>
-<span id="cb246-4"><a href="c07-modeling.html#cb246-4" tabindex="-1"></a>         <span class="at">formula =</span> VoteBiden <span class="sc">~</span> (EarlyVote2020 <span class="sc">+</span> Gender)<span class="sc">^</span><span class="dv">2</span>,</span>
+<span id="cb246-4"><a href="c07-modeling.html#cb246-4" tabindex="-1"></a>         <span class="at">formula =</span> VoteBiden <span class="sc">~</span> (EarlyVote2020 <span class="sc">+</span> Income7)<span class="sc">^</span><span class="dv">2</span>,</span>
 <span id="cb246-5"><a href="c07-modeling.html#cb246-5" tabindex="-1"></a>         <span class="at">family =</span> quasibinomial) </span></code></pre></div>
 <div class="sourceCode" id="cb247"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb247-1"><a href="c07-modeling.html#cb247-1" tabindex="-1"></a><span class="fu">tidy</span>(log_biden_int) <span class="sc">%&gt;%</span></span>
 <span id="cb247-2"><a href="c07-modeling.html#cb247-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value=</span><span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
 <span id="cb247-3"><a href="c07-modeling.html#cb247-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
 <span id="cb247-4"><a href="c07-modeling.html#cb247-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
 
-<div id="wucwuafyzy" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#wucwuafyzy table {
+<div id="owpwgkfrzt" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#owpwgkfrzt table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#wucwuafyzy thead, #wucwuafyzy tbody, #wucwuafyzy tfoot, #wucwuafyzy tr, #wucwuafyzy td, #wucwuafyzy th {
+#owpwgkfrzt thead, #owpwgkfrzt tbody, #owpwgkfrzt tfoot, #owpwgkfrzt tr, #owpwgkfrzt td, #owpwgkfrzt th {
   border-style: none;
 }
 
-#wucwuafyzy p {
+#owpwgkfrzt p {
   margin: 0;
   padding: 0;
 }
 
-#wucwuafyzy .gt_table {
+#owpwgkfrzt .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -4025,12 +4051,12 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-left-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_caption {
+#owpwgkfrzt .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#wucwuafyzy .gt_title {
+#owpwgkfrzt .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -4042,7 +4068,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-width: 0;
 }
 
-#wucwuafyzy .gt_subtitle {
+#owpwgkfrzt .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -4054,7 +4080,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-top-width: 0;
 }
 
-#wucwuafyzy .gt_heading {
+#owpwgkfrzt .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -4066,13 +4092,13 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_bottom_border {
+#owpwgkfrzt .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_col_headings {
+#owpwgkfrzt .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -4087,7 +4113,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_col_heading {
+#owpwgkfrzt .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -4107,7 +4133,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   overflow-x: hidden;
 }
 
-#wucwuafyzy .gt_column_spanner_outer {
+#owpwgkfrzt .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -4119,15 +4145,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 4px;
 }
 
-#wucwuafyzy .gt_column_spanner_outer:first-child {
+#owpwgkfrzt .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#wucwuafyzy .gt_column_spanner_outer:last-child {
+#owpwgkfrzt .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#wucwuafyzy .gt_column_spanner {
+#owpwgkfrzt .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -4139,11 +4165,11 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   width: 100%;
 }
 
-#wucwuafyzy .gt_spanner_row {
+#owpwgkfrzt .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#wucwuafyzy .gt_group_heading {
+#owpwgkfrzt .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -4169,7 +4195,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   text-align: left;
 }
 
-#wucwuafyzy .gt_empty_group_heading {
+#owpwgkfrzt .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -4184,15 +4210,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   vertical-align: middle;
 }
 
-#wucwuafyzy .gt_from_md > :first-child {
+#owpwgkfrzt .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#wucwuafyzy .gt_from_md > :last-child {
+#owpwgkfrzt .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#wucwuafyzy .gt_row {
+#owpwgkfrzt .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -4211,7 +4237,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   overflow-x: hidden;
 }
 
-#wucwuafyzy .gt_stub {
+#owpwgkfrzt .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -4224,7 +4250,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#wucwuafyzy .gt_stub_row_group {
+#owpwgkfrzt .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -4238,15 +4264,15 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   vertical-align: top;
 }
 
-#wucwuafyzy .gt_row_group_first td {
+#owpwgkfrzt .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#wucwuafyzy .gt_row_group_first th {
+#owpwgkfrzt .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#wucwuafyzy .gt_summary_row {
+#owpwgkfrzt .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -4256,16 +4282,16 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#wucwuafyzy .gt_first_summary_row {
+#owpwgkfrzt .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_first_summary_row.thick {
+#owpwgkfrzt .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#wucwuafyzy .gt_last_summary_row {
+#owpwgkfrzt .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -4275,7 +4301,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_grand_summary_row {
+#owpwgkfrzt .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -4285,7 +4311,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#wucwuafyzy .gt_first_grand_summary_row {
+#owpwgkfrzt .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -4295,7 +4321,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-top-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_last_grand_summary_row_top {
+#owpwgkfrzt .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -4305,11 +4331,11 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_striped {
+#owpwgkfrzt .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#wucwuafyzy .gt_table_body {
+#owpwgkfrzt .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -4318,7 +4344,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-bottom-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_footnotes {
+#owpwgkfrzt .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -4332,7 +4358,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_footnote {
+#owpwgkfrzt .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -4341,7 +4367,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#wucwuafyzy .gt_sourcenotes {
+#owpwgkfrzt .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -4355,7 +4381,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   border-right-color: #D3D3D3;
 }
 
-#wucwuafyzy .gt_sourcenote {
+#owpwgkfrzt .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -4363,68 +4389,68 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   padding-right: 5px;
 }
 
-#wucwuafyzy .gt_left {
+#owpwgkfrzt .gt_left {
   text-align: left;
 }
 
-#wucwuafyzy .gt_center {
+#owpwgkfrzt .gt_center {
   text-align: center;
 }
 
-#wucwuafyzy .gt_right {
+#owpwgkfrzt .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#wucwuafyzy .gt_font_normal {
+#owpwgkfrzt .gt_font_normal {
   font-weight: normal;
 }
 
-#wucwuafyzy .gt_font_bold {
+#owpwgkfrzt .gt_font_bold {
   font-weight: bold;
 }
 
-#wucwuafyzy .gt_font_italic {
+#owpwgkfrzt .gt_font_italic {
   font-style: italic;
 }
 
-#wucwuafyzy .gt_super {
+#owpwgkfrzt .gt_super {
   font-size: 65%;
 }
 
-#wucwuafyzy .gt_footnote_marks {
+#owpwgkfrzt .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#wucwuafyzy .gt_asterisk {
+#owpwgkfrzt .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#wucwuafyzy .gt_indent_1 {
+#owpwgkfrzt .gt_indent_1 {
   text-indent: 5px;
 }
 
-#wucwuafyzy .gt_indent_2 {
+#owpwgkfrzt .gt_indent_2 {
   text-indent: 10px;
 }
 
-#wucwuafyzy .gt_indent_3 {
+#owpwgkfrzt .gt_indent_3 {
   text-indent: 15px;
 }
 
-#wucwuafyzy .gt_indent_4 {
+#owpwgkfrzt .gt_indent_4 {
   text-indent: 20px;
 }
 
-#wucwuafyzy .gt_indent_5 {
+#owpwgkfrzt .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:model-logisticexamp-biden-int-tab">TABLE 7.9: </span>Logistic regression output for predicting voting for Biden given early voting behavior and gender - with interaction, RECS 2020</caption>
+  <caption><span id="tab:model-logisticexamp-biden-int-tab">TABLE 7.9: </span>Logistic regression output for predicting voting for Biden given early voting behavior and income - with interaction, ANES 2020</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -4437,60 +4463,115 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
   </thead>
   <tbody class="gt_table_body">
     <tr><td headers="term" class="gt_row gt_left">(Intercept)</td>
-<td headers="estimate" class="gt_row gt_right">−0.20</td>
-<td headers="std.error" class="gt_row gt_right">0.36</td>
-<td headers="statistic" class="gt_row gt_right">−0.55</td>
-<td headers="p.value" class="gt_row gt_right">0.5844</td></tr>
+<td headers="estimate" class="gt_row gt_right">2.32</td>
+<td headers="std.error" class="gt_row gt_right">0.67</td>
+<td headers="statistic" class="gt_row gt_right">3.45</td>
+<td headers="p.value" class="gt_row gt_right">0.0015</td></tr>
     <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes</td>
-<td headers="estimate" class="gt_row gt_right">0.38</td>
-<td headers="std.error" class="gt_row gt_right">0.47</td>
-<td headers="statistic" class="gt_row gt_right">0.80</td>
-<td headers="p.value" class="gt_row gt_right">0.4277</td></tr>
-    <tr><td headers="term" class="gt_row gt_left">GenderFemale</td>
-<td headers="estimate" class="gt_row gt_right">0.76</td>
-<td headers="std.error" class="gt_row gt_right">0.54</td>
-<td headers="statistic" class="gt_row gt_right">1.42</td>
-<td headers="p.value" class="gt_row gt_right">0.1625</td></tr>
-    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:GenderFemale</td>
-<td headers="estimate" class="gt_row gt_right">0.27</td>
-<td headers="std.error" class="gt_row gt_right">0.60</td>
-<td headers="statistic" class="gt_row gt_right">0.45</td>
-<td headers="p.value" class="gt_row gt_right">0.6583</td></tr>
+<td headers="estimate" class="gt_row gt_right">−0.81</td>
+<td headers="std.error" class="gt_row gt_right">0.78</td>
+<td headers="statistic" class="gt_row gt_right">−1.03</td>
+<td headers="p.value" class="gt_row gt_right">0.3081</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$20k to &lt; 40k</td>
+<td headers="estimate" class="gt_row gt_right">−2.33</td>
+<td headers="std.error" class="gt_row gt_right">0.87</td>
+<td headers="statistic" class="gt_row gt_right">−2.68</td>
+<td headers="p.value" class="gt_row gt_right">0.0113</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$40k to &lt; 60k</td>
+<td headers="estimate" class="gt_row gt_right">−1.67</td>
+<td headers="std.error" class="gt_row gt_right">0.89</td>
+<td headers="statistic" class="gt_row gt_right">−1.87</td>
+<td headers="p.value" class="gt_row gt_right">0.0700</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$60k to &lt; 80k</td>
+<td headers="estimate" class="gt_row gt_right">−2.05</td>
+<td headers="std.error" class="gt_row gt_right">1.05</td>
+<td headers="statistic" class="gt_row gt_right">−1.96</td>
+<td headers="p.value" class="gt_row gt_right">0.0580</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$80k to &lt; 100k</td>
+<td headers="estimate" class="gt_row gt_right">−3.42</td>
+<td headers="std.error" class="gt_row gt_right">1.12</td>
+<td headers="statistic" class="gt_row gt_right">−3.06</td>
+<td headers="p.value" class="gt_row gt_right">0.0043</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$100k to &lt; 125k</td>
+<td headers="estimate" class="gt_row gt_right">−2.33</td>
+<td headers="std.error" class="gt_row gt_right">1.07</td>
+<td headers="statistic" class="gt_row gt_right">−2.17</td>
+<td headers="p.value" class="gt_row gt_right">0.0368</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">Income7$125k or more</td>
+<td headers="estimate" class="gt_row gt_right">−2.09</td>
+<td headers="std.error" class="gt_row gt_right">0.92</td>
+<td headers="statistic" class="gt_row gt_right">−2.28</td>
+<td headers="p.value" class="gt_row gt_right">0.0289</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$20k to &lt; 40k</td>
+<td headers="estimate" class="gt_row gt_right">1.60</td>
+<td headers="std.error" class="gt_row gt_right">0.95</td>
+<td headers="statistic" class="gt_row gt_right">1.69</td>
+<td headers="p.value" class="gt_row gt_right">0.1006</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$40k to &lt; 60k</td>
+<td headers="estimate" class="gt_row gt_right">0.99</td>
+<td headers="std.error" class="gt_row gt_right">1.00</td>
+<td headers="statistic" class="gt_row gt_right">0.99</td>
+<td headers="p.value" class="gt_row gt_right">0.3289</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$60k to &lt; 80k</td>
+<td headers="estimate" class="gt_row gt_right">0.90</td>
+<td headers="std.error" class="gt_row gt_right">1.14</td>
+<td headers="statistic" class="gt_row gt_right">0.79</td>
+<td headers="p.value" class="gt_row gt_right">0.4373</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$80k to &lt; 100k</td>
+<td headers="estimate" class="gt_row gt_right">3.22</td>
+<td headers="std.error" class="gt_row gt_right">1.16</td>
+<td headers="statistic" class="gt_row gt_right">2.78</td>
+<td headers="p.value" class="gt_row gt_right">0.0087</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$100k to &lt; 125k</td>
+<td headers="estimate" class="gt_row gt_right">1.64</td>
+<td headers="std.error" class="gt_row gt_right">1.11</td>
+<td headers="statistic" class="gt_row gt_right">1.48</td>
+<td headers="p.value" class="gt_row gt_right">0.1492</td></tr>
+    <tr><td headers="term" class="gt_row gt_left">EarlyVote2020Yes:Income7$125k or more</td>
+<td headers="estimate" class="gt_row gt_right">1.00</td>
+<td headers="std.error" class="gt_row gt_right">1.14</td>
+<td headers="statistic" class="gt_row gt_right">0.88</td>
+<td headers="p.value" class="gt_row gt_right">0.3867</td></tr>
   </tbody>
   
   
 </table>
 </div>
-<p>The results from the interaction model show that the interaction between early voting behavior and gender is significant. To better understand what this interaction means, we will want to plot the predicted probabilities with an interaction plot. Let’s first obtain the predicted probabilities for each possible combination of variables using the <code>augment()</code> function.</p>
+<p>The results from the interaction model (see Table <a href="c07-modeling.html#tab:model-logisticexamp-biden-int-tab">7.9</a>) show that one interaction term between early voting behavior and income (for the $80k to &lt; 100k group) is significant. To better understand what this interaction means, we can plot the predicted probabilities with an interaction plot. Let’s first obtain the predicted probabilities for each possible combination of variables using the <code>augment()</code> function.</p>
 <div class="sourceCode" id="cb248"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb248-1"><a href="c07-modeling.html#cb248-1" tabindex="-1"></a>log_biden_pred <span class="ot">&lt;-</span> log_biden_int <span class="sc">%&gt;%</span></span>
 <span id="cb248-2"><a href="c07-modeling.html#cb248-2" tabindex="-1"></a>  <span class="fu">augment</span>(<span class="at">type.predict =</span> <span class="st">&quot;response&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb248-3"><a href="c07-modeling.html#cb248-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)), </span>
+<span id="cb248-3"><a href="c07-modeling.html#cb248-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)),</span>
 <span id="cb248-4"><a href="c07-modeling.html#cb248-4" tabindex="-1"></a>         <span class="at">.fitted =</span> <span class="fu">as.numeric</span>(.fitted)) <span class="sc">%&gt;%</span></span>
-<span id="cb248-5"><a href="c07-modeling.html#cb248-5" tabindex="-1"></a>  <span class="fu">select</span>(VoteBiden, EarlyVote2020, Gender, .fitted, .se.fit) </span></code></pre></div>
-<p>To create an interaction plot, the y-axis will be the predicted probabilities, and one of our x-variables will be on the x-axis and the other will be represented by multiple lines. Figure <a href="c07-modeling.html#fig:model-logisticexamp-biden-plot">7.4</a> shows the interaction plot with gender on the x-axis and early voting behavior represented by the lines.</p>
-<div class="sourceCode" id="cb249"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb249-1"><a href="c07-modeling.html#cb249-1" tabindex="-1"></a>log_biden_pred <span class="sc">%&gt;%</span> </span>
-<span id="cb249-2"><a href="c07-modeling.html#cb249-2" tabindex="-1"></a>  <span class="fu">filter</span>(VoteBiden<span class="sc">==</span><span class="dv">1</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb249-3"><a href="c07-modeling.html#cb249-3" tabindex="-1"></a>  <span class="fu">distinct</span>() <span class="sc">%&gt;%</span> </span>
-<span id="cb249-4"><a href="c07-modeling.html#cb249-4" tabindex="-1"></a>  <span class="fu">arrange</span>(Gender, EarlyVote2020) <span class="sc">%&gt;%</span> </span>
-<span id="cb249-5"><a href="c07-modeling.html#cb249-5" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">EarlyVote2020 =</span> <span class="fu">fct_reorder2</span>(EarlyVote2020, Gender, .fitted)) <span class="sc">%&gt;%</span></span>
-<span id="cb249-6"><a href="c07-modeling.html#cb249-6" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> Gender, <span class="at">y =</span> .fitted, <span class="at">group =</span> EarlyVote2020,</span>
-<span id="cb249-7"><a href="c07-modeling.html#cb249-7" tabindex="-1"></a>             <span class="at">color =</span> EarlyVote2020, <span class="at">linetype =</span> EarlyVote2020)) <span class="sc">+</span></span>
-<span id="cb249-8"><a href="c07-modeling.html#cb249-8" tabindex="-1"></a>  <span class="fu">geom_line</span>(<span class="at">linewidth =</span> <span class="fl">1.1</span>) <span class="sc">+</span></span>
-<span id="cb249-9"><a href="c07-modeling.html#cb249-9" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">2</span>,<span class="dv">4</span>)]) <span class="sc">+</span></span>
-<span id="cb249-10"><a href="c07-modeling.html#cb249-10" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Predicted Probability of Voting for Biden&quot;</span>) <span class="sc">+</span></span>
-<span id="cb249-11"><a href="c07-modeling.html#cb249-11" tabindex="-1"></a>  <span class="fu">labs</span>(<span class="at">color=</span><span class="st">&quot;Voted Early&quot;</span>,</span>
-<span id="cb249-12"><a href="c07-modeling.html#cb249-12" tabindex="-1"></a>       <span class="at">linetype=</span><span class="st">&quot;Voted Early&quot;</span>) <span class="sc">+</span></span>
-<span id="cb249-13"><a href="c07-modeling.html#cb249-13" tabindex="-1"></a>  <span class="fu">coord_cartesian</span>(<span class="at">ylim=</span><span class="fu">c</span>(<span class="dv">0</span>,<span class="dv">1</span>)) <span class="sc">+</span></span>
-<span id="cb249-14"><a href="c07-modeling.html#cb249-14" tabindex="-1"></a>  <span class="fu">guides</span>(<span class="at">fill =</span> <span class="st">&quot;none&quot;</span>) <span class="sc">+</span></span>
-<span id="cb249-15"><a href="c07-modeling.html#cb249-15" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<span id="cb248-5"><a href="c07-modeling.html#cb248-5" tabindex="-1"></a>  <span class="fu">select</span>(VoteBiden, EarlyVote2020, Income7, .fitted, .se.fit) </span></code></pre></div>
+<p>The y-axis is the predicted probabilities, one of our x-variables is on the x-axis, and the other is represented by multiple lines. Figure <a href="c07-modeling.html#fig:model-logisticexamp-biden-plot">7.4</a> shows the interaction plot with early voting behavior on the x-axis and income represented by the lines.</p>
+<div class="sourceCode" id="cb249"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb249-1"><a href="c07-modeling.html#cb249-1" tabindex="-1"></a>log_biden_pred <span class="sc">%&gt;%</span></span>
+<span id="cb249-2"><a href="c07-modeling.html#cb249-2" tabindex="-1"></a>  <span class="fu">filter</span>(VoteBiden <span class="sc">==</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb249-3"><a href="c07-modeling.html#cb249-3" tabindex="-1"></a>  <span class="fu">distinct</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb249-4"><a href="c07-modeling.html#cb249-4" tabindex="-1"></a>  <span class="fu">arrange</span>(EarlyVote2020, Income7) <span class="sc">%&gt;%</span></span>
+<span id="cb249-5"><a href="c07-modeling.html#cb249-5" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(</span>
+<span id="cb249-6"><a href="c07-modeling.html#cb249-6" tabindex="-1"></a>    <span class="at">x =</span> EarlyVote2020,</span>
+<span id="cb249-7"><a href="c07-modeling.html#cb249-7" tabindex="-1"></a>    <span class="at">y =</span> .fitted,</span>
+<span id="cb249-8"><a href="c07-modeling.html#cb249-8" tabindex="-1"></a>    <span class="at">group =</span> Income7,</span>
+<span id="cb249-9"><a href="c07-modeling.html#cb249-9" tabindex="-1"></a>    <span class="at">color =</span> Income7,</span>
+<span id="cb249-10"><a href="c07-modeling.html#cb249-10" tabindex="-1"></a>    <span class="at">linetype =</span> Income7</span>
+<span id="cb249-11"><a href="c07-modeling.html#cb249-11" tabindex="-1"></a>  )) <span class="sc">+</span></span>
+<span id="cb249-12"><a href="c07-modeling.html#cb249-12" tabindex="-1"></a>  <span class="fu">geom_line</span>(<span class="at">linewidth =</span> <span class="fl">1.1</span>) <span class="sc">+</span></span>
+<span id="cb249-13"><a href="c07-modeling.html#cb249-13" tabindex="-1"></a>  <span class="fu">scale_color_manual</span>(<span class="at">values =</span> <span class="fu">colorRampPalette</span>(book_colors)(<span class="dv">7</span>)) <span class="sc">+</span></span>
+<span id="cb249-14"><a href="c07-modeling.html#cb249-14" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Predicted Probability of Voting for Biden&quot;</span>) <span class="sc">+</span></span>
+<span id="cb249-15"><a href="c07-modeling.html#cb249-15" tabindex="-1"></a>  <span class="fu">labs</span>(<span class="at">x =</span> <span class="st">&quot;Voted Early&quot;</span>,</span>
+<span id="cb249-16"><a href="c07-modeling.html#cb249-16" tabindex="-1"></a>       <span class="at">color =</span> <span class="st">&quot;Income&quot;</span>,</span>
+<span id="cb249-17"><a href="c07-modeling.html#cb249-17" tabindex="-1"></a>       <span class="at">linetype =</span> <span class="st">&quot;Income&quot;</span>) <span class="sc">+</span></span>
+<span id="cb249-18"><a href="c07-modeling.html#cb249-18" tabindex="-1"></a>  <span class="fu">coord_cartesian</span>(<span class="at">ylim =</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>)) <span class="sc">+</span></span>
+<span id="cb249-19"><a href="c07-modeling.html#cb249-19" tabindex="-1"></a>  <span class="fu">guides</span>(<span class="at">fill =</span> <span class="st">&quot;none&quot;</span>) <span class="sc">+</span></span>
+<span id="cb249-20"><a href="c07-modeling.html#cb249-20" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:model-logisticexamp-biden-plot"></span>
-<img src="bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png" alt="Line plot with x-axis as Male and Female (left to right) and y-axis as 'Predicted Probability of Voting for Biden'. There are two lines for early voting indicators with lines being from top to bottom: Did Not Early Vote and Did Early Vote. The line representing did not early vote is roughly parallel with similar predicted probabilities between males and females. For those who did early vote, females have higher predicted probability of voting for Biden than males." width="672" />
+<img src="bookdown_files/figure-html/model-logisticexamp-biden-plot-1.png" alt="Line plot with x-axis as indicator for voted early, with did not early vote on the left and did early vote on the right, and y-axis as 'Predicted Probability of Voting for Biden'. There are seven lines, one per income group, ordered from top to bottom: Under $20k, $80k to less than $100k, $40k to less than $60k, $100k to less than $125k, $20k to less than $40k, $125k or more, and $60k to less than $80k. The lines for $40k to less than $60k, $60k to less than $80k, and $125k or more are all relatively flat with the probabilities for did not early vote and did early vote being equivalent. The lines for $20k to less than $40k and $100k to less than $125k have a slight positive slope. The line for under $20k has a slight negative slope and has overall the highest probability for both levels of early voting. The line for $80k to less than $100k has a large positive slope. This line shows the lowest probability for those who did not early vote, and the second highest probability for those who did early vote." width="672" />
 <p class="caption">
-FIGURE 7.4: Interaction Plot of Gender and Early Voting Predicting the Probability of Voting for Biden
+FIGURE 7.4: Interaction Plot of Early Voting and Income Predicting the Probability of Voting for Biden
 </p>
 </div>
-<p>From this plot we can see that respondents who indicated a male gender had roughly the same probability of voting for Biden regardless of if they voted early or not. However, females who voted early were more likely to vote for Biden if they voted early than if they did not vote early.</p>
-<p>Interactions in models can be difficult to understand from the coefficients alone. Using these interaction plots can help others understand the nuances of the results, and often can become even more helpful with more than two levels in a given factor (e.g., education or race/ethnicity).</p>
+<p>From Figure <a href="c07-modeling.html#fig:model-logisticexamp-biden-plot">7.4</a>, we can see that people who have incomes in most groups (e.g., $40k to &lt; 60k) have roughly the same probability of voting for Biden regardless of whether they voted early or not. However, those with income in the $80k to &lt; 100k group (the one significant interaction term) and, to a lesser degree, the $100k to &lt; 125k group were more likely to vote for Biden if they voted early than if they did not vote early.</p>
+<p>Interactions in models can be difficult to understand from the coefficients alone. Using these interaction plots can help others understand the nuances of the results.</p>
 </div>
 </div>
 </div>
@@ -4498,7 +4579,7 @@ <h4>Example 2: Interaction effects<a href="c07-modeling.html#example-2-interacti
 <h2><span class="header-section-number">7.5</span> Exercises<a href="c07-modeling.html#exercises-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
 <li><p>The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (<code>HousingUnitType</code>) and total energy expenditure (<code>TOTALDOL</code>)? First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common.</p></li>
-<li><p>Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0. While a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer <span class="citation">(<a href="#ref-eia-cdd">U.S. Energy Information Administration 2023d</a>)</span>. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.</p></li>
+<li><p>Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer <span class="citation">(<a href="#ref-eia-cdd">U.S. Energy Information Administration 2023d</a>)</span>. The daily values are summed over the year to indicate how hot the place is throughout the year. Similarly, HDD65 measures how much colder than 65°F the days are. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.</p></li>
 <li><p>Continuing with our results from question 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures.</p></li>
 <li><p>Early voting expanded in 2020 <span class="citation">(<a href="#ref-npr-voting-trend">Sprunt 2020</a>)</span>. Build a logistic model predicting early voting in 2020 (<code>EarlyVote2020</code>) using age (<code>Age</code>), education (<code>Education</code>), and party identification (<code>PartyID</code>). Include two-way interactions.</p></li>
 <li><p>Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican.</p></li>
@@ -4543,7 +4624,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <hr />
 <ol start="23">
 <li id="fn23"><p>Use <code>help(formula)</code> or <code>?formula</code> in R<a href="c07-modeling.html#fnref23" class="footnote-back">↩︎</a></p></li>
-<li id="fn24"><p>There is some debate about whether weights should be used in regression <span class="citation">(<a href="#ref-bollen2016weightsreg">Bollen et al. 2016</a>; <a href="#ref-gelman2007weights">Gelman 2007</a>)</span>. However, for the purposes of providing complete information on how to analyze complex survey data, this chapter will include weights.<a href="c07-modeling.html#fnref24" class="footnote-back">↩︎</a></p></li>
+<li id="fn24"><p>There is some debate about whether weights should be used in regression <span class="citation">(<a href="#ref-bollen2016weightsreg">Bollen et al. 2016</a>; <a href="#ref-gelman2007weights">Gelman 2007</a>)</span>. However, for the purposes of providing complete information on how to analyze complex survey data, this chapter includes weights.<a href="c07-modeling.html#fnref24" class="footnote-back">↩︎</a></p></li>
 <li id="fn25"><p>Question: How often can you trust the federal government in Washington to do what is right?<a href="c07-modeling.html#fnref25" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
diff --git a/c08-communicating-results.html b/c08-communicating-results.html
index f368aa69..8ae67eaa 100644
--- a/c08-communicating-results.html
+++ b/c08-communicating-results.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -530,7 +530,7 @@ <h3>Prerequisites<a href="c08-communicating-results.html#prereq8" class="anchor-
 <span id="cb250-4"><a href="c08-communicating-results.html#cb250-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
 <span id="cb250-5"><a href="c08-communicating-results.html#cb250-5" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
 <span id="cb250-6"><a href="c08-communicating-results.html#cb250-6" tabindex="-1"></a><span class="fu">library</span>(gtsummary)</span></code></pre></div>
-<p>We will be using data from ANES as described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information).</p>
+<p>We are using data from ANES as described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design object to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information.)</p>
 <div class="sourceCode" id="cb251"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb251-1"><a href="c08-communicating-results.html#cb251-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
 <span id="cb251-2"><a href="c08-communicating-results.html#cb251-2" tabindex="-1"></a></span>
 <span id="cb251-3"><a href="c08-communicating-results.html#cb251-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
@@ -549,52 +549,52 @@ <h2><span class="header-section-number">8.1</span> Introduction<a href="c08-comm
 <p>After finishing the analysis and modeling, we proceed to the important task of communicating the survey results. Our audience may range from seasoned researchers familiar with our survey data to newcomers encountering the information for the first time. We should aim to explain the methodology and analysis while presenting findings in an accessible way, and it is our responsibility to report information with care.</p>
 <p>Before beginning any dissemination of results, consider questions such as:</p>
 <ul>
-<li>How will we present results? Examples include a website, print, or other media. Based on the media type, we might limit or enhance the use of graphical representation.</li>
-<li>What is the audience’s familiarity with the study and/or data? Audiences can range from the general public to data experts. If we anticipate limited knowledge about the study, we should provide detailed descriptions (we discuss recommendations later in the chapter).</li>
-<li>What are we trying to communicate? It could be summary statistics, trends, patterns, or other insights. Tables might suit summary statistics, while plots are better at conveying trends and patterns.</li>
+<li>How are we presenting results? Examples include a website, print, or other media. Based on the media type, we might limit or enhance the use of graphical representation.</li>
+<li>What is the audience’s familiarity with the study and/or data? Audiences can range from the general public to data experts. If we anticipate limited knowledge about the study, we should provide detailed descriptions (we discuss recommendations later in the chapter.)</li>
+<li>What are we trying to communicate? It could be summary statistics, trends, patterns, or other insights. Tables may suit summary statistics, while plots are better at conveying trends and patterns.</li>
 <li>Is the audience accustomed to interpreting plots? If not, include explanatory text to guide them on how to interpret the plots effectively.</li>
 <li>What is the audience’s statistical knowledge? If the audience does not have a strong statistics background, provide text on standard errors, confidence intervals, and other estimate types to enhance understanding.</li>
 </ul>
 </div>
 <div id="describing-results-through-text" class="section level2 hasAnchor" number="8.2">
 <h2><span class="header-section-number">8.2</span> Describing results through text<a href="c08-communicating-results.html#describing-results-through-text" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>As analysts, our emphasis is often on the data, and communicating results can sometimes be overlooked. First, we need to identify the appropriate information to share with our audience. Chapters <a href="c02-overview-surveys.html#c02-overview-surveys">2</a> and <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> provide insights into factors we need to consider during analysis, and they remain relevant when presenting results to others.</p>
+<p>As analysts, we often emphasize the data, and communicating results can sometimes be overlooked. To be effective communicators, we need to identify the appropriate information to share with our audience. Chapters <a href="c02-overview-surveys.html#c02-overview-surveys">2</a> and <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> provide insights into factors we need to consider during analysis, and they remain relevant when presenting results to others.</p>
 <div id="methodology-1" class="section level3 hasAnchor" number="8.2.1">
 <h3><span class="header-section-number">8.2.1</span> Methodology<a href="c08-communicating-results.html#methodology-1" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>If we are using existing data, methodologically-sound surveys will provide documentation about how the survey was fielded, the questionnaires, and other necessary information for analyses. For example, the survey’s methodology reports should include the population of interest, sampling procedures, response rates, questionnaire documentation, weighting, and a general overview of disclosure statements. Many American organizations follow the American Association for Public Opinion Research’s (AAPOR) <a href="https://aapor.org/standards-and-ethics/transparency-initiative">Transparency Initiative</a>. The AAPOR Transparency Initiative requires organizations to include specific details in their methodology, making it clear how we can and should analyze the results. Being transparent about these methods is vital for the scientific rigor of the field.</p>
+<p>If we are using existing data, methodologically-sound surveys provide documentation about how the survey was fielded, the questionnaires, and other necessary information for analyses. For example, the survey’s methodology reports should include the population of interest, sampling procedures, response rates, questionnaire documentation, weighting, and a general overview of disclosure statements. Many American organizations follow the American Association for Public Opinion Research’s (AAPOR) <a href="https://aapor.org/standards-and-ethics/transparency-initiative">Transparency Initiative.</a> The AAPOR Transparency Initiative requires organizations to include specific details in their methodology, making it clear how we can and should analyze and interpret the results. Being transparent about these methods is vital for the scientific rigor of the field.</p>
 <p>The details provided in Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a> about the survey process should be shared with the audience when presenting the results. When using publicly-available data, like the examples in this book, we can often link to the methodology report in our final output. We should also provide high-level information for the audience to quickly grasp the context around the findings. For example, we can mention when and where the study was conducted, the population’s age range, or other contextual details. This information helps the audience understand how generalizable the results are.</p>
-<p>Providing this material is especially important when there’s no methodology report available for the analyzed data. For example, if a researcher conducted a new survey for a specific purpose, we should document and present all the pertinent information during the analysis and reporting process. Adhering to the AAPOR Transparency Initiative guidelines is a reliable method to guarantee that all essential information is communicated to the audience.</p>
+<p>Providing this material is especially important when no methodology report is available for the analyzed data. For example, if we conducted a new survey for a specific purpose, we should document and present all the pertinent information during the analysis and reporting process. Adhering to the AAPOR Transparency Initiative guidelines is a reliable method to guarantee that all essential information is communicated to the audience.</p>
 </div>
 <div id="analysis" class="section level3 hasAnchor" number="8.2.2">
 <h3><span class="header-section-number">8.2.2</span> Analysis<a href="c08-communicating-results.html#analysis" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Along with the survey methodology and weight calculations, we should also share our approach to preparing, cleaning, and analyzing the data. For example, in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, we compared education distributions from the ANES survey to the American Community Survey (ACS). To make the comparison, we had to collapse education categories provided in the ANES data to match the ACS. The process for this particular example may seem straightforward (like combining Bachelor’s and Graduate Degrees into a single category), but there are multiple ways to deal with the data. Our choice is just one of many. We should document both the original ANES question and response options and the steps we took to match it with ACS data. This transparency helps clarify our analysis to our audience.</p>
-<p>Missing data is another instance where we want to be unambigious and upfront with our audience. In this book, numerous examples and exercises remove missing data, as this is often the easiest way to handle them. However, there are circumstances where missing data holds substantive importance, and excluding them could introduce bias (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>). Being transparent about our handling of missing data is important to maintaining the integrity of our analysis and ensuring a comprehensive understanding of the results.</p>
+<p>Along with the survey methodology and weight calculations, we should also share our approach to preparing, cleaning, and analyzing the data. For example, in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, we compared education distributions from the ANES survey to the American Community Survey (ACS.) To make the comparison, we had to collapse the education categories provided in the ANES data to match the ACS. The process for this particular example may seem straightforward (like combining Bachelor’s and Graduate Degrees into a single category), but there are multiple ways to deal with the data. Our choice is just one of many. We should document both the original ANES question and response options and the steps we took to match them with ACS data. This transparency helps clarify our analysis to our audience.</p>
+<p>Missing data are another instance where we want to be unambiguous and upfront with our audience. In this book, numerous examples and exercises remove missing data, as this is often the easiest way to handle them. However, there are circumstances where missing data hold substantive importance, and excluding them could introduce bias (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>.) Being transparent about our handling of missing data is important to maintaining the integrity of our analysis and ensuring a comprehensive understanding of the results.</p>
 </div>
 <div id="results" class="section level3 hasAnchor" number="8.2.3">
 <h3><span class="header-section-number">8.2.3</span> Results<a href="c08-communicating-results.html#results" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>While tables and graphs are commonly used to communicate results, there are instances where text can be more effective in sharing information. Narrative details, such as context around point estimates or model coefficients, can go a long way in improving our communication. We have several strategies to effectively convey the significance of the data to the audience through text.</p>
-<p>First, we can highlight important data points in a sentence using plain language. For example, if we were looking at election polling data conducted before an election, we could say something like:</p>
+<p>First, we can highlight important data elements in a sentence using plain language. For example, if we were looking at election polling data conducted before an election, we could say:</p>
 <blockquote>
 <p>As of [DATE], an estimated XX% of registered U.S. voters say they will vote for [CANDIDATE NAME] for president in the [YEAR] general election.</p>
 </blockquote>
 <p>This sentence provides key pieces of information in a straightforward way:</p>
 <ol style="list-style-type: decimal">
-<li><strong>[DATE]</strong>: Given that polling data is time-specific, providing the date of reference lets the audience know when this data was valid.</li>
-<li><strong>Registered U.S. voters</strong>: This tells the audience who we surveyed, letting them know the target population.</li>
+<li><strong>[DATE]</strong>: Given that polling data are time-specific, providing the date of reference lets the audience know when these data were valid.</li>
+<li><strong>Registered U.S. voters</strong>: This tells the audience who we surveyed, letting them know the population of interest.</li>
 <li><strong>XX%</strong>: This part provides the estimated percentage of people voting for a specific candidate for a specific office.</li>
 <li><strong>[YEAR] general election</strong>: As with the bullet above, adding this gives more context about the election type and year. The estimate would take on a different meaning if we changed it to a <em>primary</em> election instead of a <em>general</em> election.</li>
 </ol>
 <p>We also included the word “estimated.” When presenting aggregate survey results, we have errors around each estimate. We want to convey this uncertainty rather than talk in absolutes. Words like “estimated,” “on average,” or “around” can help communicate this uncertainty to the audience. Instead of saying ‘XX%,’ we can also say ‘XX% (+/- Y%)’ to show the margin of error. Confidence intervals can also be incorporated into the text to assist readers.</p>
-<p>Second, providing context and discussing the <em>meaning</em> behind a point estimate can help the audience glean some insight into why the data is important. For example, when comparing two values, it can be helpful to highlight if there are statistically significant differences and explain the impact and relevance of this information. This is where we, as analysts, should to do our best to be mindful of biases and present the facts logically.</p>
-<p>Keep in mind how we discuss these findings can greatly influence how the audience interprets them. If we include speculation, using phrases like “the authors speculate” or “these findings may indicate” relays the uncertainty around the notion while still lending a plausible solution. Additionally, we can present alternative viewpoints or competing discussion points to explain the uncertainty in the results.</p>
+<p>Second, providing context and discussing the <em>meaning</em> behind a point estimate can help the audience glean some insight into why the data are important. For example, when comparing two values, it can be helpful to highlight if there are statistically significant differences and explain the impact and relevance of this information. This is where we should do our best to be mindful of biases and present the facts logically.</p>
+<p>Keep in mind that how we discuss these findings can greatly influence how the audience interprets them. If we include speculation, phrases like “the authors speculate” or “these findings may indicate” relay the uncertainty around the notion while still offering a plausible explanation. Additionally, we can present alternative viewpoints or competing discussion points to explain the uncertainty in the results.</p>
 </div>
 </div>
 <div id="visualizing-data" class="section level2 hasAnchor" number="8.3">
 <h2><span class="header-section-number">8.3</span> Visualizing data<a href="c08-communicating-results.html#visualizing-data" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Although discussing key findings in the text is important, presenting large amounts of data is often more digestible for the audience in tables or visualizations. Effectively combining text, tables, and graphs can be powerful in communicating results. This section provides examples of using the {gt}, {gtsummary}, and {ggplot2} packages to enhance the dissemination of results <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>; <a href="#ref-gtsummary">Sjoberg et al. 2021</a>; <a href="#ref-ggplot22016">Wickham 2016</a>)</span>.</p>
+<p>Although discussing key findings in the text is important, presenting large amounts of data in tables or visualizations is often more digestible for the audience. Effectively combining text, tables, and graphs can be powerful in communicating results. This section provides examples of using the {gt}, {gtsummary}, and {ggplot2} packages to enhance the dissemination of results <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>; <a href="#ref-gtsummarysjo">Sjoberg et al. 2021</a>; <a href="#ref-ggplot2wickham">Wickham 2016</a>)</span>.</p>
 <div id="tables" class="section level3 hasAnchor" number="8.3.1">
 <h3><span class="header-section-number">8.3.1</span> Tables<a href="c08-communicating-results.html#tables" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Tables are a great way to provide a large amount of data when individual data points need to be examined. However, it is important to present tables in a reader-friendly format. Numbers should align, rows and columns should be easy to follow, and the table size should not compromise readability. Using key visualization techniques, we can create tables that are informative and nice to look at. Many packages create easy-to-read tables (e.g., {kable} + {kableExtra}, {gt}, {gtsummary}, {DT}, {formattable}, {flextable}, {reactable}). While we will focus on {gt} here, we encourage learning about others as they may have additional helpful features. We appreciate the flexibility, ability to use pipes (e.g., <code>%&gt;%</code>), and numerous extensions of the {gt} package. Please note, at this time, {gtsummary} needs additional features to be widely used for survey analysis, particularly due to its lack of ability to work with replicate designs. We provide one example using {gtsummary} and hope it evolves into a more comprehensive tool over time.</p>
+<p>Tables are a great way to provide a large amount of data when individual data points need to be examined. However, it is important to present tables in a reader-friendly format. Numbers should align, rows and columns should be easy to follow, and the table size should not compromise readability. Using key visualization techniques, we can create tables that are informative and nice to look at. Many packages create easy-to-read tables (e.g., {kable} + {kableExtra}, {gt}, {gtsummary}, {DT}, {formattable}, {flextable}, {reactable}.) We appreciate the flexibility, ability to use pipes (e.g., <code>%&gt;%</code>), and numerous extensions of the {gt} package. While we focus on {gt} here, we encourage learning about others as they may have additional helpful features. Please note, at this time, {gtsummary} needs additional features to be widely used for survey analysis, particularly because it cannot work with replicate designs. We provide one example using {gtsummary} and hope it evolves into a more comprehensive tool over time.</p>
 <div id="results-gt" class="section level4 hasAnchor" number="8.3.1.1">
 <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} output to a {gt} table<a href="c08-communicating-results.html#results-gt" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Let’s start by using some of the data we calculated earlier in this book. In Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>, we looked at data on trust in government with the proportions calculated below:</p>
@@ -613,8 +613,8 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
 ## 4 Some of the time         0.434         0.00855
 ## 5 Never                    0.110         0.00566</code></pre>
 <p>The default output generated by R may work for initial viewing inside our IDE or when creating basic output in an R Markdown or Quarto document. However, when presenting these results in other publications, such as the print version of this book or with other formal dissemination modes, modifying the display can improve our reader’s experience.</p>
-<p>Looking at the output from <code>trust_gov</code>, a couple of improvements are obvious: (1) switching to percentages instead of proportions and (2) using the variable names as column headers. The {gt} package is a good tool for implementing better labeling and creating publishable tables. Let’s walk through some code as we make a few changes to improve the table’s usefulness.</p>
-<p>First, we initiate the table with the <code>gt()</code> function. Next, we use the argument <code>rowname_col()</code> to designate the <code>TrustGovernment</code> column as the labels for each row (called the table “stub”). We apply the <code>cols_label()</code> function to create informative column labels instead of variable names, and then the <code>tab_spanner()</code> function to add a label across multiple columns. In this case, we label all columns except the stub with “Trust in Government, 2020”. We then format the proportions into percentages with the <code>fmt_percent()</code> function and reduce the number of decimals shown with <code>decimals = 1</code>. Finally, the <code>tab_caption()</code> function adds a table title for HTML version of the book. We can use the caption for cross-referencing in R Markdown, Quarto, and bookdown, as well as adding it to the list of tables in the book.</p>
+<p>Looking at the output from <code>trust_gov</code>, a couple of improvements stand out: (1) switching to percentages instead of proportions and (2) removing the variable names as column headers. The {gt} package is a good tool for implementing better labeling and creating publishable tables. Let’s walk through some code as we make a few changes to improve the table’s usefulness.</p>
+<p>First, we initiate the formatted table with the <code>gt()</code> function on the <code>trust_gov</code> tibble previously created. Next, we use the argument <code>rowname_col</code> to designate the <code>TrustGovernment</code> column as the label for each row (called the table “stub”.) We apply the <code>cols_label()</code> function to create informative column labels instead of variable names and then the <code>tab_spanner()</code> function to add a label across multiple columns. In this case, we label all columns except the stub with “Trust in Government, 2020”. We then format the proportions into percentages with the <code>fmt_percent()</code> function and reduce the number of decimals shown to one with <code>decimals = 1</code>. Finally, the <code>tab_caption()</code> function adds a table title for the HTML version of the book. We can use the caption for cross-referencing in R Markdown, Quarto, and bookdown, as well as adding it to the list of tables in the book. These changes are all seen in Table <a href="c08-communicating-results.html#tab:results-table-gt1-tab">8.1</a>.</p>
 <div class="sourceCode" id="cb254"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb254-1"><a href="c08-communicating-results.html#cb254-1" tabindex="-1"></a>trust_gov_gt <span class="ot">&lt;-</span> trust_gov <span class="sc">%&gt;%</span></span>
 <span id="cb254-2"><a href="c08-communicating-results.html#cb254-2" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;TrustGovernment&quot;</span>) <span class="sc">%&gt;%</span></span>
 <span id="cb254-3"><a href="c08-communicating-results.html#cb254-3" tabindex="-1"></a>  <span class="fu">cols_label</span>(<span class="at">trust_gov_p =</span> <span class="st">&quot;%&quot;</span>,</span>
@@ -625,23 +625,23 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
 <div class="sourceCode" id="cb255"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb255-1"><a href="c08-communicating-results.html#cb255-1" tabindex="-1"></a>trust_gov_gt <span class="sc">%&gt;%</span> </span>
 <span id="cb255-2"><a href="c08-communicating-results.html#cb255-2" tabindex="-1"></a>  <span class="fu">tab_caption</span>(<span class="st">&quot;Example of gt table with trust in government estimate&quot;</span>)</span></code></pre></div>
 
-<div id="eooentrgiy" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#eooentrgiy table {
+<div id="bdukvgbtnp" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#bdukvgbtnp table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#eooentrgiy thead, #eooentrgiy tbody, #eooentrgiy tfoot, #eooentrgiy tr, #eooentrgiy td, #eooentrgiy th {
+#bdukvgbtnp thead, #bdukvgbtnp tbody, #bdukvgbtnp tfoot, #bdukvgbtnp tr, #bdukvgbtnp td, #bdukvgbtnp th {
   border-style: none;
 }
 
-#eooentrgiy p {
+#bdukvgbtnp p {
   margin: 0;
   padding: 0;
 }
 
-#eooentrgiy .gt_table {
+#bdukvgbtnp .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -667,12 +667,12 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-left-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_caption {
+#bdukvgbtnp .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#eooentrgiy .gt_title {
+#bdukvgbtnp .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -684,7 +684,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-width: 0;
 }
 
-#eooentrgiy .gt_subtitle {
+#bdukvgbtnp .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -696,7 +696,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-top-width: 0;
 }
 
-#eooentrgiy .gt_heading {
+#bdukvgbtnp .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -708,13 +708,13 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_bottom_border {
+#bdukvgbtnp .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_col_headings {
+#bdukvgbtnp .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -729,7 +729,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_col_heading {
+#bdukvgbtnp .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -749,7 +749,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   overflow-x: hidden;
 }
 
-#eooentrgiy .gt_column_spanner_outer {
+#bdukvgbtnp .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -761,15 +761,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 4px;
 }
 
-#eooentrgiy .gt_column_spanner_outer:first-child {
+#bdukvgbtnp .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#eooentrgiy .gt_column_spanner_outer:last-child {
+#bdukvgbtnp .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#eooentrgiy .gt_column_spanner {
+#bdukvgbtnp .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -781,11 +781,11 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   width: 100%;
 }
 
-#eooentrgiy .gt_spanner_row {
+#bdukvgbtnp .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#eooentrgiy .gt_group_heading {
+#bdukvgbtnp .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -811,7 +811,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   text-align: left;
 }
 
-#eooentrgiy .gt_empty_group_heading {
+#bdukvgbtnp .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -826,15 +826,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   vertical-align: middle;
 }
 
-#eooentrgiy .gt_from_md > :first-child {
+#bdukvgbtnp .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#eooentrgiy .gt_from_md > :last-child {
+#bdukvgbtnp .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#eooentrgiy .gt_row {
+#bdukvgbtnp .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -853,7 +853,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   overflow-x: hidden;
 }
 
-#eooentrgiy .gt_stub {
+#bdukvgbtnp .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -866,7 +866,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#eooentrgiy .gt_stub_row_group {
+#bdukvgbtnp .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -880,15 +880,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   vertical-align: top;
 }
 
-#eooentrgiy .gt_row_group_first td {
+#bdukvgbtnp .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#eooentrgiy .gt_row_group_first th {
+#bdukvgbtnp .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#eooentrgiy .gt_summary_row {
+#bdukvgbtnp .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -898,16 +898,16 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#eooentrgiy .gt_first_summary_row {
+#bdukvgbtnp .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_first_summary_row.thick {
+#bdukvgbtnp .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#eooentrgiy .gt_last_summary_row {
+#bdukvgbtnp .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -917,7 +917,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_grand_summary_row {
+#bdukvgbtnp .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -927,7 +927,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#eooentrgiy .gt_first_grand_summary_row {
+#bdukvgbtnp .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -937,7 +937,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-top-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_last_grand_summary_row_top {
+#bdukvgbtnp .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -947,11 +947,11 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_striped {
+#bdukvgbtnp .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#eooentrgiy .gt_table_body {
+#bdukvgbtnp .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -960,7 +960,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_footnotes {
+#bdukvgbtnp .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -974,7 +974,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_footnote {
+#bdukvgbtnp .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -983,7 +983,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#eooentrgiy .gt_sourcenotes {
+#bdukvgbtnp .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -997,7 +997,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#eooentrgiy .gt_sourcenote {
+#bdukvgbtnp .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1005,63 +1005,63 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#eooentrgiy .gt_left {
+#bdukvgbtnp .gt_left {
   text-align: left;
 }
 
-#eooentrgiy .gt_center {
+#bdukvgbtnp .gt_center {
   text-align: center;
 }
 
-#eooentrgiy .gt_right {
+#bdukvgbtnp .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#eooentrgiy .gt_font_normal {
+#bdukvgbtnp .gt_font_normal {
   font-weight: normal;
 }
 
-#eooentrgiy .gt_font_bold {
+#bdukvgbtnp .gt_font_bold {
   font-weight: bold;
 }
 
-#eooentrgiy .gt_font_italic {
+#bdukvgbtnp .gt_font_italic {
   font-style: italic;
 }
 
-#eooentrgiy .gt_super {
+#bdukvgbtnp .gt_super {
   font-size: 65%;
 }
 
-#eooentrgiy .gt_footnote_marks {
+#bdukvgbtnp .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#eooentrgiy .gt_asterisk {
+#bdukvgbtnp .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#eooentrgiy .gt_indent_1 {
+#bdukvgbtnp .gt_indent_1 {
   text-indent: 5px;
 }
 
-#eooentrgiy .gt_indent_2 {
+#bdukvgbtnp .gt_indent_2 {
   text-indent: 10px;
 }
 
-#eooentrgiy .gt_indent_3 {
+#bdukvgbtnp .gt_indent_3 {
   text-indent: 15px;
 }
 
-#eooentrgiy .gt_indent_4 {
+#bdukvgbtnp .gt_indent_4 {
   text-indent: 20px;
 }
 
-#eooentrgiy .gt_indent_5 {
+#bdukvgbtnp .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1101,7 +1101,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   
 </table>
 </div>
-<p>We can add a few more enhancements, such as a title, a data source note, and a footnote with the question information, using the functions <code>tab_header()</code>, <code>tab_source_note()</code>, and <code>tab_footnote()</code>. If having the percentage sign in both the header and the cells seems redundant, we can opt for <code>fmt_number()</code> instead of <code>fmt_percent()</code> and scale the number by 100 with <code>scale_by = 100</code>.</p>
+<p>We can add a few more enhancements, such as a title (which is different from a caption<a href="#fn26" class="footnote-ref" id="fnref26"><sup>26</sup></a>), a data source note, and a footnote with the question information, using the functions <code>tab_header()</code>, <code>tab_source_note()</code>, and <code>tab_footnote()</code>. If having the percentage sign in both the header and the cells seems redundant, we can opt for <code>fmt_number()</code> instead of <code>fmt_percent()</code> and scale the number by 100 with <code>scale_by = 100</code>. The resulting table is displayed in Table <a href="c08-communicating-results.html#tab:results-table-gt2-tab">8.2</a>.</p>
 <div class="sourceCode" id="cb256"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb256-1"><a href="c08-communicating-results.html#cb256-1" tabindex="-1"></a>trust_gov_gt2 <span class="ot">&lt;-</span> trust_gov_gt <span class="sc">%&gt;%</span></span>
 <span id="cb256-2"><a href="c08-communicating-results.html#cb256-2" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;American voter&#39;s trust</span></span>
 <span id="cb256-3"><a href="c08-communicating-results.html#cb256-3" tabindex="-1"></a><span class="st">             in the federal government, 2020&quot;</span>) <span class="sc">%&gt;%</span></span>
@@ -1114,23 +1114,23 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
 <span id="cb256-10"><a href="c08-communicating-results.html#cb256-10" tabindex="-1"></a>             <span class="at">decimals =</span> <span class="dv">1</span>)</span></code></pre></div>
 <div class="sourceCode" id="cb257"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb257-1"><a href="c08-communicating-results.html#cb257-1" tabindex="-1"></a>trust_gov_gt2</span></code></pre></div>
 
-<div id="gqxxiknzou" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#gqxxiknzou table {
+<div id="enromajpnn" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#enromajpnn table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#gqxxiknzou thead, #gqxxiknzou tbody, #gqxxiknzou tfoot, #gqxxiknzou tr, #gqxxiknzou td, #gqxxiknzou th {
+#enromajpnn thead, #enromajpnn tbody, #enromajpnn tfoot, #enromajpnn tr, #enromajpnn td, #enromajpnn th {
   border-style: none;
 }
 
-#gqxxiknzou p {
+#enromajpnn p {
   margin: 0;
   padding: 0;
 }
 
-#gqxxiknzou .gt_table {
+#enromajpnn .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1156,12 +1156,12 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-left-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_caption {
+#enromajpnn .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#gqxxiknzou .gt_title {
+#enromajpnn .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1173,7 +1173,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-width: 0;
 }
 
-#gqxxiknzou .gt_subtitle {
+#enromajpnn .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1185,7 +1185,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-top-width: 0;
 }
 
-#gqxxiknzou .gt_heading {
+#enromajpnn .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1197,13 +1197,13 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_bottom_border {
+#enromajpnn .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_col_headings {
+#enromajpnn .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1218,7 +1218,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_col_heading {
+#enromajpnn .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1238,7 +1238,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   overflow-x: hidden;
 }
 
-#gqxxiknzou .gt_column_spanner_outer {
+#enromajpnn .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1250,15 +1250,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 4px;
 }
 
-#gqxxiknzou .gt_column_spanner_outer:first-child {
+#enromajpnn .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#gqxxiknzou .gt_column_spanner_outer:last-child {
+#enromajpnn .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#gqxxiknzou .gt_column_spanner {
+#enromajpnn .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1270,11 +1270,11 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   width: 100%;
 }
 
-#gqxxiknzou .gt_spanner_row {
+#enromajpnn .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#gqxxiknzou .gt_group_heading {
+#enromajpnn .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1300,7 +1300,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   text-align: left;
 }
 
-#gqxxiknzou .gt_empty_group_heading {
+#enromajpnn .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1315,15 +1315,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   vertical-align: middle;
 }
 
-#gqxxiknzou .gt_from_md > :first-child {
+#enromajpnn .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#gqxxiknzou .gt_from_md > :last-child {
+#enromajpnn .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#gqxxiknzou .gt_row {
+#enromajpnn .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1342,7 +1342,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   overflow-x: hidden;
 }
 
-#gqxxiknzou .gt_stub {
+#enromajpnn .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1355,7 +1355,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#gqxxiknzou .gt_stub_row_group {
+#enromajpnn .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1369,15 +1369,15 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   vertical-align: top;
 }
 
-#gqxxiknzou .gt_row_group_first td {
+#enromajpnn .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#gqxxiknzou .gt_row_group_first th {
+#enromajpnn .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#gqxxiknzou .gt_summary_row {
+#enromajpnn .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1387,16 +1387,16 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#gqxxiknzou .gt_first_summary_row {
+#enromajpnn .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_first_summary_row.thick {
+#enromajpnn .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#gqxxiknzou .gt_last_summary_row {
+#enromajpnn .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1406,7 +1406,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_grand_summary_row {
+#enromajpnn .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1416,7 +1416,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#gqxxiknzou .gt_first_grand_summary_row {
+#enromajpnn .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1426,7 +1426,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-top-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_last_grand_summary_row_top {
+#enromajpnn .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1436,11 +1436,11 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_striped {
+#enromajpnn .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#gqxxiknzou .gt_table_body {
+#enromajpnn .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1449,7 +1449,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-bottom-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_footnotes {
+#enromajpnn .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1463,7 +1463,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_footnote {
+#enromajpnn .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1472,7 +1472,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#gqxxiknzou .gt_sourcenotes {
+#enromajpnn .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1486,7 +1486,7 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   border-right-color: #D3D3D3;
 }
 
-#gqxxiknzou .gt_sourcenote {
+#enromajpnn .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1494,63 +1494,63 @@ <h4><span class="header-section-number">8.3.1.1</span> Transitioning {srvyr} out
   padding-right: 5px;
 }
 
-#gqxxiknzou .gt_left {
+#enromajpnn .gt_left {
   text-align: left;
 }
 
-#gqxxiknzou .gt_center {
+#enromajpnn .gt_center {
   text-align: center;
 }
 
-#gqxxiknzou .gt_right {
+#enromajpnn .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#gqxxiknzou .gt_font_normal {
+#enromajpnn .gt_font_normal {
   font-weight: normal;
 }
 
-#gqxxiknzou .gt_font_bold {
+#enromajpnn .gt_font_bold {
   font-weight: bold;
 }
 
-#gqxxiknzou .gt_font_italic {
+#enromajpnn .gt_font_italic {
   font-style: italic;
 }
 
-#gqxxiknzou .gt_super {
+#enromajpnn .gt_super {
   font-size: 65%;
 }
 
-#gqxxiknzou .gt_footnote_marks {
+#enromajpnn .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#gqxxiknzou .gt_asterisk {
+#enromajpnn .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#gqxxiknzou .gt_indent_1 {
+#enromajpnn .gt_indent_1 {
   text-indent: 5px;
 }
 
-#gqxxiknzou .gt_indent_2 {
+#enromajpnn .gt_indent_2 {
   text-indent: 10px;
 }
 
-#gqxxiknzou .gt_indent_3 {
+#enromajpnn .gt_indent_3 {
   text-indent: 15px;
 }
 
-#gqxxiknzou .gt_indent_4 {
+#enromajpnn .gt_indent_4 {
   text-indent: 20px;
 }
 
-#gqxxiknzou .gt_indent_5 {
+#enromajpnn .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1609,13 +1609,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <p>The {gtsummary} package simultaneously summarizes data and creates publication-ready tables. Initially designed for clinical trial data, it has been extended to include survey analysis in certain capacities. At this time, it is only compatible with survey objects using Taylor Series Linearization and not replicate methods. While it offers a restricted set of summary statistics, the following are available for categorical variables:</p>
 <ul>
 <li><code>{n}</code> frequency</li>
-<li><code>{N}</code> denominator, or cohort size</li>
-<li><code>{p}</code> percentage</li>
+<li><code>{N}</code> denominator, or respondent population</li>
+<li><code>{p}</code> proportion (stylized as a percentage by default)</li>
 <li><code>{p.std.error}</code> standard error of the sample proportion</li>
 <li><code>{deff}</code> design effect of the sample proportion</li>
 <li><code>{n_unweighted}</code> unweighted frequency</li>
 <li><code>{N_unweighted}</code> unweighted denominator</li>
-<li><code>{p_unweighted}</code> unweighted formatted percentage</li>
+<li><code>{p_unweighted}</code> unweighted formatted proportion (stylized as a percentage by default)</li>
 </ul>
 <p>The following summary statistics are available for continuous variables:</p>
 <ul>
@@ -1627,32 +1627,32 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <li><code>{var}</code> variance</li>
 <li><code>{min}</code> minimum</li>
 <li><code>{max}</code> maximum</li>
-<li><code>{p##}</code> any integer percentile, where <code>##</code> is an integer from 0 to 100</li>
+<li><code>{p#}</code> any integer percentile, where <code>#</code> is an integer from 0 to 100</li>
 <li><code>{sum}</code> sum</li>
 </ul>
-<p>In the following example, we will build a table using {gtsummary}, similar to the table in the {gt} example. The main function we use is <code>tbl_svysummary()</code>. In this function, we include the variables we want to analyze in the <code>include</code> argument and define the statistics we want to display in the <code>statistic</code> argument. To specify the statistics, we apply the syntax from the {glue} package, where we enclose the variables we want to insert within curly brackets. We must specify the desired statistics using the names listed above. For example, to specify that we want the proportion followed by the standard error of the proportion in parentheses, we use <code>{p} ({p.std.error})</code>.</p>
+<p>In the following example, we build a table using {gtsummary}, similar to the table in the {gt} example. The main function we use is <code>tbl_svysummary()</code>. In this function, we include the variables we want to analyze in the <code>include</code> argument and define the statistics we want to display in the <code>statistic</code> argument. To specify the statistics, we apply the syntax from the {glue} package, where we enclose the variables we want to insert within curly brackets. We must specify the desired statistics using the names listed above. For example, to specify that we want the proportion followed by the standard error of the proportion in parentheses, we use <code>{p} ({p.std.error})</code>. Table <a href="c08-communicating-results.html#tab:results-gts-ex-1-tab">8.3</a> displays the resulting table.</p>
 <div class="sourceCode" id="cb258"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb258-1"><a href="c08-communicating-results.html#cb258-1" tabindex="-1"></a>anes_des_gtsum <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb258-2"><a href="c08-communicating-results.html#cb258-2" tabindex="-1"></a>  <span class="fu">tbl_svysummary</span>(<span class="at">include =</span> TrustGovernment,</span>
 <span id="cb258-3"><a href="c08-communicating-results.html#cb258-3" tabindex="-1"></a>                 <span class="at">statistic =</span> <span class="fu">list</span>(<span class="fu">all_categorical</span>() <span class="sc">~</span> <span class="st">&quot;{p} ({p.std.error})&quot;</span>)) </span></code></pre></div>
 <div class="sourceCode" id="cb259"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb259-1"><a href="c08-communicating-results.html#cb259-1" tabindex="-1"></a>anes_des_gtsum</span></code></pre></div>
 
-<div id="eqhjruykcc" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#eqhjruykcc table {
+<div id="shenfykkfl" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#shenfykkfl table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#eqhjruykcc thead, #eqhjruykcc tbody, #eqhjruykcc tfoot, #eqhjruykcc tr, #eqhjruykcc td, #eqhjruykcc th {
+#shenfykkfl thead, #shenfykkfl tbody, #shenfykkfl tfoot, #shenfykkfl tr, #shenfykkfl td, #shenfykkfl th {
   border-style: none;
 }
 
-#eqhjruykcc p {
+#shenfykkfl p {
   margin: 0;
   padding: 0;
 }
 
-#eqhjruykcc .gt_table {
+#shenfykkfl .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1678,12 +1678,12 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-left-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_caption {
+#shenfykkfl .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#eqhjruykcc .gt_title {
+#shenfykkfl .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1695,7 +1695,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-width: 0;
 }
 
-#eqhjruykcc .gt_subtitle {
+#shenfykkfl .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1707,7 +1707,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-width: 0;
 }
 
-#eqhjruykcc .gt_heading {
+#shenfykkfl .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1719,13 +1719,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_bottom_border {
+#shenfykkfl .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_col_headings {
+#shenfykkfl .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1740,7 +1740,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_col_heading {
+#shenfykkfl .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1760,7 +1760,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#eqhjruykcc .gt_column_spanner_outer {
+#shenfykkfl .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1772,15 +1772,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 4px;
 }
 
-#eqhjruykcc .gt_column_spanner_outer:first-child {
+#shenfykkfl .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#eqhjruykcc .gt_column_spanner_outer:last-child {
+#shenfykkfl .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#eqhjruykcc .gt_column_spanner {
+#shenfykkfl .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1792,11 +1792,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   width: 100%;
 }
 
-#eqhjruykcc .gt_spanner_row {
+#shenfykkfl .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#eqhjruykcc .gt_group_heading {
+#shenfykkfl .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1822,7 +1822,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   text-align: left;
 }
 
-#eqhjruykcc .gt_empty_group_heading {
+#shenfykkfl .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1837,15 +1837,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: middle;
 }
 
-#eqhjruykcc .gt_from_md > :first-child {
+#shenfykkfl .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#eqhjruykcc .gt_from_md > :last-child {
+#shenfykkfl .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#eqhjruykcc .gt_row {
+#shenfykkfl .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1864,7 +1864,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#eqhjruykcc .gt_stub {
+#shenfykkfl .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1877,7 +1877,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#eqhjruykcc .gt_stub_row_group {
+#shenfykkfl .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1891,15 +1891,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: top;
 }
 
-#eqhjruykcc .gt_row_group_first td {
+#shenfykkfl .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#eqhjruykcc .gt_row_group_first th {
+#shenfykkfl .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#eqhjruykcc .gt_summary_row {
+#shenfykkfl .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1909,16 +1909,16 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#eqhjruykcc .gt_first_summary_row {
+#shenfykkfl .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_first_summary_row.thick {
+#shenfykkfl .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#eqhjruykcc .gt_last_summary_row {
+#shenfykkfl .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1928,7 +1928,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_grand_summary_row {
+#shenfykkfl .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1938,7 +1938,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#eqhjruykcc .gt_first_grand_summary_row {
+#shenfykkfl .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1948,7 +1948,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_last_grand_summary_row_top {
+#shenfykkfl .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1958,11 +1958,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_striped {
+#shenfykkfl .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#eqhjruykcc .gt_table_body {
+#shenfykkfl .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1971,7 +1971,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_footnotes {
+#shenfykkfl .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1985,7 +1985,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_footnote {
+#shenfykkfl .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1994,7 +1994,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#eqhjruykcc .gt_sourcenotes {
+#shenfykkfl .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2008,7 +2008,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#eqhjruykcc .gt_sourcenote {
+#shenfykkfl .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2016,68 +2016,68 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#eqhjruykcc .gt_left {
+#shenfykkfl .gt_left {
   text-align: left;
 }
 
-#eqhjruykcc .gt_center {
+#shenfykkfl .gt_center {
   text-align: center;
 }
 
-#eqhjruykcc .gt_right {
+#shenfykkfl .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#eqhjruykcc .gt_font_normal {
+#shenfykkfl .gt_font_normal {
   font-weight: normal;
 }
 
-#eqhjruykcc .gt_font_bold {
+#shenfykkfl .gt_font_bold {
   font-weight: bold;
 }
 
-#eqhjruykcc .gt_font_italic {
+#shenfykkfl .gt_font_italic {
   font-style: italic;
 }
 
-#eqhjruykcc .gt_super {
+#shenfykkfl .gt_super {
   font-size: 65%;
 }
 
-#eqhjruykcc .gt_footnote_marks {
+#shenfykkfl .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#eqhjruykcc .gt_asterisk {
+#shenfykkfl .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#eqhjruykcc .gt_indent_1 {
+#shenfykkfl .gt_indent_1 {
   text-indent: 5px;
 }
 
-#eqhjruykcc .gt_indent_2 {
+#shenfykkfl .gt_indent_2 {
   text-indent: 10px;
 }
 
-#eqhjruykcc .gt_indent_3 {
+#shenfykkfl .gt_indent_3 {
   text-indent: 15px;
 }
 
-#eqhjruykcc .gt_indent_4 {
+#shenfykkfl .gt_indent_4 {
   text-indent: 20px;
 }
 
-#eqhjruykcc .gt_indent_5 {
+#shenfykkfl .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:results-gts-ex-1-tab">TABLE 8.3: </span>Example of gtsummary table with trust in government estimates</caption>
+  <caption><span id="tab:results-gts-ex-1-tab">TABLE 8.3: </span>Example of {gtsummary} table with trust in government estimates</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -2109,7 +2109,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   </tfoot>
 </table>
 </div>
-<p>The default table includes the weighted number of missing (or Unknown) records. The standard error is reported as a proportion, while the proportion is styled as a percentage. In the next step, we remove the Unknown category by setting the missing argument to “no” and format the standard error as a percentage using the <code>digits</code> argument. To improve the table for publication, we provide a more polished label for the “TrustGovernment” variable using the <code>label</code> argument.</p>
+<p>The default table (shown in Table <a href="c08-communicating-results.html#tab:results-gts-ex-1-tab">8.3</a>) includes the weighted number of missing (or Unknown) records. The standard error is reported as a proportion, while the proportion is styled as a percentage. In the next step, we remove the Unknown category by setting the missing argument to “no” and format the standard error as a percentage using the <code>digits</code> argument. To improve the table for publication, we provide a more polished label for the “TrustGovernment” variable using the <code>label</code> argument. The resulting table is displayed in Table <a href="c08-communicating-results.html#tab:results-gts-ex-2-tab">8.4</a>.</p>
 <div class="sourceCode" id="cb260"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb260-1"><a href="c08-communicating-results.html#cb260-1" tabindex="-1"></a>anes_des_gtsum2 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb260-2"><a href="c08-communicating-results.html#cb260-2" tabindex="-1"></a>  <span class="fu">tbl_svysummary</span>(</span>
 <span id="cb260-3"><a href="c08-communicating-results.html#cb260-3" tabindex="-1"></a>    <span class="at">include =</span> TrustGovernment,</span>
@@ -2120,23 +2120,23 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <span id="cb260-8"><a href="c08-communicating-results.html#cb260-8" tabindex="-1"></a>  )</span></code></pre></div>
 <div class="sourceCode" id="cb261"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb261-1"><a href="c08-communicating-results.html#cb261-1" tabindex="-1"></a>anes_des_gtsum2</span></code></pre></div>
 
-<div id="xrxgqklkbn" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#xrxgqklkbn table {
+<div id="vpbydkxlvl" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#vpbydkxlvl table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#xrxgqklkbn thead, #xrxgqklkbn tbody, #xrxgqklkbn tfoot, #xrxgqklkbn tr, #xrxgqklkbn td, #xrxgqklkbn th {
+#vpbydkxlvl thead, #vpbydkxlvl tbody, #vpbydkxlvl tfoot, #vpbydkxlvl tr, #vpbydkxlvl td, #vpbydkxlvl th {
   border-style: none;
 }
 
-#xrxgqklkbn p {
+#vpbydkxlvl p {
   margin: 0;
   padding: 0;
 }
 
-#xrxgqklkbn .gt_table {
+#vpbydkxlvl .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2162,12 +2162,12 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-left-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_caption {
+#vpbydkxlvl .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#xrxgqklkbn .gt_title {
+#vpbydkxlvl .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2179,7 +2179,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-width: 0;
 }
 
-#xrxgqklkbn .gt_subtitle {
+#vpbydkxlvl .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2191,7 +2191,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-width: 0;
 }
 
-#xrxgqklkbn .gt_heading {
+#vpbydkxlvl .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2203,13 +2203,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_bottom_border {
+#vpbydkxlvl .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_col_headings {
+#vpbydkxlvl .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2224,7 +2224,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_col_heading {
+#vpbydkxlvl .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2244,7 +2244,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#xrxgqklkbn .gt_column_spanner_outer {
+#vpbydkxlvl .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2256,15 +2256,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 4px;
 }
 
-#xrxgqklkbn .gt_column_spanner_outer:first-child {
+#vpbydkxlvl .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#xrxgqklkbn .gt_column_spanner_outer:last-child {
+#vpbydkxlvl .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#xrxgqklkbn .gt_column_spanner {
+#vpbydkxlvl .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2276,11 +2276,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   width: 100%;
 }
 
-#xrxgqklkbn .gt_spanner_row {
+#vpbydkxlvl .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#xrxgqklkbn .gt_group_heading {
+#vpbydkxlvl .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2306,7 +2306,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   text-align: left;
 }
 
-#xrxgqklkbn .gt_empty_group_heading {
+#vpbydkxlvl .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2321,15 +2321,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: middle;
 }
 
-#xrxgqklkbn .gt_from_md > :first-child {
+#vpbydkxlvl .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#xrxgqklkbn .gt_from_md > :last-child {
+#vpbydkxlvl .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#xrxgqklkbn .gt_row {
+#vpbydkxlvl .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2348,7 +2348,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#xrxgqklkbn .gt_stub {
+#vpbydkxlvl .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2361,7 +2361,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#xrxgqklkbn .gt_stub_row_group {
+#vpbydkxlvl .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2375,15 +2375,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: top;
 }
 
-#xrxgqklkbn .gt_row_group_first td {
+#vpbydkxlvl .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#xrxgqklkbn .gt_row_group_first th {
+#vpbydkxlvl .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#xrxgqklkbn .gt_summary_row {
+#vpbydkxlvl .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2393,16 +2393,16 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#xrxgqklkbn .gt_first_summary_row {
+#vpbydkxlvl .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_first_summary_row.thick {
+#vpbydkxlvl .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#xrxgqklkbn .gt_last_summary_row {
+#vpbydkxlvl .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2412,7 +2412,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_grand_summary_row {
+#vpbydkxlvl .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2422,7 +2422,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#xrxgqklkbn .gt_first_grand_summary_row {
+#vpbydkxlvl .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2432,7 +2432,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_last_grand_summary_row_top {
+#vpbydkxlvl .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2442,11 +2442,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_striped {
+#vpbydkxlvl .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#xrxgqklkbn .gt_table_body {
+#vpbydkxlvl .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2455,7 +2455,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_footnotes {
+#vpbydkxlvl .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2469,7 +2469,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_footnote {
+#vpbydkxlvl .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2478,7 +2478,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#xrxgqklkbn .gt_sourcenotes {
+#vpbydkxlvl .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2492,7 +2492,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#xrxgqklkbn .gt_sourcenote {
+#vpbydkxlvl .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2500,68 +2500,68 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#xrxgqklkbn .gt_left {
+#vpbydkxlvl .gt_left {
   text-align: left;
 }
 
-#xrxgqklkbn .gt_center {
+#vpbydkxlvl .gt_center {
   text-align: center;
 }
 
-#xrxgqklkbn .gt_right {
+#vpbydkxlvl .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#xrxgqklkbn .gt_font_normal {
+#vpbydkxlvl .gt_font_normal {
   font-weight: normal;
 }
 
-#xrxgqklkbn .gt_font_bold {
+#vpbydkxlvl .gt_font_bold {
   font-weight: bold;
 }
 
-#xrxgqklkbn .gt_font_italic {
+#vpbydkxlvl .gt_font_italic {
   font-style: italic;
 }
 
-#xrxgqklkbn .gt_super {
+#vpbydkxlvl .gt_super {
   font-size: 65%;
 }
 
-#xrxgqklkbn .gt_footnote_marks {
+#vpbydkxlvl .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#xrxgqklkbn .gt_asterisk {
+#vpbydkxlvl .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#xrxgqklkbn .gt_indent_1 {
+#vpbydkxlvl .gt_indent_1 {
   text-indent: 5px;
 }
 
-#xrxgqklkbn .gt_indent_2 {
+#vpbydkxlvl .gt_indent_2 {
   text-indent: 10px;
 }
 
-#xrxgqklkbn .gt_indent_3 {
+#vpbydkxlvl .gt_indent_3 {
   text-indent: 15px;
 }
 
-#xrxgqklkbn .gt_indent_4 {
+#vpbydkxlvl .gt_indent_4 {
   text-indent: 20px;
 }
 
-#xrxgqklkbn .gt_indent_5 {
+#vpbydkxlvl .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:results-gts-ex-2-tab">TABLE 8.4: </span>Example of gtsummary table with trust in government estimates with labeling and digits options</caption>
+  <caption><span id="tab:results-gts-ex-2-tab">TABLE 8.4: </span>Example of {gtsummary} table with trust in government estimates with labeling and digits options</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -2591,7 +2591,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   </tfoot>
 </table>
 </div>
-<p>To exclude the term “Characteristic” and the estimated population size, we can modify the header using the<code>modify_header()</code> function to update the <code>label</code>. Further adjustments can be made based on personal preferences, organizational guidelines, or other style guides. If we prefer having the standard error in the header, similar to the {gt} table, instead of in the footnote (the {gtsummary} default), we can make these changes by specifying <code>stat_0</code> in the <code>modify_header()</code> function. Additionally, using <code>modify_footnote()</code> with <code>update = everything() ~ NA</code> removes the standard error from the footnote. After transforming the object into a gt table using <code>as_gt()</code>, we can add footnotes and a title using the same methods explained in Section <a href="c08-communicating-results.html#results-gt">8.3.1.1</a>.</p>
+<p>Table <a href="c08-communicating-results.html#tab:results-gts-ex-2-tab">8.4</a> is closer to our ideal output, but we still want to make a few changes. To exclude the term “Characteristic” and the estimated population size (N), we can modify the header using the <code>modify_header()</code> function to update the <code>label</code>. Further adjustments can be made based on personal preferences, organizational guidelines, or other style guides. If we prefer having the standard error in the header, similar to the {gt} table, instead of in the footnote (the {gtsummary} default), we can make these changes by specifying <code>stat_0</code> in the <code>modify_header()</code> function. Additionally, using <code>modify_footnote()</code> with <code>update = everything() ~ NA</code> removes the standard error from the footnote. After transforming the object into a gt table using <code>as_gt()</code>, we can add footnotes and a title using the same methods explained in Section <a href="c08-communicating-results.html#results-gt">8.3.1.1</a>. This updated table is displayed in Table <a href="c08-communicating-results.html#tab:results-gts-ex-3-tab">8.5</a>.</p>
 <div class="sourceCode" id="cb262"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb262-1"><a href="c08-communicating-results.html#cb262-1" tabindex="-1"></a>anes_des_gtsum3 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb262-2"><a href="c08-communicating-results.html#cb262-2" tabindex="-1"></a>  <span class="fu">tbl_svysummary</span>(</span>
 <span id="cb262-3"><a href="c08-communicating-results.html#cb262-3" tabindex="-1"></a>    <span class="at">include =</span> TrustGovernment,</span>
@@ -2614,23 +2614,23 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <span id="cb262-20"><a href="c08-communicating-results.html#cb262-20" tabindex="-1"></a>  )</span></code></pre></div>
 <div class="sourceCode" id="cb263"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb263-1"><a href="c08-communicating-results.html#cb263-1" tabindex="-1"></a>anes_des_gtsum3</span></code></pre></div>
 
-<div id="gcmixtdayn" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#gcmixtdayn table {
+<div id="odgelfzebt" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#odgelfzebt table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#gcmixtdayn thead, #gcmixtdayn tbody, #gcmixtdayn tfoot, #gcmixtdayn tr, #gcmixtdayn td, #gcmixtdayn th {
+#odgelfzebt thead, #odgelfzebt tbody, #odgelfzebt tfoot, #odgelfzebt tr, #odgelfzebt td, #odgelfzebt th {
   border-style: none;
 }
 
-#gcmixtdayn p {
+#odgelfzebt p {
   margin: 0;
   padding: 0;
 }
 
-#gcmixtdayn .gt_table {
+#odgelfzebt .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2656,12 +2656,12 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-left-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_caption {
+#odgelfzebt .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#gcmixtdayn .gt_title {
+#odgelfzebt .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2673,7 +2673,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-width: 0;
 }
 
-#gcmixtdayn .gt_subtitle {
+#odgelfzebt .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2685,7 +2685,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-width: 0;
 }
 
-#gcmixtdayn .gt_heading {
+#odgelfzebt .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2697,13 +2697,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_bottom_border {
+#odgelfzebt .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_col_headings {
+#odgelfzebt .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2718,7 +2718,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_col_heading {
+#odgelfzebt .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2738,7 +2738,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#gcmixtdayn .gt_column_spanner_outer {
+#odgelfzebt .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2750,15 +2750,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 4px;
 }
 
-#gcmixtdayn .gt_column_spanner_outer:first-child {
+#odgelfzebt .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#gcmixtdayn .gt_column_spanner_outer:last-child {
+#odgelfzebt .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#gcmixtdayn .gt_column_spanner {
+#odgelfzebt .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2770,11 +2770,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   width: 100%;
 }
 
-#gcmixtdayn .gt_spanner_row {
+#odgelfzebt .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#gcmixtdayn .gt_group_heading {
+#odgelfzebt .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2800,7 +2800,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   text-align: left;
 }
 
-#gcmixtdayn .gt_empty_group_heading {
+#odgelfzebt .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2815,15 +2815,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: middle;
 }
 
-#gcmixtdayn .gt_from_md > :first-child {
+#odgelfzebt .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#gcmixtdayn .gt_from_md > :last-child {
+#odgelfzebt .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#gcmixtdayn .gt_row {
+#odgelfzebt .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2842,7 +2842,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#gcmixtdayn .gt_stub {
+#odgelfzebt .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2855,7 +2855,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#gcmixtdayn .gt_stub_row_group {
+#odgelfzebt .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2869,15 +2869,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: top;
 }
 
-#gcmixtdayn .gt_row_group_first td {
+#odgelfzebt .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#gcmixtdayn .gt_row_group_first th {
+#odgelfzebt .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#gcmixtdayn .gt_summary_row {
+#odgelfzebt .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2887,16 +2887,16 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#gcmixtdayn .gt_first_summary_row {
+#odgelfzebt .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_first_summary_row.thick {
+#odgelfzebt .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#gcmixtdayn .gt_last_summary_row {
+#odgelfzebt .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2906,7 +2906,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_grand_summary_row {
+#odgelfzebt .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2916,7 +2916,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#gcmixtdayn .gt_first_grand_summary_row {
+#odgelfzebt .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2926,7 +2926,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_last_grand_summary_row_top {
+#odgelfzebt .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2936,11 +2936,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_striped {
+#odgelfzebt .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#gcmixtdayn .gt_table_body {
+#odgelfzebt .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2949,7 +2949,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_footnotes {
+#odgelfzebt .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2963,7 +2963,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_footnote {
+#odgelfzebt .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2972,7 +2972,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#gcmixtdayn .gt_sourcenotes {
+#odgelfzebt .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2986,7 +2986,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#gcmixtdayn .gt_sourcenote {
+#odgelfzebt .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2994,68 +2994,68 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#gcmixtdayn .gt_left {
+#odgelfzebt .gt_left {
   text-align: left;
 }
 
-#gcmixtdayn .gt_center {
+#odgelfzebt .gt_center {
   text-align: center;
 }
 
-#gcmixtdayn .gt_right {
+#odgelfzebt .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#gcmixtdayn .gt_font_normal {
+#odgelfzebt .gt_font_normal {
   font-weight: normal;
 }
 
-#gcmixtdayn .gt_font_bold {
+#odgelfzebt .gt_font_bold {
   font-weight: bold;
 }
 
-#gcmixtdayn .gt_font_italic {
+#odgelfzebt .gt_font_italic {
   font-style: italic;
 }
 
-#gcmixtdayn .gt_super {
+#odgelfzebt .gt_super {
   font-size: 65%;
 }
 
-#gcmixtdayn .gt_footnote_marks {
+#odgelfzebt .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#gcmixtdayn .gt_asterisk {
+#odgelfzebt .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#gcmixtdayn .gt_indent_1 {
+#odgelfzebt .gt_indent_1 {
   text-indent: 5px;
 }
 
-#gcmixtdayn .gt_indent_2 {
+#odgelfzebt .gt_indent_2 {
   text-indent: 10px;
 }
 
-#gcmixtdayn .gt_indent_3 {
+#odgelfzebt .gt_indent_3 {
   text-indent: 15px;
 }
 
-#gcmixtdayn .gt_indent_4 {
+#odgelfzebt .gt_indent_4 {
   text-indent: 20px;
 }
 
-#gcmixtdayn .gt_indent_5 {
+#odgelfzebt .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:results-gts-ex-3-tab">TABLE 8.5: </span>Example of gtsummary table with trust in government estimates with more labeling options and context</caption>
+  <caption><span id="tab:results-gts-ex-3-tab">TABLE 8.5: </span>Example of {gtsummary} table with trust in government estimates with more labeling options and context</caption>
   <thead>
     <tr class="gt_heading">
       <td colspan="2" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>American voter's trust
@@ -3095,7 +3095,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   </tfoot>
 </table>
 </div>
-<p>We can also include continuous variables in the table. Below, we add a summary of the age variable by updating the <code>include</code>, <code>statistic</code>, and <code>digits</code> arguments.</p>
+<p>We can also include summaries of more than one variable in the table. These variables can be either categorical or continuous. In the following code and Table <a href="c08-communicating-results.html#tab:results-gts-ex-4-tab">8.6</a>, we add the mean age by updating the <code>include</code>, <code>statistic</code>, and <code>digits</code> arguments.</p>
 <div class="sourceCode" id="cb264"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb264-1"><a href="c08-communicating-results.html#cb264-1" tabindex="-1"></a>anes_des_gtsum4 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb264-2"><a href="c08-communicating-results.html#cb264-2" tabindex="-1"></a>  <span class="fu">tbl_svysummary</span>(</span>
 <span id="cb264-3"><a href="c08-communicating-results.html#cb264-3" tabindex="-1"></a>    <span class="at">include =</span> <span class="fu">c</span>(TrustGovernment, Age),</span>
@@ -3112,33 +3112,34 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <span id="cb264-14"><a href="c08-communicating-results.html#cb264-14" tabindex="-1"></a>  <span class="fu">modify_header</span>(<span class="at">label =</span> <span class="st">&quot; &quot;</span>,</span>
 <span id="cb264-15"><a href="c08-communicating-results.html#cb264-15" tabindex="-1"></a>                <span class="at">stat_0 =</span> <span class="st">&quot;% (s.e.)&quot;</span>) <span class="sc">%&gt;%</span></span>
 <span id="cb264-16"><a href="c08-communicating-results.html#cb264-16" tabindex="-1"></a>  <span class="fu">as_gt</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb264-17"><a href="c08-communicating-results.html#cb264-17" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;American voter&#39;s trust in the federal government, 2020&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb264-18"><a href="c08-communicating-results.html#cb264-18" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;American National Election Studies, 2020&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb264-19"><a href="c08-communicating-results.html#cb264-19" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
-<span id="cb264-20"><a href="c08-communicating-results.html#cb264-20" tabindex="-1"></a>    <span class="st">&quot;Question text: How often can you trust the federal government</span></span>
-<span id="cb264-21"><a href="c08-communicating-results.html#cb264-21" tabindex="-1"></a><span class="st">    in Washington to do what is right?&quot;</span></span>
-<span id="cb264-22"><a href="c08-communicating-results.html#cb264-22" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb264-23"><a href="c08-communicating-results.html#cb264-23" tabindex="-1"></a>  <span class="fu">tab_caption</span>(<span class="st">&quot;Example of gtsummary table with trust in government</span></span>
-<span id="cb264-24"><a href="c08-communicating-results.html#cb264-24" tabindex="-1"></a><span class="st">              estimates and average age&quot;</span>)</span></code></pre></div>
+<span id="cb264-17"><a href="c08-communicating-results.html#cb264-17" tabindex="-1"></a>  <span class="fu">tab_header</span>(</span>
+<span id="cb264-18"><a href="c08-communicating-results.html#cb264-18" tabindex="-1"></a>    <span class="st">&quot;American voter&#39;s trust in the federal government, 2020&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb264-19"><a href="c08-communicating-results.html#cb264-19" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;American National Election Studies, 2020&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb264-20"><a href="c08-communicating-results.html#cb264-20" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
+<span id="cb264-21"><a href="c08-communicating-results.html#cb264-21" tabindex="-1"></a>    <span class="st">&quot;Question text: How often can you trust the federal government</span></span>
+<span id="cb264-22"><a href="c08-communicating-results.html#cb264-22" tabindex="-1"></a><span class="st">    in Washington to do what is right?&quot;</span></span>
+<span id="cb264-23"><a href="c08-communicating-results.html#cb264-23" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb264-24"><a href="c08-communicating-results.html#cb264-24" tabindex="-1"></a>  <span class="fu">tab_caption</span>(<span class="st">&quot;Example of gtsummary table with trust in government</span></span>
+<span id="cb264-25"><a href="c08-communicating-results.html#cb264-25" tabindex="-1"></a><span class="st">              estimates and average age&quot;</span>)</span></code></pre></div>
 <div class="sourceCode" id="cb265"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb265-1"><a href="c08-communicating-results.html#cb265-1" tabindex="-1"></a>anes_des_gtsum4</span></code></pre></div>
 
-<div id="qvinrblcgt" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#qvinrblcgt table {
+<div id="zsfsizshxx" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#zsfsizshxx table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#qvinrblcgt thead, #qvinrblcgt tbody, #qvinrblcgt tfoot, #qvinrblcgt tr, #qvinrblcgt td, #qvinrblcgt th {
+#zsfsizshxx thead, #zsfsizshxx tbody, #zsfsizshxx tfoot, #zsfsizshxx tr, #zsfsizshxx td, #zsfsizshxx th {
   border-style: none;
 }
 
-#qvinrblcgt p {
+#zsfsizshxx p {
   margin: 0;
   padding: 0;
 }
 
-#qvinrblcgt .gt_table {
+#zsfsizshxx .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -3164,12 +3165,12 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-left-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_caption {
+#zsfsizshxx .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#qvinrblcgt .gt_title {
+#zsfsizshxx .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3181,7 +3182,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-width: 0;
 }
 
-#qvinrblcgt .gt_subtitle {
+#zsfsizshxx .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3193,7 +3194,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-width: 0;
 }
 
-#qvinrblcgt .gt_heading {
+#zsfsizshxx .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3205,13 +3206,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_bottom_border {
+#zsfsizshxx .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_col_headings {
+#zsfsizshxx .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3226,7 +3227,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_col_heading {
+#zsfsizshxx .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3246,7 +3247,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#qvinrblcgt .gt_column_spanner_outer {
+#zsfsizshxx .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3258,15 +3259,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 4px;
 }
 
-#qvinrblcgt .gt_column_spanner_outer:first-child {
+#zsfsizshxx .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#qvinrblcgt .gt_column_spanner_outer:last-child {
+#zsfsizshxx .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#qvinrblcgt .gt_column_spanner {
+#zsfsizshxx .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3278,11 +3279,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   width: 100%;
 }
 
-#qvinrblcgt .gt_spanner_row {
+#zsfsizshxx .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#qvinrblcgt .gt_group_heading {
+#zsfsizshxx .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3308,7 +3309,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   text-align: left;
 }
 
-#qvinrblcgt .gt_empty_group_heading {
+#zsfsizshxx .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3323,15 +3324,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: middle;
 }
 
-#qvinrblcgt .gt_from_md > :first-child {
+#zsfsizshxx .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#qvinrblcgt .gt_from_md > :last-child {
+#zsfsizshxx .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#qvinrblcgt .gt_row {
+#zsfsizshxx .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3350,7 +3351,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#qvinrblcgt .gt_stub {
+#zsfsizshxx .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3363,7 +3364,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#qvinrblcgt .gt_stub_row_group {
+#zsfsizshxx .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3377,15 +3378,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: top;
 }
 
-#qvinrblcgt .gt_row_group_first td {
+#zsfsizshxx .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#qvinrblcgt .gt_row_group_first th {
+#zsfsizshxx .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#qvinrblcgt .gt_summary_row {
+#zsfsizshxx .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3395,16 +3396,16 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#qvinrblcgt .gt_first_summary_row {
+#zsfsizshxx .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_first_summary_row.thick {
+#zsfsizshxx .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#qvinrblcgt .gt_last_summary_row {
+#zsfsizshxx .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3414,7 +3415,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_grand_summary_row {
+#zsfsizshxx .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3424,7 +3425,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#qvinrblcgt .gt_first_grand_summary_row {
+#zsfsizshxx .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3434,7 +3435,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_last_grand_summary_row_top {
+#zsfsizshxx .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3444,11 +3445,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_striped {
+#zsfsizshxx .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#qvinrblcgt .gt_table_body {
+#zsfsizshxx .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3457,7 +3458,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_footnotes {
+#zsfsizshxx .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3471,7 +3472,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_footnote {
+#zsfsizshxx .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3480,7 +3481,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#qvinrblcgt .gt_sourcenotes {
+#zsfsizshxx .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3494,7 +3495,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#qvinrblcgt .gt_sourcenote {
+#zsfsizshxx .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3502,68 +3503,68 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#qvinrblcgt .gt_left {
+#zsfsizshxx .gt_left {
   text-align: left;
 }
 
-#qvinrblcgt .gt_center {
+#zsfsizshxx .gt_center {
   text-align: center;
 }
 
-#qvinrblcgt .gt_right {
+#zsfsizshxx .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#qvinrblcgt .gt_font_normal {
+#zsfsizshxx .gt_font_normal {
   font-weight: normal;
 }
 
-#qvinrblcgt .gt_font_bold {
+#zsfsizshxx .gt_font_bold {
   font-weight: bold;
 }
 
-#qvinrblcgt .gt_font_italic {
+#zsfsizshxx .gt_font_italic {
   font-style: italic;
 }
 
-#qvinrblcgt .gt_super {
+#zsfsizshxx .gt_super {
   font-size: 65%;
 }
 
-#qvinrblcgt .gt_footnote_marks {
+#zsfsizshxx .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#qvinrblcgt .gt_asterisk {
+#zsfsizshxx .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#qvinrblcgt .gt_indent_1 {
+#zsfsizshxx .gt_indent_1 {
   text-indent: 5px;
 }
 
-#qvinrblcgt .gt_indent_2 {
+#zsfsizshxx .gt_indent_2 {
   text-indent: 10px;
 }
 
-#qvinrblcgt .gt_indent_3 {
+#zsfsizshxx .gt_indent_3 {
   text-indent: 15px;
 }
 
-#qvinrblcgt .gt_indent_4 {
+#zsfsizshxx .gt_indent_4 {
   text-indent: 20px;
 }
 
-#qvinrblcgt .gt_indent_5 {
+#zsfsizshxx .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:results-gts-ex-4-tab">TABLE 8.6: </span>Example of gtsummary table with trust in government estimates and average age</caption>
+  <caption><span id="tab:results-gts-ex-4-tab">TABLE 8.6: </span>Example of {gtsummary} table with trust in government estimates and average age</caption>
   <thead>
     <tr class="gt_heading">
       <td colspan="2" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>American voter's trust in the federal government, 2020</td>
@@ -3603,7 +3604,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   </tfoot>
 </table>
 </div>
-<p>With {gtsummary}, we can also calculate statistics by different groups. Let’s modify the previous example to analyze data on whether a respondent voted for president in 2020. We update the <code>by</code> argument and refine the header.</p>
+<p>With {gtsummary}, we can also calculate statistics by different groups. Let’s modify the previous example (displayed in Table <a href="c08-communicating-results.html#tab:results-gts-ex-4-tab">8.6</a>) to analyze data on whether a respondent voted for president in 2020. We update the <code>by</code> argument and refine the header. The resulting table is displayed in Table <a href="c08-communicating-results.html#tab:results-gts-ex-5-tab">8.7</a>.</p>
 <div class="sourceCode" id="cb266"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb266-1"><a href="c08-communicating-results.html#cb266-1" tabindex="-1"></a>anes_des_gtsum5 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb266-2"><a href="c08-communicating-results.html#cb266-2" tabindex="-1"></a>  <span class="fu">drop_na</span>(VotedPres2020) <span class="sc">%&gt;%</span></span>
 <span id="cb266-3"><a href="c08-communicating-results.html#cb266-3" tabindex="-1"></a>  <span class="fu">tbl_svysummary</span>(</span>
@@ -3632,23 +3633,23 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <span id="cb266-26"><a href="c08-communicating-results.html#cb266-26" tabindex="-1"></a>  )</span></code></pre></div>
 <div class="sourceCode" id="cb267"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb267-1"><a href="c08-communicating-results.html#cb267-1" tabindex="-1"></a>anes_des_gtsum5</span></code></pre></div>
 
-<div id="hjdkonkthi" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#hjdkonkthi table {
+<div id="karazkvpvb" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#karazkvpvb table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#hjdkonkthi thead, #hjdkonkthi tbody, #hjdkonkthi tfoot, #hjdkonkthi tr, #hjdkonkthi td, #hjdkonkthi th {
+#karazkvpvb thead, #karazkvpvb tbody, #karazkvpvb tfoot, #karazkvpvb tr, #karazkvpvb td, #karazkvpvb th {
   border-style: none;
 }
 
-#hjdkonkthi p {
+#karazkvpvb p {
   margin: 0;
   padding: 0;
 }
 
-#hjdkonkthi .gt_table {
+#karazkvpvb .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -3674,12 +3675,12 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-left-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_caption {
+#karazkvpvb .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#hjdkonkthi .gt_title {
+#karazkvpvb .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -3691,7 +3692,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-width: 0;
 }
 
-#hjdkonkthi .gt_subtitle {
+#karazkvpvb .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -3703,7 +3704,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-width: 0;
 }
 
-#hjdkonkthi .gt_heading {
+#karazkvpvb .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -3715,13 +3716,13 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_bottom_border {
+#karazkvpvb .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_col_headings {
+#karazkvpvb .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3736,7 +3737,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_col_heading {
+#karazkvpvb .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3756,7 +3757,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#hjdkonkthi .gt_column_spanner_outer {
+#karazkvpvb .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3768,15 +3769,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 4px;
 }
 
-#hjdkonkthi .gt_column_spanner_outer:first-child {
+#karazkvpvb .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#hjdkonkthi .gt_column_spanner_outer:last-child {
+#karazkvpvb .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#hjdkonkthi .gt_column_spanner {
+#karazkvpvb .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -3788,11 +3789,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   width: 100%;
 }
 
-#hjdkonkthi .gt_spanner_row {
+#karazkvpvb .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#hjdkonkthi .gt_group_heading {
+#karazkvpvb .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3818,7 +3819,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   text-align: left;
 }
 
-#hjdkonkthi .gt_empty_group_heading {
+#karazkvpvb .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3833,15 +3834,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: middle;
 }
 
-#hjdkonkthi .gt_from_md > :first-child {
+#karazkvpvb .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#hjdkonkthi .gt_from_md > :last-child {
+#karazkvpvb .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#hjdkonkthi .gt_row {
+#karazkvpvb .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3860,7 +3861,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   overflow-x: hidden;
 }
 
-#hjdkonkthi .gt_stub {
+#karazkvpvb .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3873,7 +3874,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#hjdkonkthi .gt_stub_row_group {
+#karazkvpvb .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3887,15 +3888,15 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   vertical-align: top;
 }
 
-#hjdkonkthi .gt_row_group_first td {
+#karazkvpvb .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#hjdkonkthi .gt_row_group_first th {
+#karazkvpvb .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#hjdkonkthi .gt_summary_row {
+#karazkvpvb .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3905,16 +3906,16 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#hjdkonkthi .gt_first_summary_row {
+#karazkvpvb .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_first_summary_row.thick {
+#karazkvpvb .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#hjdkonkthi .gt_last_summary_row {
+#karazkvpvb .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3924,7 +3925,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_grand_summary_row {
+#karazkvpvb .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3934,7 +3935,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#hjdkonkthi .gt_first_grand_summary_row {
+#karazkvpvb .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3944,7 +3945,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-top-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_last_grand_summary_row_top {
+#karazkvpvb .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3954,11 +3955,11 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_striped {
+#karazkvpvb .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#hjdkonkthi .gt_table_body {
+#karazkvpvb .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3967,7 +3968,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-bottom-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_footnotes {
+#karazkvpvb .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3981,7 +3982,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_footnote {
+#karazkvpvb .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3990,7 +3991,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#hjdkonkthi .gt_sourcenotes {
+#karazkvpvb .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -4004,7 +4005,7 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   border-right-color: #D3D3D3;
 }
 
-#hjdkonkthi .gt_sourcenote {
+#karazkvpvb .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -4012,68 +4013,68 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
   padding-right: 5px;
 }
 
-#hjdkonkthi .gt_left {
+#karazkvpvb .gt_left {
   text-align: left;
 }
 
-#hjdkonkthi .gt_center {
+#karazkvpvb .gt_center {
   text-align: center;
 }
 
-#hjdkonkthi .gt_right {
+#karazkvpvb .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#hjdkonkthi .gt_font_normal {
+#karazkvpvb .gt_font_normal {
   font-weight: normal;
 }
 
-#hjdkonkthi .gt_font_bold {
+#karazkvpvb .gt_font_bold {
   font-weight: bold;
 }
 
-#hjdkonkthi .gt_font_italic {
+#karazkvpvb .gt_font_italic {
   font-style: italic;
 }
 
-#hjdkonkthi .gt_super {
+#karazkvpvb .gt_super {
   font-size: 65%;
 }
 
-#hjdkonkthi .gt_footnote_marks {
+#karazkvpvb .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#hjdkonkthi .gt_asterisk {
+#karazkvpvb .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#hjdkonkthi .gt_indent_1 {
+#karazkvpvb .gt_indent_1 {
   text-indent: 5px;
 }
 
-#hjdkonkthi .gt_indent_2 {
+#karazkvpvb .gt_indent_2 {
   text-indent: 10px;
 }
 
-#hjdkonkthi .gt_indent_3 {
+#karazkvpvb .gt_indent_3 {
   text-indent: 15px;
 }
 
-#hjdkonkthi .gt_indent_4 {
+#karazkvpvb .gt_indent_4 {
   text-indent: 20px;
 }
 
-#hjdkonkthi .gt_indent_5 {
+#karazkvpvb .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:results-gts-ex-5-tab">TABLE 8.7: </span>Example of gtsummary table with trust in government estimates by voting status</caption>
+  <caption><span id="tab:results-gts-ex-5-tab">TABLE 8.7: </span>Example of {gtsummary} table with trust in government estimates by voting status</caption>
   <thead>
     <tr class="gt_heading">
       <td colspan="3" class="gt_heading gt_title gt_font_normal gt_bottom_border" style>American voter's trust
@@ -4130,8 +4131,8 @@ <h4>Expanding tables using {gtsummary}<a href="c08-communicating-results.html#ex
 <div id="charts-and-plots" class="section level3 hasAnchor" number="8.3.2">
 <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c08-communicating-results.html#charts-and-plots" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Survey analysis can yield an abundance of printed summary statistics and models. Even with the most careful analysis, interpreting the results can be overwhelming. This is where charts and plots play a key role in our work. By transforming complex data into a visual representation, we can recognize patterns, relationships, and trends with greater ease.</p>
-<p>R has numerous packages for creating compelling and insightful charts. In this section, we will focus on {ggplot2}, a member of the {tidyverse} collection of packages. Known for its power and flexibility, {ggplot2} is an invaluable tool for creating a wide range of data visualizations <span class="citation">(<a href="#ref-ggplot22016">Wickham 2016</a>)</span>.</p>
-<p>The {ggplot2} package follows the “grammar of graphics,” a framework that incrementally adds layers of chart components. This approach allows us to customize visual elements such as scales, colors, labels, and annotations to enhance the clarity of our results. After creating the survey design object, we can modify it to include additional outcomes and calculate estimates for our desired data points. Below, we create a binary variable <code>TrustGovernmentUsually</code>, which is <code>TRUE</code> when <code>TrustGovernment</code> is “Always” or “Most of the time” and <code>FALSE</code> otherwise. Then, we calculate the percentage of people who usually trust the government based on their vote in the 2020 presidential election (<code>VotedPres2020_selection</code>). We remove the cases where people did not vote or did not indicate their choice.</p>
+<p>R has numerous packages for creating compelling and insightful charts. In this section, we focus on {ggplot2}, a member of the {tidyverse} collection of packages. Known for its power and flexibility, {ggplot2} is an invaluable tool for creating a wide range of data visualizations <span class="citation">(<a href="#ref-ggplot2wickham">Wickham 2016</a>)</span>.</p>
+<p>The {ggplot2} package follows the “grammar of graphics,” a framework that incrementally adds layers of chart components. This approach allows us to customize visual elements such as scales, colors, labels, and annotations to enhance the clarity of our results. After creating the survey design object, we can modify it to include additional outcomes and calculate estimates for our desired data points. Below, we create a binary variable <code>TrustGovernmentUsually</code>, which is <code>TRUE</code> when <code>TrustGovernment</code> is “Always” or “Most of the time” and <code>FALSE</code> otherwise. Then, we calculate the percentage of people who usually trust the government based on their vote in the 2020 presidential election (<code>VotedPres2020_selection</code>). We remove the cases where people did not vote or did not indicate their choice.</p>
 <div class="sourceCode" id="cb268"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb268-1"><a href="c08-communicating-results.html#cb268-1" tabindex="-1"></a>anes_des_der <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
 <span id="cb268-2"><a href="c08-communicating-results.html#cb268-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">TrustGovernmentUsually =</span> <span class="fu">case_when</span>(</span>
 <span id="cb268-3"><a href="c08-communicating-results.html#cb268-3" tabindex="-1"></a>    <span class="fu">is.na</span>(TrustGovernment) <span class="sc">~</span> <span class="cn">NA</span>,</span>
@@ -4156,7 +4157,7 @@ <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c0
 ## 1 Biden                      0.123         0.109          0.140
 ## 2 Trump                      0.178         0.161          0.198
 ## 3 Other                      0.0681        0.0290         0.152</code></pre>
-<p>Now, we can begin creating our chart with {ggplot2}. First, we set up our plot with <code>ggplot()</code>. Next, we define the data points to be displayed using aesthetics, or <code>aes</code>. Aesthetics represent the visual properties of the objects in the plot. In the example below, we map the <code>x</code> variable to <code>VotedPres2020_selection</code> from the dataset and the <code>y</code> variable to <code>pct_trust</code>. Finally, we specify the type of plot with <code>geom_*()</code>, in this case, <code>geom_bar()</code>. The resulting plot is displayed in Figure <a href="c08-communicating-results.html#fig:results-plot1">8.1</a>.</p>
+<p>Now, we can begin creating our chart with {ggplot2}. First, we set up our plot with <code>ggplot()</code>. Next, we define the data points to be displayed using aesthetics, or <code>aes</code>. Aesthetics represent the visual properties of the objects in the plot. In the following example, we create a bar chart of the percentage of people who usually trust the government by who they voted for in the 2020 election. To do this, we place who they voted for on the x-axis (<code>VotedPres2020_selection</code>) and the percentage who usually trust the government on the y-axis (<code>pct_trust</code>). We specify these variables in <code>ggplot()</code> and then indicate we want a bar chart with <code>geom_bar()</code>. The resulting plot is displayed in Figure <a href="c08-communicating-results.html#fig:results-plot1">8.1</a>.</p>
 <div class="sourceCode" id="cb270"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb270-1"><a href="c08-communicating-results.html#cb270-1" tabindex="-1"></a>p <span class="ot">&lt;-</span> anes_des_der <span class="sc">%&gt;%</span></span>
 <span id="cb270-2"><a href="c08-communicating-results.html#cb270-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> VotedPres2020_selection,</span>
 <span id="cb270-3"><a href="c08-communicating-results.html#cb270-3" tabindex="-1"></a>             <span class="at">y =</span> pct_trust)) <span class="sc">+</span></span>
@@ -4169,7 +4170,7 @@ <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c0
 FIGURE 8.1: Bar chart of trust in government by chosen 2020 presidential candidate
 </p>
 </div>
-<p>This is a great starting point: we observe that a higher percentage of people stating they usually trust the government among those who voted for Trump compared to those who voted for Biden or other candidates. Now, what if we want to introduce color to better differentiate the three groups? We can add <code>fill</code> under <code>aesthetics</code>, indicating that we want to use distinct values of <code>VotedPres2020_selection</code> to color the bars. In this instance, Biden and Trump will be displayed in different colors.</p>
+<p>This is a great starting point: it appears that a higher percentage of those who voted for Trump state they usually trust the government compared to those who voted for Biden or other candidates. Now, what if we want to introduce color to better differentiate the three groups? We can add <code>fill</code> inside <code>aes()</code>, indicating that we want to use distinct colors for each value of <code>VotedPres2020_selection</code>. In this instance, Biden and Trump are displayed in different colors (shades in the print version of this book) in Figure <a href="c08-communicating-results.html#fig:results-plot2">8.2</a>.</p>
 <div class="sourceCode" id="cb271"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb271-1"><a href="c08-communicating-results.html#cb271-1" tabindex="-1"></a>pcolor <span class="ot">&lt;-</span> anes_des_der <span class="sc">%&gt;%</span></span>
 <span id="cb271-2"><a href="c08-communicating-results.html#cb271-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> VotedPres2020_selection,</span>
 <span id="cb271-3"><a href="c08-communicating-results.html#cb271-3" tabindex="-1"></a>             <span class="at">y =</span> pct_trust,</span>
@@ -4183,7 +4184,7 @@ <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c0
 FIGURE 8.2: Bar chart of trust in government by chosen 2020 presidential candidate with colors
 </p>
 </div>
-<p>Let’s say we wanted to follow proper statistical analysis practice and incorporate variability in our plot. We can add another geom, <code>geom_errorbar()</code>, to display the confidence intervals on top of our existing <code>geom_bar()</code> layer. We can add the layer using a plus sign <code>+</code>.</p>
+<p>Let’s say we wanted to follow proper statistical analysis practice and incorporate variability in our plot. We can add another geom, <code>geom_errorbar()</code>, to display the confidence intervals on top of our existing <code>geom_bar()</code> layer. We can add the layer using a plus sign <code>+</code>. The resulting graph is displayed in Figure <a href="c08-communicating-results.html#fig:results-plot3">8.3</a>.</p>
 <div class="sourceCode" id="cb272"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb272-1"><a href="c08-communicating-results.html#cb272-1" tabindex="-1"></a>pcol_error <span class="ot">&lt;-</span> anes_des_der <span class="sc">%&gt;%</span></span>
 <span id="cb272-2"><a href="c08-communicating-results.html#cb272-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> VotedPres2020_selection,</span>
 <span id="cb272-3"><a href="c08-communicating-results.html#cb272-3" tabindex="-1"></a>             <span class="at">y =</span> pct_trust,</span>
@@ -4200,7 +4201,7 @@ <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c0
 FIGURE 8.3: Bar chart of trust in government by chosen 2020 presidential candidate with colors and error bars
 </p>
 </div>
-<p>We can continue adding to our plot until we achieve our desired look. For example, we can eliminate the color legend as it doesn’t contribute meaningful information with <code>guides(fill = "none")</code>. We can specify specific colors for <code>fill</code> using <code>scale_fill_manual()</code>. Inside the function, we provide a vector of values corresponding to the colors in our plot. These values are hexadecimal (hex) color codes, denoted by a leading pound sign <code>#</code> followed by six letters or numbers. The hex code <code>#0b3954</code> used below is a dark blue. There are many tools online that help pick hex codes, such as <a href="https://htmlcolorcodes.com/">htmlcolorcodes.com/</a>.</p>
+<p>We can continue adding to our plot until we achieve our desired look. For example, we can eliminate the color legend with <code>guides(fill = "none")</code>, as it doesn’t contribute meaningful information. We can also set specific colors for <code>fill</code> using <code>scale_fill_manual()</code>. Inside this function, we provide a vector of values corresponding to the colors in our plot. These values are hexadecimal (hex) color codes, denoted by a leading pound sign <code>#</code> followed by six letters or numbers. The hex code <code>#0b3954</code> used below is dark blue. There are many tools online that help pick hex codes, such as <a href="https://htmlcolorcodes.com/">htmlcolorcodes.com</a>. Additionally, Figure <a href="c08-communicating-results.html#fig:results-plot4">8.4</a> incorporates better labels for the x and y axes (<code>xlab()</code>, <code>ylab()</code>), a title (<code>labs(title=)</code>), and a footnote with the data source (<code>labs(caption=)</code>).</p>
 <div class="sourceCode" id="cb273"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb273-1"><a href="c08-communicating-results.html#cb273-1" tabindex="-1"></a>pfull <span class="ot">&lt;-</span></span>
 <span id="cb273-2"><a href="c08-communicating-results.html#cb273-2" tabindex="-1"></a>  anes_des_der <span class="sc">%&gt;%</span></span>
 <span id="cb273-3"><a href="c08-communicating-results.html#cb273-3" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> VotedPres2020_selection,</span>
@@ -4226,7 +4227,7 @@ <h3><span class="header-section-number">8.3.2</span> Charts and plots<a href="c0
 FIGURE 8.4: Bar chart of trust in government by chosen 2020 presidential candidate with colors, labels, error bars, and title
 </p>
 </div>
-<p>What we’ve explored in this section are just the foundational aspects of {ggplot2}, and the capabilities of this package extend far beyond what we’ve covered. Advanced features such as annotation, faceting, and theming allow for more sophisticated and customized visualizations. The book <span class="citation">Wickham (<a href="#ref-ggplot22016">2016</a>)</span> is a comprehensive guide to learning more about this powerful tool.</p>
+<p>What we’ve explored in this section are just the foundational aspects of {ggplot2}, and the capabilities of this package extend far beyond what we’ve covered. Advanced features such as annotation, faceting, and theming allow for more sophisticated and customized visualizations. The ggplot2 book by <span class="citation">Wickham (<a href="#ref-ggplot2wickham">2016</a>)</span> is a comprehensive guide to learning more about this powerful tool.</p>
 
 </div>
 </div>
@@ -4236,12 +4237,18 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-R-gt" class="csl-entry">
 Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. <em><span class="nocase">gt</span>: Easily Create Presentation-Ready Display Tables</em>.
 </div>
-<div id="ref-gtsummary" class="csl-entry">
-Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. <span>“Reproducible Summary Tables with the Gtsummary Package.”</span> <em><span>The R Journal</span></em> 13: 570–80. <a href="https://doi.org/10.32614/RJ-2021-053">https://doi.org/10.32614/RJ-2021-053</a>.
+<div id="ref-gtsummarysjo" class="csl-entry">
+Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. <span>“Reproducible Summary Tables with the <span class="nocase">gtsummary</span> Package.”</span> <em><span>The R Journal</span></em> 13: 570–80. <a href="https://doi.org/10.32614/RJ-2021-053">https://doi.org/10.32614/RJ-2021-053</a>.
 </div>
-<div id="ref-ggplot22016" class="csl-entry">
-Wickham, Hadley. 2016. <em>Ggplot2: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
+<div id="ref-ggplot2wickham" class="csl-entry">
+Wickham, Hadley. 2016. <em><span class="nocase">ggplot2</span>: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
 </div>
+</div>
+<div class="footnotes">
+<hr />
+<ol start="26">
+<li id="fn26"><p>The function <code>tab_caption()</code> is intended for use in R Markdown, Quarto, or bookdown, where it adds a caption that supports cross-referencing. In contrast, the function <code>tab_header()</code> adds a title or subtitle to a table in any context, including Shiny or GitHub-flavored Markdown, but without cross-referencing. The header is placed within the table object itself, whereas a caption is placed with the table based on the output type.<a href="c08-communicating-results.html#fnref26" class="footnote-back">↩︎</a></p></li>
+</ol>
 </div>
             </section>
 
diff --git a/c09-reprex-data.html b/c09-reprex-data.html
index 67031c82..a85ac83d 100644
--- a/c09-reprex-data.html
+++ b/c09-reprex-data.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -521,22 +521,22 @@ <h1>
 <h1><span class="header-section-number">Chapter 9</span> Reproducible research<a href="c09-reprex-data.html#c09-reprex-data" class="anchor-section" aria-label="Anchor link to header"></a></h1>
 <div id="introduction-7" class="section level2 hasAnchor" number="9.1">
 <h2><span class="header-section-number">9.1</span> Introduction<a href="c09-reprex-data.html#introduction-7" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Reproducing a data analysis’s results is a crucial aspect of any research. First, reproducibility serves as a form of quality assurance. If we pass an analysis project to another person, they should be able to run the entire project from start to finish and obtain the same results. They can critically assess the methodology and code while detecting potential errors. Another goal of reproducibility is enabling the verification of our analysis. When someone else is able to check our results, it ensures the integrity of the analyses by determining that the conclusions are not dependent on a particular person running the code or workflow on a particular day or in a particular environment.</p>
+<p>Reproducing results is a crucial aspect of any research. First, reproducibility serves as a form of quality assurance. If we pass an analysis project to another person, they should be able to run the entire project from start to finish and obtain the same results. They can critically assess the methodology and code while detecting potential errors. Another goal of reproducibility is enabling the verification of our analysis. When someone else is able to check our results, it ensures the integrity of the analyses by determining that the conclusions are not dependent on a particular person running the code or workflow on a particular day or in a particular environment.</p>
 <p>Not only is reproducibility a key component in ethical and accurate research, but it is also a requirement for many scientific journals. For example, the Journal of Survey Statistics and Methodology (JSSAM) and Public Opinion Quarterly (POQ) require authors to make code, data, and methodology transparent and accessible to other researchers who wish to verify or build on existing work.</p>
 <p>Reproducible research requires that the key components of analysis are available, discoverable, documented, and shared with others. The four main components that we should consider are:</p>
 <ul>
 <li><strong>Code</strong>: source code used for data cleaning, analysis, modeling, and reporting</li>
-<li><strong>Data</strong>: raw data used in the workflow, or if data is sensitive or proprietary, as much data as possible that would allow others to run our workflow (e.g., access to a restricted use file (RUF))</li>
+<li><strong>Data</strong>: raw data used in the workflow or, if the data are sensitive or proprietary, as much data as possible that would allow others to run our workflow, or details on how to access the data (e.g., access to a restricted use file (RUF))</li>
 <li><strong>Environment</strong>: environment of the project, including the R version, packages, operating system, and other dependencies used in the analysis</li>
-<li><strong>Methodology</strong>: analysis methodology, including rationale behind decisions, interpretations, and assumptions</li>
+<li><strong>Methodology</strong>: survey and analysis methodology, including rationale behind sample, questionnaire, and analysis decisions, interpretations, and assumptions</li>
 </ul>
-<p>In Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>, we briefly mention how each of these is important to include in the methodology report and when communicating the findings of a study. However, to be transparent and effective researchers, we need to ensure we not only discuss these through text but also provide files and additional information when requested. Often, when starting a project, analysts will dive into the data and make decisions as they go without full documentation, which can be challenging if we need to go back and make changes or understand even what we did a few months ago. It benefits other analysts and potentially our future selves to better document everything from the start. The good news is that many tools, practices, and project management techniques make survey analysis projects easy to reproduce. For best results, analysts should decide which techniques and tools will be used before starting a project (or very early on).</p>
+<p>In Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>, we briefly mention how each of these is important to include in the methodology report and when communicating the findings of a study. However, to be transparent and effective researchers, we need to ensure we not only discuss these through text but also provide files and additional information when requested. Often, when starting a project, we may be eager to dive into the data and make decisions as we go without full documentation. This can be challenging if we need to go back and make changes or even understand what we did a few months ago. It benefits other analysts and potentially our future selves to document everything from the start. The good news is that many tools, practices, and project management techniques make survey analysis projects easy to reproduce. For best results, we should decide which techniques and tools to use before starting a project (or very early on).</p>
 <p>This chapter covers some of our suggestions for tools and techniques we can use in projects. This list is not comprehensive but aims to provide a starting point for those looking to create a reproducible workflow.</p>
 </div>
 <div id="project-based-workflows" class="section level2 hasAnchor" number="9.2">
 <h2><span class="header-section-number">9.2</span> Project-based workflows<a href="c09-reprex-data.html#project-based-workflows" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>We recommend a project-based workflow for analysis projects as described by <span class="citation">Wickham, Çetinkaya-Rundel, and Grolemund (<a href="#ref-wickham2023r4ds">2023</a>)</span>. A project-based workflow maintains a “source of truth” for our analyses. It helps with file system discipline by putting everything related to a project in a designated folder. Since all associated files are in a single location, they are easy to find and organize. When we reopen the project, we can recreate the environment in which we originally ran the code to reproduce our results.</p>
-<p>The RStudio IDE has built-in support for projects. When we create a project in RStudio, it creates a <code>.Rproj</code> file that store settings specific to that project. Once we have created a project, we can create folders that help us organize our workflow. For example, a project directory could look like this:</p>
+<p>The RStudio IDE has built-in support for projects. When we create a project in RStudio, it creates an <code>.Rproj</code> file that stores settings specific to that project. Once we have created a project, we can create folders that help us organize our workflow. For example, a project directory could look like this:</p>
 <pre><code>| anes_analysis/
   | anes_analysis.Rproj
   | README.md
@@ -555,27 +555,27 @@ <h2><span class="header-section-number">9.2</span> Project-based workflows<a hre
     | anes_report.Rmd
     | anes_report.html
     | anes_report.pdf</code></pre>
-<p>In a project-based workflow, all paths are relative and, by default, relative to the project’s folder. By using relative paths, others can open and run our files even if their directory configuration differs from ours. The {here} package enables easy file referencing, and we can start with using the <code>here::here()</code> function to build the path for loading or saving data <span class="citation">(<a href="#ref-R-here">Müller 2020</a>)</span>. Below, we ask R to read the CSV file <code>anes_2020.csv</code> in the project directory’s <code>data</code> folder:</p>
+<p>In a project-based workflow, all paths are relative and, by default, relative to the folder the <code>.Rproj</code> file is located in. By using relative paths, others can open and run our files even if their directory configuration differs from ours (e.g., Mac and Windows users have different directory path structures). The {here} package enables easy file referencing, and we can start by using the <code>here::here()</code> function to build the path for loading or saving data <span class="citation">(<a href="#ref-R-here">Müller 2020</a>)</span>. Below, we ask R to read the CSV file <code>anes2020_clean.csv</code> in the project directory’s <code>data</code> folder:</p>
 <div class="sourceCode" id="cb275"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb275-1"><a href="c09-reprex-data.html#cb275-1" tabindex="-1"></a>anes <span class="ot">&lt;-</span> </span>
 <span id="cb275-2"><a href="c09-reprex-data.html#cb275-2" tabindex="-1"></a>  <span class="fu">read_csv</span>(here<span class="sc">::</span><span class="fu">here</span>(<span class="st">&quot;data&quot;</span>, <span class="st">&quot;anes2020_clean.csv&quot;</span>))</span></code></pre></div>
 <p>The combination of projects and the {here} package keeps all associated files organized. This workflow makes it more likely that our analyses can be reproduced by us or our colleagues.</p>
 </div>
 <div id="functions-and-packages" class="section level2 hasAnchor" number="9.3">
 <h2><span class="header-section-number">9.3</span> Functions and packages<a href="c09-reprex-data.html#functions-and-packages" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>We may find ourselves repeating ourselves in our script, and the chances of errors increases whenever we copy and paste our code. By creating a function, we can create a consistent set of commands that reduce the likelihood of mistakes. Functions also organize our code, improve the code readability, and allow others to execute the same commands. Throughout this book, we have created functions, such as in Chapter <a href="c13-ncvs-vignette.html#c13-ncvs-vignette">13</a>, to run sequences of rename, filter, group_by, and summarize statements across different variables. The function helps us avoid overlooking necessary steps.</p>
+<p>We may find ourselves repeating code in our script, and the chance of errors increases whenever we copy and paste our code. By creating a function, we can create a consistent set of commands that reduces the likelihood of mistakes. Functions also organize our code, improve code readability, and allow others to execute the same commands. For example, in Chapter <a href="c13-ncvs-vignette.html#c13-ncvs-vignette">13</a>, we create a function to run sequences of <code>rename()</code>, <code>filter()</code>, <code>group_by()</code>, and <code>summarize()</code> statements across different variables. Creating functions helps us avoid overlooking necessary steps.</p>
 <p>A package is made up of a collection of functions. If we find ourselves sharing functions with others to replicate the same series of commands in a separate project, creating a package can be a useful tool for sharing the code along with data and documentation.</p>
 </div>
 <div id="version-control-with-git" class="section level2 hasAnchor" number="9.4">
 <h2><span class="header-section-number">9.4</span> Version control with Git<a href="c09-reprex-data.html#version-control-with-git" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Often, a survey analysis project produces a lot of code. Keeping track of the latest version can become challenging as files evolve throughout a project. If a team of analysts is working on the same script, someone may use an outdated version, resulting in incorrect results or redundant work.</p>
-<p>Version control systems like Git can help alleviate these pains. Git is a system that helps track changes in computer files. Analysts can use Git to follow code evaluation and manage asynchronous work. With Git, it is easy to see any changes made in a script, revert changes, and resolve differences between code versions (called conflicts).</p>
-<p>Services such as GitHub or GitLab provide hosting and sharing of files as well as version control with Git. For example, we can visit the GitHub repository for this book (<a href="https://github.com/tidy-survey-r/tidy-survey-book">https://github.com/tidy-survey-r/tidy-survey-book</a>) and see the files that build the book, when they were committed to the repository, and the history of modifications over time.</p>
+<p>Version control systems like Git can help alleviate these pains. Git is a system that tracks changes in files. We can use Git to follow code evolution and manage asynchronous work. With Git, it is easy to see any changes made in a script, revert changes, and resolve differences between code versions (called conflicts).</p>
+<p>Services such as GitHub or GitLab provide hosting and sharing of files as well as version control with Git. For example, we can visit the <a href="https://github.com/tidy-survey-r/tidy-survey-book">GitHub repository for this book</a> and see the files that build the book, when they were committed to the repository, and the history of modifications over time.</p>
 <p>In addition to code scripts, platforms like GitHub can store data and documentation. They provide a way to maintain a history of data modifications through versioning and timestamps. By saving the data and documentation alongside the code, it becomes easier for others to refer to and access everything they need in one place.</p>
-<p>Using version control in analysis projects makes collaboration and maintenance more manageable. For connecting Git with R, we recommend <span class="citation">Bryan (<a href="#ref-git-w-R">2023</a>)</span>.</p>
+<p>Using version control in analysis projects makes collaboration and maintenance more manageable. To connect Git with R, we recommend the book <a href="https://happygitwithr.com/">Happy Git and GitHub for the useR</a> <span class="citation">(<a href="#ref-git-w-R">Bryan 2023</a>)</span>.</p>
 </div>
 <div id="package-management-with-renv" class="section level2 hasAnchor" number="9.5">
 <h2><span class="header-section-number">9.5</span> Package management with {renv}<a href="c09-reprex-data.html#package-management-with-renv" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Ensuring reproducibility involves not only using version control of code, but also managing the versions of packages. If two people run the same code but use different versions of a package, the results might differ because of changes in those packages. For example, this book currently uses a version of the {srvyr} package from GitHub and not from CRAN. This is because the version of {srvyr} on CRAN has some bugs (errors) that result in incorrect calculations. The version on GitHub has corrected these errors, so we have asked readers to install the GitHub version to obtain the same results.</p>
+<p>Ensuring reproducibility involves not only using version control of code but also managing the versions of packages. If two people run the same code but use different package versions, the results might differ because of changes to those packages. For example, this book currently uses a version of the {srvyr} package from GitHub and not from CRAN. This is because the version of {srvyr} on CRAN has some bugs (errors) that result in incorrect calculations. The version on GitHub has corrected these errors, so we have asked readers to install the GitHub version to obtain the same results.</p>
 <p>One way to handle different package versions is with the {renv} package. This package allows researchers to set the versions for each package used and manage package dependencies. Specifically, {renv} creates isolated, project-specific environments that record the packages and their versions used in the code. When initiated by a new user, {renv} checks whether the installed packages are consistent with the recorded version for the project. If not, it installs the appropriate versions so that others can replicate the project’s environment to rerun the code and obtain consistent results <span class="citation">(<a href="#ref-R-renv">Ushey and Wickham 2023</a>)</span>.</p>
 </div>
 <div id="r-environments-with-docker" class="section level2 hasAnchor" number="9.6">
@@ -584,8 +584,8 @@ <h2><span class="header-section-number">9.6</span> R environments with Docker<a
 </div>
 <div id="workflow-management-with-targets" class="section level2 hasAnchor" number="9.7">
 <h2><span class="header-section-number">9.7</span> Workflow management with {targets}<a href="c09-reprex-data.html#workflow-management-with-targets" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>With complex studies involving multiple code files and dependencies, it is important to ensures each step is executed in the intended sequence. We can do this manually, e.g., numbering files to indicate the order or providing detailed documentation on the order. Alternatively, we can automate the process so the code flows sequentially. Making sure that the code runs in the correct order helps ensure that the research is reproducible. Anyone should be able to pick up the set of scripts and get the same results by following the workflow.</p>
-<p>The {targets} package is growing as a popular workflow manager that documents, automates, and executes complex data workflows with multiple steps and dependencies. With this package, we first define the order of execution for our code, and then it will consistently execute the code in that order each time it is run. One beneficial feature of {targets} is that if you change code later in the workflow, only the affected code and its downstream targets (i.e., the subsequent code files) are re-executed when we change a script. The {targets} package also provides interactive progress monitoring and reporting, allowing us to track the status and progress of our analysis pipeline <span class="citation">(<a href="#ref-targets2021">Landau 2021</a>)</span>.</p>
+<p>With complex studies involving multiple code files and dependencies, it is important to ensure each step is executed in the intended sequence. We can do this manually, e.g., by numbering files to indicate the order or providing detailed documentation on the order. Alternatively, we can automate the process so the code flows sequentially. Making sure that the code runs in the correct order helps ensure that the research is reproducible. Anyone should be able to pick up the set of scripts and get the same results by following the workflow.</p>
+<p>The {targets} package is an increasingly popular workflow manager that documents, automates, and executes complex data workflows with multiple steps and dependencies. With this package, we first define the order of execution for our code, and then it consistently executes the code in that order each time it is run. One beneficial feature of {targets} is that when we change a script, only the affected code and its downstream targets (i.e., the subsequent code files) are re-executed. The {targets} package also provides interactive progress monitoring and reporting, allowing us to track the status and progress of our analysis pipeline <span class="citation">(<a href="#ref-targetslandau">Landau 2021</a>)</span>.</p>
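+<p>A hedged sketch of what a pipeline definition might look like follows. It assumes {targets} is installed and that the code is saved in a file named <code>_targets.R</code> at the project root. The file path <code>data/survey.csv</code> and the helper functions <code>read_survey()</code> and <code>summarize_survey()</code> are hypothetical and stand in for real analysis steps.</p>

```r
# Contents of a hypothetical _targets.R file at the project root
library(targets)

# Hypothetical helper functions standing in for real analysis steps
read_survey <- function(path) read.csv(path)
summarize_survey <- function(dat) summary(dat)

# The pipeline: each target names a step, and {targets} infers the
# order of execution from which targets each command refers to
list(
  tar_target(raw_file, "data/survey.csv", format = "file"), # track the input file
  tar_target(survey_data, read_survey(raw_file)),           # depends on raw_file
  tar_target(survey_summary, summarize_survey(survey_data)) # depends on survey_data
)
```

+<p>Running <code>targets::tar_make()</code> from the project root would then build the targets in dependency order and skip any that are already up to date.</p>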
 </div>
 <div id="documentation-with-quarto-and-r-markdown" class="section level2 hasAnchor" number="9.8">
 <h2><span class="header-section-number">9.8</span> Documentation with Quarto and R Markdown<a href="c09-reprex-data.html#documentation-with-quarto-and-r-markdown" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -594,7 +594,7 @@ <h2><span class="header-section-number">9.8</span> Documentation with Quarto and
 <div id="parameterization" class="section level3 hasAnchor" number="9.8.1">
 <h3><span class="header-section-number">9.8.1</span> Parameterization<a href="c09-reprex-data.html#parameterization" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Another useful feature of Quarto and R Markdown is the ability to reduce repetitive code by parameterizing the files. Parameters can control various aspects of the analysis, such as dates, geography, or other analysis variables. We can define and modify these parameters to explore different scenarios or inputs. For example, suppose we start by creating a document that provides survey analysis results for North Carolina but then later decide we want to look at another state. In that case, we can define a <code>state</code> parameter and rerun the same analysis for a state like Washington without having to edit the code throughout the document.</p>
-<p>Parameters can be defined in the header or code chunks of our Quarto or R Markdown documents and easily be modified and documented. We reduce errors that may occur by manually editing code throughout the script, and offer a flexible way for others to replicate the analysis and explore variations.</p>
+<p>Parameters can be defined in the header or code chunks of our Quarto or R Markdown documents and easily modified and documented. By avoiding manual edits to code throughout the script, we reduce the errors that such edits can introduce and offer a flexible way for others to replicate the analysis and explore variations.</p>
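+<p>One way this could look in practice is sketched below. It assumes {rmarkdown} is installed and that a hypothetical file <code>state_report.Rmd</code> declares a <code>state</code> parameter under <code>params</code> in its header and refers to it in code as <code>params$state</code>. The function name <code>render_state_report()</code> and the output file naming are illustrative choices, not part of the book's code.</p>

```r
# Sketch only: assumes a hypothetical state_report.Rmd whose header
# declares `params:` with a `state` entry.
render_state_report <- function(state, input = "state_report.Rmd") {
  rmarkdown::render(
    input,
    params = list(state = state),  # overrides the default in the header
    output_file = paste0("report-", gsub(" ", "-", tolower(state)), ".html")
  )
}

# Example calls (not run here because the input file is hypothetical):
# render_state_report("North Carolina")
# render_state_report("Washington")
```

+<p>The analysis code stays untouched between runs; only the parameter value changes.</p>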
 </div>
 </div>
 <div id="other-tips-for-reproducibility" class="section level2 hasAnchor" number="9.9">
@@ -602,26 +602,26 @@ <h2><span class="header-section-number">9.9</span> Other tips for reproducibilit
 <div id="random-number-seeds" class="section level3 hasAnchor" number="9.9.1">
 <h3><span class="header-section-number">9.9.1</span> Random number seeds<a href="c09-reprex-data.html#random-number-seeds" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Some tasks in survey analysis require randomness, such as imputation, model training, or creating random samples. By default, the random numbers generated by R change each time we rerun the code, making it difficult to reproduce the same results. By “setting the seed,” we can control the randomness and ensure that the random numbers remain consistent whenever we rerun the code. Others can use the same seed value to reproduce our random numbers and achieve the same results.</p>
-<p>In R, we can use the <code>set.seed()</code> function to control the randomness in our code. Set a seed value by providing an integer to the function:</p>
+<p>In R, we can use the <code>set.seed()</code> function to control the randomness in our code. We set a seed value by providing an integer in the function argument. The following code chunk sets a seed using <code>999</code>, then runs a random number function (<code>runif()</code>) to get five random numbers from a uniform distribution.</p>
 <div class="sourceCode" id="cb276"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb276-1"><a href="c09-reprex-data.html#cb276-1" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">999</span>)</span>
-<span id="cb276-2"><a href="c09-reprex-data.html#cb276-2" tabindex="-1"></a></span>
-<span id="cb276-3"><a href="c09-reprex-data.html#cb276-3" tabindex="-1"></a><span class="fu">runif</span>(<span class="dv">5</span>)</span></code></pre></div>
-<p>The <code>runif()</code> function generates five random numbers from a uniform distribution. Since the seed is set to <code>999</code>, running <code>runif()</code> multiple times will always produce the same sequence:</p>
-<pre><code>[1] 0.38907138 0.58306072 0.09466569 0.85263123 0.78674676</code></pre>
-<p>The choice of the seed number is up to the analyst. For example, this could be the date (<code>20240102</code>) or time of day (<code>1056</code>) when the analysis was first conducted, a phone number (<code>8675309</code>), or the first few numbers that come to mind (<code>369</code>). As long as the seed is set for a given analysis, the actual number is up to the analyst to decide. It is important to note that <code>set.seed()</code> should be used <em>before</em> random number generation. It would be unethical to run an analysis over and over to choose a seed that produces the result you want. Run it once per program, and the seed will be applied to the entire script. We recommend setting the seed at the beginning of a script, where libraries are loaded.</p>
+<span id="cb276-2"><a href="c09-reprex-data.html#cb276-2" tabindex="-1"></a><span class="fu">runif</span>(<span class="dv">5</span>)</span></code></pre></div>
+<pre><code>## [1] 0.38907 0.58306 0.09467 0.85263 0.78675</code></pre>
+<p>Because the seed determines the sequence, resetting the seed to <code>999</code> and rerunning <code>runif(5)</code> always produces the same output:</p>
+<pre><code>## [1] 0.38907 0.58306 0.09467 0.85263 0.78675</code></pre>
+<p>The choice of the seed number is up to the analyst. For example, this could be the date (<code>20240102</code>) or time of day (<code>1056</code>) when the analysis was first conducted, a phone number (<code>8675309</code>), or the first few numbers that come to mind (<code>369</code>). As long as the seed is set for a given analysis, the specific value does not matter. It is important to note that <code>set.seed()</code> should be used <strong>before</strong> random number generation. Run it once per program, and the seed is applied to the entire script. We recommend setting the seed at the beginning of a script, where libraries are loaded.</p>
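+<p>A small sketch of why the seed needs to be set before generating random numbers: the same seed reproduces the same draws, while continuing to draw without resetting the seed moves further along the sequence. The object names are illustrative.</p>

```r
set.seed(999)
first_draw <- runif(5)

# Without resetting the seed, the generator continues its sequence,
# so these five numbers differ from the first five
second_draw <- runif(5)

# Resetting the same seed restarts the sequence
set.seed(999)
third_draw <- runif(5)

identical(first_draw, third_draw)   # TRUE
identical(first_draw, second_draw)  # FALSE
```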
 </div>
 <div id="descriptive-names-and-labels" class="section level3 hasAnchor" number="9.9.2">
 <h3><span class="header-section-number">9.9.2</span> Descriptive names and labels<a href="c09-reprex-data.html#descriptive-names-and-labels" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Using descriptive variable names or labeling data can also assist with reproducible research. For example, in the ANES data, the variable names in the raw data all start with <code>V20</code> and are a string of numbers. To make things easier to reproduce, we opted to change the variable names to be more descriptive of what they contained (e.g., <code>Age</code>). This can also be done with the data values themselves. One way to accomplish this is by creating factors for categorical data, which can ensure that we know that a value of <code>1</code> really means <code>Female</code>, for example. There are other ways of handling this, such as attaching labels to the data instead of recoding variables to be descriptive (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>). As with random number seeds, the exact method is up to the analyst, but providing this information can help ensure our research is reproducible.</p>
+<p>Using descriptive variable names or labeling data can also assist with reproducible research. For example, in the ANES data, the variable names in the raw data all start with <code>V20</code> and are a string of numbers. To make things easier to reproduce in this book, we opted to change the variable names to be more descriptive of what they contained (e.g., <code>Age</code>). This can also be done with the data values themselves. One way to accomplish this is by creating factors for categorical data, which can ensure that we know that a value of <code>1</code> really means <code>Female</code>, for example. There are other ways of handling this, such as attaching labels to the data instead of recoding variables to be descriptive (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>). As with random number seeds, the exact method is up to the analyst, but providing this information can help ensure our research is reproducible.</p>
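+<p>As an illustration of both ideas, the sketch below renames a column and converts numeric codes to a labeled factor. The data are made up: the raw column names <code>V209998</code> and <code>V209999</code> are hypothetical placeholders rather than real ANES variables, and mapping <code>2</code> to <code>Male</code> is an assumption added only to complete the example.</p>

```r
# Made-up data with raw-style column names (hypothetical, not real ANES names)
raw <- data.frame(
  V209998 = c(34, 58, 41, 27, 66),
  V209999 = c(1, 2, 2, 1, 1)
)

# Give the columns descriptive names
names(raw) <- c("Age", "SexCode")

# Convert numeric codes to a factor so the meaning travels with the data;
# 1 = Female follows the text, 2 = Male is assumed for this example
raw$Sex <- factor(raw$SexCode, levels = c(1, 2), labels = c("Female", "Male"))

table(raw$Sex)
```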
 </div>
 </div>
-<div id="summary" class="section level2 hasAnchor" number="9.10">
-<h2><span class="header-section-number">9.10</span> Summary<a href="c09-reprex-data.html#summary" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>We can promote accuracy and verification of results by making our analysis reproducible. There are various tools and guides available to help you achieve reproducibility in your work, a few of which were described in this chapter. Here are additional resources to explore:</p>
+<div id="additional-resources-1" class="section level2 hasAnchor" number="9.10">
+<h2><span class="header-section-number">9.10</span> Additional resources<a href="c09-reprex-data.html#additional-resources-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
+<p>We can promote accuracy and verification of results by making our analysis reproducible. There are various tools and guides available to help achieve reproducibility in analysis work, a few of which were described in this chapter. Here are additional resources to explore:</p>
 <ul>
-<li>R for Data Science chapter on project-based workflows: <a href="https://r4ds.hadley.nz/workflow-scripts.html#projects">https://r4ds.hadley.nz/workflow-scripts.html#projects</a></li>
-<li>Building reproducible analytical pipelines with R by Bruno Rodrigues: <a href="https://raps-with-r.dev/">https://raps-with-r.dev/</a></li>
-<li>Posit Solutions Site page on reproducible environments: <a href="https://solutions.posit.co/envs-pkgs/environments/">https://solutions.posit.co/envs-pkgs/environments/</a></li>
+<li><a href="https://r4ds.hadley.nz/workflow-scripts.html#projects">R for Data Science chapter on project-based workflows</a></li>
+<li><a href="https://raps-with-r.dev/">Building reproducible analytical pipelines with R</a></li>
+<li><a href="https://solutions.posit.co/envs-pkgs/environments/">Posit Solutions Site page on reproducible environments</a></li>
 </ul>
 
 </div>
@@ -634,8 +634,8 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-git-w-R" class="csl-entry">
 Bryan, Jenny. 2023. <em>Happy Git and GitHub for the useR</em>. <a href="https://happygitwithr.com/" class="uri">https://happygitwithr.com/</a>.
 </div>
-<div id="ref-targets2021" class="csl-entry">
-Landau, William Michael. 2021. <span>“The Targets r Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.”</span> <em>Journal of Open Source Software</em> 6 (57): 2959. <a href="https://doi.org/10.21105/joss.02959">https://doi.org/10.21105/joss.02959</a>.
+<div id="ref-targetslandau" class="csl-entry">
+Landau, William Michael. 2021. <span>“The <span class="nocase">targets</span> <span>R</span> Package: A Dynamic <span>Make</span>-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.”</span> <em>Journal of Open Source Software</em> 6 (57): 2959. <a href="https://doi.org/10.21105/joss.02959">https://doi.org/10.21105/joss.02959</a>.
 </div>
 <div id="ref-R-here" class="csl-entry">
 Müller, Kirill. 2020. <em><span class="nocase">here</span>: A Simpler Way to Find Your Files</em>.
diff --git a/c10-sample-designs-replicate-weights.html b/c10-sample-designs-replicate-weights.html
index 0045d0a4..cd0c2f90 100644
--- a/c10-sample-designs-replicate-weights.html
+++ b/c10-sample-designs-replicate-weights.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,19 +524,19 @@ <h3>Prerequisites<a href="c10-sample-designs-replicate-weights.html#prereq3" cla
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb278"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb278-1"><a href="c10-sample-designs-replicate-weights.html#cb278-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb278-2"><a href="c10-sample-designs-replicate-weights.html#cb278-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
-<span id="cb278-3"><a href="c10-sample-designs-replicate-weights.html#cb278-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
-<span id="cb278-4"><a href="c10-sample-designs-replicate-weights.html#cb278-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span></code></pre></div>
-<p>To help explain the different types of sample designs, this chapter will use the <code>api</code> and <code>scd</code> data that are included in the {survey} package <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>:</p>
-<div class="sourceCode" id="cb279"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb279-1"><a href="c10-sample-designs-replicate-weights.html#cb279-1" tabindex="-1"></a><span class="fu">data</span>(api)</span>
-<span id="cb279-2"><a href="c10-sample-designs-replicate-weights.html#cb279-2" tabindex="-1"></a><span class="fu">data</span>(scd)</span></code></pre></div>
-<p>This chapter also uses data from the Residential Energy Consumption Survey (RECS) - both 2015 and 2020, which are included in the {srvyrexploR} package as <code>recs_2015</code> and <code>recs_2020</code>, respectively <span class="citation">(<a href="#ref-R-srvyrexploR">Stephanie, Rebecca, and Isabella 2024</a>)</span>.</p>
+<div class="sourceCode" id="cb279"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb279-1"><a href="c10-sample-designs-replicate-weights.html#cb279-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb279-2"><a href="c10-sample-designs-replicate-weights.html#cb279-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
+<span id="cb279-3"><a href="c10-sample-designs-replicate-weights.html#cb279-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
+<span id="cb279-4"><a href="c10-sample-designs-replicate-weights.html#cb279-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span></code></pre></div>
+<p>To help explain the different types of sample designs, this chapter uses the <code>api</code> and <code>scd</code> data that are included in the {survey} package <span class="citation">(<a href="#ref-lumley2010complex">Lumley 2010</a>)</span>:</p>
+<div class="sourceCode" id="cb280"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb280-1"><a href="c10-sample-designs-replicate-weights.html#cb280-1" tabindex="-1"></a><span class="fu">data</span>(api)</span>
+<span id="cb280-2"><a href="c10-sample-designs-replicate-weights.html#cb280-2" tabindex="-1"></a><span class="fu">data</span>(scd)</span></code></pre></div>
+<p>This chapter also uses data from the 2015 and 2020 Residential Energy Consumption Survey (RECS), which we load from the {srvyrexploR} package using the object names <code>recs_2015</code> and <code>recs_2020</code>, respectively <span class="citation">(<a href="#ref-R-srvyrexploR">Zimmer, Powell, and Velásquez 2024</a>)</span>.</p>
 </div>
 <div id="introduction-8" class="section level2 hasAnchor" number="10.1">
 <h2><span class="header-section-number">10.1</span> Introduction<a href="c10-sample-designs-replicate-weights.html#introduction-8" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>The primary reason for using packages like {survey} and {srvyr} is to account for the sampling design or replicate weights in estimates <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>; <a href="#ref-lumley2010complex">Lumley 2010</a>)</span>. By incorporating the sampling design or replicate weights, precision estimates (e.g., standard errors and confidence intervals) are appropriately calculated.</p>
-<p>In this chapter, we will introduce common sampling designs and common types of replicate weights, the mathematical methods for calculating estimates and standard errors for a given sampling design, and the R syntax to specify the sampling design or replicate weights. While we will show the math behind the estimates, the functions in these packages will do the calculation. To deeply understand the math and the derivation, refer to <span class="citation">Penn State (<a href="#ref-pennstate506">2019</a>)</span>, <span class="citation">Särndal, Swensson, and Wretman (<a href="#ref-sarndal2003model">2003</a>)</span>, <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span>, or <span class="citation">Fuller (<a href="#ref-fuller2011sampling">2011</a>)</span> (these are listed in order of increasing statistical rigorousness).</p>
+<p>In this chapter, we introduce common sampling designs and common types of replicate weights, the mathematical methods for calculating estimates and standard errors for a given sampling design, and the R syntax to specify the sampling design or replicate weights. While we show the math behind the estimates, the functions in these packages handle the calculation. To deeply understand the math and the derivation, refer to <span class="citation">Penn State (<a href="#ref-pennstate506">2019</a>)</span>, <span class="citation">Särndal, Swensson, and Wretman (<a href="#ref-sarndal2003model">2003</a>)</span>, <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span>, or <span class="citation">Fuller (<a href="#ref-fuller2011sampling">2011</a>)</span> (listed in order of increasing statistical rigor).</p>
 <p>The general process for estimation in the {srvyr} package is to:</p>
 <ol style="list-style-type: decimal">
 <li><p>Create a <code>tbl_svy</code> object (a survey object) using: <code>as_survey_design()</code> or <code>as_survey_rep()</code></p></li>
@@ -548,16 +548,16 @@ <h2><span class="header-section-number">10.1</span> Introduction<a href="c10-sam
 </div>
 <div id="common-sampling-designs" class="section level2 hasAnchor" number="10.2">
 <h2><span class="header-section-number">10.2</span> Common sampling designs<a href="c10-sample-designs-replicate-weights.html#common-sampling-designs" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>A sampling design is the method used to draw a sample. Both logistical and statistical elements are considered when developing a sampling design. When specifying a sampling design in R, the levels of sampling are specified along with the weights. The weight for each record is constructed so that the particular record represents that many units in the population. For example, in a survey of 6th-grade students in the United States, the weight associated with each responding student reflects how many 6th grade students across the country that record represents. Generally, the weights represent the inverse of the probability of selection such that the sum of the weights corresponds to the total population size, although some studies may have the sum of the weights equal to the number of respondent records.</p>
+<p>A sampling design is the method used to draw a sample. Both logistical and statistical elements are considered when developing a sampling design. When specifying a sampling design in R, we specify the levels of sampling along with the weights. The weight for each record is constructed so that the particular record represents that many units in the population. For example, in a survey of 6th-grade students in the United States, the weight associated with each responding student reflects how many 6th-grade students across the country that record represents. Generally, the weights represent the inverse of the probability of selection, such that the sum of the weights corresponds to the total population size, although some studies may have the sum of the weights equal to the number of respondent records.</p>
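+<p>To make the weight idea concrete, here is a small sketch with made-up numbers: a hypothetical population of 1,000 units from which 50 are selected with equal probability and all respond. The object names are illustrative.</p>

```r
# Hypothetical population and sample sizes
N <- 1000  # population size
n <- 50    # sample size

# Each unit is selected with probability n / N, so the weight
# (the inverse of that probability) is N / n
weight <- N / n
weight            # 20: each sampled record represents 20 population units

# One weight per sampled record; the weights sum to the population size
weights <- rep(weight, n)
sum(weights)      # 1000
```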
 <p>Some terms common across the designs are:</p>
 <ul>
 <li><strong>sample size</strong>, generally denoted as <span class="math inline">\(n\)</span>, is the number of units selected to be sampled</li>
-<li><strong>population size</strong>, generally denoted as <span class="math inline">\(N\)</span>, is the number of units in the target population</li>
+<li><strong>population size</strong>, generally denoted as <span class="math inline">\(N\)</span>, is the number of units in the population of interest</li>
 <li><strong>sampling frame</strong>, the list of units from which the sample is drawn (see Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a> for more information)</li>
 </ul>
 <div id="simple-random-sample-without-replacement" class="section level3 hasAnchor" number="10.2.1">
 <h3><span class="header-section-number">10.2.1</span> Simple random sample without replacement<a href="c10-sample-designs-replicate-weights.html#simple-random-sample-without-replacement" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>The simple random sample (SRS) without replacement is a sampling design where a fixed sample size is selected from a sampling frame, and every possible subsample has an equal probability of selection. Without replacement refers to the fact that once a sampling unit has been selected, it is removed from the sample frame and cannot be selected again.</p>
+<p>The simple random sample (SRS) without replacement is a sampling design in which a fixed sample size is selected from a sampling frame, and every possible sample of that size has an equal probability of selection. Without replacement refers to the fact that once a sampling unit has been selected, it is removed from the sampling frame and cannot be selected again.</p>
 <ul>
 <li><strong>Requirements</strong>: The sampling frame must include the entire population.</li>
 <li><strong>Advantages</strong>: SRS requires no information about the units apart from contact information.</li>
@@ -572,7 +572,7 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math" class="
 <p>The estimate of the standard error of the mean is:</p>
 <p><span class="math display">\[se(\bar{y})=\sqrt{\frac{s^2}{n}\left( 1-\frac{n}{N} \right)}\]</span> where</p>
 <p><span class="math display">\[s^2=\frac{1}{n-1}\sum_{i=1}^n\left(y_i-\bar{y}\right)^2.\]</span></p>
-<p>and <span class="math inline">\(N\)</span> is the population size. This standard error estimate might look very similar to equations in other applications except for the part on the right side of the equation: <span class="math inline">\(1-\frac{n}{N}\)</span>. This is called the finite population correction (FPC) factor. If the size of the frame, <span class="math inline">\(N\)</span>, is very large in comparison to the sample, the FPC is negligible, so it is often ignored. A common guideline is if the sample is less than 10% of the population, the FPC is negligible.</p>
+<p>and <span class="math inline">\(N\)</span> is the population size. This standard error estimate might look very similar to equations in other statistical applications except for the part on the right side of the equation: <span class="math inline">\(1-\frac{n}{N}\)</span>. This is called the finite population correction (FPC) factor. If the size of the frame, <span class="math inline">\(N\)</span>, is very large in comparison to the sample, the FPC is negligible, so it is often ignored. A common guideline is that if the sample is less than 10% of the population, the FPC is negligible.</p>
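+<p>As a rough illustration of this guideline, consider a hypothetical sample of <span class="math inline">\(n=200\)</span> drawn from a frame of <span class="math inline">\(N=6194\)</span> units (both values are assumed here purely for illustration). The sampling fraction is about 3%, so the FPC reduces the standard error by less than 2%:</p>
+<p><span class="math display">\[1-\frac{n}{N}=1-\frac{200}{6194}\approx 0.968, \qquad \sqrt{1-\frac{n}{N}}\approx 0.984\]</span></p>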
 <p>To estimate proportions, we define <span class="math inline">\(x_i\)</span> as the indicator if the outcome is observed. That is, <span class="math inline">\(x_i=1\)</span> if the outcome is observed, and <span class="math inline">\(x_i=0\)</span> if the outcome is not observed for respondent <span class="math inline">\(i\)</span>. Then the estimated proportion from an SRS design is:</p>
 <p><span class="math display">\[\hat{p}=\frac{1}{n}\sum_{i=1}^n x_i \]</span>
 and the estimated standard error of the proportion is:</p>
@@ -580,28 +580,28 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math" class="
 </div>
 <div id="the-syntax" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>If a sample was drawn through SRS and had no nonresponse or other weighting adjustments, in R, specify this design as:</p>
-<div class="sourceCode" id="cb280"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb280-1"><a href="c10-sample-designs-replicate-weights.html#cb280-1" tabindex="-1"></a>srs1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb280-2"><a href="c10-sample-designs-replicate-weights.html#cb280-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">fpc =</span> fpcvar)</span></code></pre></div>
-<p>where <code>dat</code> is a tibble or data.frame with the survey data, and <code>fpcvar</code> is a variable in the data indicating the sampling frame’s size (this variable will have the same value for all cases in an SRS design). If the frame is very large, sometimes the frame size is not provided. In that case, the FPC is not needed, and specify the design as:</p>
-<div class="sourceCode" id="cb281"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb281-1"><a href="c10-sample-designs-replicate-weights.html#cb281-1" tabindex="-1"></a>srs2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb281-2"><a href="c10-sample-designs-replicate-weights.html#cb281-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>()</span></code></pre></div>
-<p>If some post-survey adjustments were implemented and the weights are not all equal, specify the design as:</p>
-<div class="sourceCode" id="cb282"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb282-1"><a href="c10-sample-designs-replicate-weights.html#cb282-1" tabindex="-1"></a>srs3_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb282-2"><a href="c10-sample-designs-replicate-weights.html#cb282-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
-<span id="cb282-3"><a href="c10-sample-designs-replicate-weights.html#cb282-3" tabindex="-1"></a>                  <span class="at">fpc =</span> fpcvar)</span></code></pre></div>
+<p>If a sample was drawn through SRS and had no nonresponse or other weighting adjustments, in R, we specify this design as:</p>
+<div class="sourceCode" id="cb281"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb281-1"><a href="c10-sample-designs-replicate-weights.html#cb281-1" tabindex="-1"></a>srs1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb281-2"><a href="c10-sample-designs-replicate-weights.html#cb281-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">fpc =</span> fpcvar)</span></code></pre></div>
+<p>where <code>dat</code> is a tibble or data.frame with the survey data, and <code>fpcvar</code> is a variable in the data indicating the sampling frame’s size (this variable has the same value for all cases in an SRS design). If the frame is very large, sometimes the frame size is not provided. In that case, the FPC is not needed, and we specify the design as:</p>
+<div class="sourceCode" id="cb282"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb282-1"><a href="c10-sample-designs-replicate-weights.html#cb282-1" tabindex="-1"></a>srs2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb282-2"><a href="c10-sample-designs-replicate-weights.html#cb282-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>()</span></code></pre></div>
+<p>If some post-survey adjustments were implemented and the weights are not all equal, we specify the design as:</p>
+<div class="sourceCode" id="cb283"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb283-1"><a href="c10-sample-designs-replicate-weights.html#cb283-1" tabindex="-1"></a>srs3_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb283-2"><a href="c10-sample-designs-replicate-weights.html#cb283-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
+<span id="cb283-3"><a href="c10-sample-designs-replicate-weights.html#cb283-3" tabindex="-1"></a>                  <span class="at">fpc =</span> fpcvar)</span></code></pre></div>
 <p>where <code>wtvar</code> is a variable in the data indicating the weight for each case. Again, the FPC can be omitted if it is unnecessary because the frame is large compared to the sample size.</p>
 </div>
 <div id="example-2" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-2" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The {survey} package in R provides some example datasets that we will use throughout this chapter. The documentation provides detailed information about the variables. One of the example datasets we will use is from the Academic Performance Index (API). The API was a program administered by the California Department of Education, and the {survey} package includes a population file (sample frame) of all schools with at least 100 students and several different samples pulled from that data using different sampling methods. For this first example, we will use the <code>apisrs</code> dataset, which contains an SRS of 200 schools. For printing purposes, we create a new dataset called <code>apisrs_slim</code>, which sorts the data by the school district and school ID and subsets the data to only a few columns. The SRS sample data is illustrated below:</p>
-<div class="sourceCode" id="cb283"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb283-1"><a href="c10-sample-designs-replicate-weights.html#cb283-1" tabindex="-1"></a>apisrs_slim <span class="ot">&lt;-</span></span>
-<span id="cb283-2"><a href="c10-sample-designs-replicate-weights.html#cb283-2" tabindex="-1"></a> apisrs <span class="sc">%&gt;%</span></span>
-<span id="cb283-3"><a href="c10-sample-designs-replicate-weights.html#cb283-3" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb283-4"><a href="c10-sample-designs-replicate-weights.html#cb283-4" tabindex="-1"></a> <span class="fu">arrange</span>(dnum, snum) <span class="sc">%&gt;%</span></span>
-<span id="cb283-5"><a href="c10-sample-designs-replicate-weights.html#cb283-5" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname, fpc, pw)</span>
-<span id="cb283-6"><a href="c10-sample-designs-replicate-weights.html#cb283-6" tabindex="-1"></a></span>
-<span id="cb283-7"><a href="c10-sample-designs-replicate-weights.html#cb283-7" tabindex="-1"></a>apisrs_slim</span></code></pre></div>
+<p>The {survey} package in R provides some example datasets that we use throughout this chapter. The documentation provides detailed information about the variables. One of the example datasets we use is from the Academic Performance Index (API). The API was a program administered by the California Department of Education, and the {survey} package includes a population file (sample frame) of all schools with at least 100 students and several different samples pulled from that data using different sampling methods. For this first example, we use the <code>apisrs</code> dataset, which contains an SRS of 200 schools. For printing purposes, we create a new dataset called <code>apisrs_slim</code>, which sorts the data by the school district and school ID and subsets the data to only a few columns. The SRS sample data are illustrated below:</p>
+<div class="sourceCode" id="cb284"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb284-1"><a href="c10-sample-designs-replicate-weights.html#cb284-1" tabindex="-1"></a>apisrs_slim <span class="ot">&lt;-</span></span>
+<span id="cb284-2"><a href="c10-sample-designs-replicate-weights.html#cb284-2" tabindex="-1"></a> apisrs <span class="sc">%&gt;%</span></span>
+<span id="cb284-3"><a href="c10-sample-designs-replicate-weights.html#cb284-3" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb284-4"><a href="c10-sample-designs-replicate-weights.html#cb284-4" tabindex="-1"></a> <span class="fu">arrange</span>(dnum, snum) <span class="sc">%&gt;%</span></span>
+<span id="cb284-5"><a href="c10-sample-designs-replicate-weights.html#cb284-5" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname, fpc, pw)</span>
+<span id="cb284-6"><a href="c10-sample-designs-replicate-weights.html#cb284-6" tabindex="-1"></a></span>
+<span id="cb284-7"><a href="c10-sample-designs-replicate-weights.html#cb284-7" tabindex="-1"></a>apisrs_slim</span></code></pre></div>
 <pre><code>## # A tibble: 200 × 7
 ##    cds             dnum  snum dname                   sname    fpc    pw
 ##    &lt;chr&gt;          &lt;int&gt; &lt;dbl&gt; &lt;chr&gt;                   &lt;chr&gt;  &lt;dbl&gt; &lt;dbl&gt;
@@ -656,12 +656,12 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-2" class="
 </tr>
 </tbody>
 </table>
-<p>To create the <code>tbl_survey</code> object for this SRS data, the design should be specified as follows:</p>
-<div class="sourceCode" id="cb285"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb285-1"><a href="c10-sample-designs-replicate-weights.html#cb285-1" tabindex="-1"></a>apisrs_des <span class="ot">&lt;-</span> apisrs_slim <span class="sc">%&gt;%</span></span>
-<span id="cb285-2"><a href="c10-sample-designs-replicate-weights.html#cb285-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> pw, </span>
-<span id="cb285-3"><a href="c10-sample-designs-replicate-weights.html#cb285-3" tabindex="-1"></a>                  <span class="at">fpc =</span> fpc)</span>
-<span id="cb285-4"><a href="c10-sample-designs-replicate-weights.html#cb285-4" tabindex="-1"></a></span>
-<span id="cb285-5"><a href="c10-sample-designs-replicate-weights.html#cb285-5" tabindex="-1"></a>apisrs_des</span></code></pre></div>
+<p>To create the <code>tbl_survey</code> object for the SRS data, we specify the design as:</p>
+<div class="sourceCode" id="cb286"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb286-1"><a href="c10-sample-designs-replicate-weights.html#cb286-1" tabindex="-1"></a>apisrs_des <span class="ot">&lt;-</span> apisrs_slim <span class="sc">%&gt;%</span></span>
+<span id="cb286-2"><a href="c10-sample-designs-replicate-weights.html#cb286-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> pw, </span>
+<span id="cb286-3"><a href="c10-sample-designs-replicate-weights.html#cb286-3" tabindex="-1"></a>                  <span class="at">fpc =</span> fpc)</span>
+<span id="cb286-4"><a href="c10-sample-designs-replicate-weights.html#cb286-4" tabindex="-1"></a></span>
+<span id="cb286-5"><a href="c10-sample-designs-replicate-weights.html#cb286-5" tabindex="-1"></a>apisrs_des</span></code></pre></div>
 <pre><code>## Independent Sampling design
 ## Called via srvyr
 ## Sampling variables:
@@ -671,8 +671,8 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-2" class="
 ## Data variables: 
 ##   - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), fpc
 ##     (dbl), pw (dbl)</code></pre>
-<p>In the printed design object above, the design is described as an “Independent Sampling design,” which is another term for SRS. The ids are specified as <code>1</code>, which means there is no clustering (a topic described in Section <a href="c10-sample-designs-replicate-weights.html#samp-cluster">10.2.4</a>), the FPC variable is indicated, and the weights are indicated. We can also look at the summary of the design object, and see the distribution of the probabilities (inverse of the weights) along with the population size and a list of the variables in the dataset.</p>
-<div class="sourceCode" id="cb287"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb287-1"><a href="c10-sample-designs-replicate-weights.html#cb287-1" tabindex="-1"></a><span class="fu">summary</span>(apisrs_des)</span></code></pre></div>
+<p>In the printed design object, the design is described as an “Independent Sampling design,” which is another term for SRS. The ids are specified as <code>1</code>, which means there is no clustering (a topic described in Section <a href="c10-sample-designs-replicate-weights.html#samp-cluster">10.2.4</a>), the FPC variable is indicated, and the weights are indicated. We can also look at the summary of the design object (<code>summary()</code>), and see the distribution of the probabilities (inverse of the weights) along with the population size and a list of the variables in the dataset.</p>
+<div class="sourceCode" id="cb288"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb288-1"><a href="c10-sample-designs-replicate-weights.html#cb288-1" tabindex="-1"></a><span class="fu">summary</span>(apisrs_des)</span></code></pre></div>
 <pre><code>## Independent Sampling design
 ## Called via srvyr
 ## Probabilities:
@@ -712,29 +712,29 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-1" class
 </div>
 <div id="the-syntax-1" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-1" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>If we had a sample that was drawn through SRSWR and had no nonresponse or other weighting adjustments, in R, we should specify this design as:</p>
-<div class="sourceCode" id="cb289"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb289-1"><a href="c10-sample-designs-replicate-weights.html#cb289-1" tabindex="-1"></a>srswr1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb289-2"><a href="c10-sample-designs-replicate-weights.html#cb289-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>()</span></code></pre></div>
-<p>where <code>dat</code> is a tibble or data.frame containing our survey data. This syntax is the same as a SRS design, except a finite population correction (FPC) is not included. This is because when you claculate a sample with replacement, the population pool to select from is no longer finite, so a correction is not needed. Therefore, with large populations where the FPC is negligble, the underlying formulas for SRS and SRSWR designs are the same.</p>
-<p>If some post-survey adjustments were implemented and the weights are not all equal, specify the design as:</p>
-<div class="sourceCode" id="cb290"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb290-1"><a href="c10-sample-designs-replicate-weights.html#cb290-1" tabindex="-1"></a>srswr2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb290-2"><a href="c10-sample-designs-replicate-weights.html#cb290-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar)</span></code></pre></div>
-<p>where <code>wtvar</code> is the variable for the weight on the data.</p>
+<p>If we had a sample that was drawn through SRSWR and had no nonresponse or other weighting adjustments, in R, we specify this design as:</p>
+<div class="sourceCode" id="cb290"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb290-1"><a href="c10-sample-designs-replicate-weights.html#cb290-1" tabindex="-1"></a>srswr1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb290-2"><a href="c10-sample-designs-replicate-weights.html#cb290-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>()</span></code></pre></div>
+<p>where <code>dat</code> is a tibble or data.frame containing our survey data. This syntax is the same as an SRS design, except a finite population correction (FPC) is not included. This is because when selecting a sample with replacement, the population pool to select from is no longer finite, so a correction is not needed. Therefore, with large populations where the FPC is negligible, the underlying formulas for SRS and SRSWR designs are the same.</p>
+<p>If some post-survey adjustments were implemented and the weights are not all equal, we specify the design as:</p>
+<div class="sourceCode" id="cb291"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb291-1"><a href="c10-sample-designs-replicate-weights.html#cb291-1" tabindex="-1"></a>srswr2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb291-2"><a href="c10-sample-designs-replicate-weights.html#cb291-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar)</span></code></pre></div>
+<p>where <code>wtvar</code> is the variable for the weight of the data.</p>
 </div>
 <div id="example-3" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-3" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The {survey} package does not include an example of SRSWR, so to illustrate this design we need to create an example. We use the api population data provided by the {survey} package <code>apipop</code> and select a sample of 200 cases using the <code>slice_sample()</code> function from the tidyverse. One of the arguments in the <code>slice_sample()</code> function is <code>replace</code>. If <code>replace=TRUE</code>, then we are conducting a SRSWR. We then calculate selection weights as the inverse of the probability of selection and call this new dataset <code>apisrswr</code>.</p>
-<div class="sourceCode" id="cb291"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb291-1"><a href="c10-sample-designs-replicate-weights.html#cb291-1" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">409963</span>)</span>
-<span id="cb291-2"><a href="c10-sample-designs-replicate-weights.html#cb291-2" tabindex="-1"></a></span>
-<span id="cb291-3"><a href="c10-sample-designs-replicate-weights.html#cb291-3" tabindex="-1"></a>apisrswr <span class="ot">&lt;-</span> apipop <span class="sc">%&gt;%</span></span>
-<span id="cb291-4"><a href="c10-sample-designs-replicate-weights.html#cb291-4" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb291-5"><a href="c10-sample-designs-replicate-weights.html#cb291-5" tabindex="-1"></a> <span class="fu">slice_sample</span>(<span class="at">n =</span> <span class="dv">200</span>, <span class="at">replace =</span> <span class="cn">TRUE</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb291-6"><a href="c10-sample-designs-replicate-weights.html#cb291-6" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname) <span class="sc">%&gt;%</span></span>
-<span id="cb291-7"><a href="c10-sample-designs-replicate-weights.html#cb291-7" tabindex="-1"></a> <span class="fu">mutate</span>(</span>
-<span id="cb291-8"><a href="c10-sample-designs-replicate-weights.html#cb291-8" tabindex="-1"></a>  <span class="at">weight =</span> <span class="fu">nrow</span>(apipop)<span class="sc">/</span><span class="dv">200</span></span>
-<span id="cb291-9"><a href="c10-sample-designs-replicate-weights.html#cb291-9" tabindex="-1"></a> )</span>
-<span id="cb291-10"><a href="c10-sample-designs-replicate-weights.html#cb291-10" tabindex="-1"></a></span>
-<span id="cb291-11"><a href="c10-sample-designs-replicate-weights.html#cb291-11" tabindex="-1"></a><span class="fu">head</span>(apisrswr)</span></code></pre></div>
+<p>The {survey} package does not include an example of SRSWR, so to illustrate this design, we need to create an example. We use the API population data provided by the {survey} package (<code>apipop</code>) and select a sample of 200 cases using the <code>slice_sample()</code> function from the tidyverse. One of the arguments in the <code>slice_sample()</code> function is <code>replace</code>. If <code>replace=TRUE</code>, then we are conducting an SRSWR. We then calculate selection weights as the inverse of the probability of selection and call this new dataset <code>apisrswr</code>.</p>
+<div class="sourceCode" id="cb292"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb292-1"><a href="c10-sample-designs-replicate-weights.html#cb292-1" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">409963</span>)</span>
+<span id="cb292-2"><a href="c10-sample-designs-replicate-weights.html#cb292-2" tabindex="-1"></a></span>
+<span id="cb292-3"><a href="c10-sample-designs-replicate-weights.html#cb292-3" tabindex="-1"></a>apisrswr <span class="ot">&lt;-</span> apipop <span class="sc">%&gt;%</span></span>
+<span id="cb292-4"><a href="c10-sample-designs-replicate-weights.html#cb292-4" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb292-5"><a href="c10-sample-designs-replicate-weights.html#cb292-5" tabindex="-1"></a> <span class="fu">slice_sample</span>(<span class="at">n =</span> <span class="dv">200</span>, <span class="at">replace =</span> <span class="cn">TRUE</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb292-6"><a href="c10-sample-designs-replicate-weights.html#cb292-6" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname) <span class="sc">%&gt;%</span></span>
+<span id="cb292-7"><a href="c10-sample-designs-replicate-weights.html#cb292-7" tabindex="-1"></a> <span class="fu">mutate</span>(</span>
+<span id="cb292-8"><a href="c10-sample-designs-replicate-weights.html#cb292-8" tabindex="-1"></a>  <span class="at">weight =</span> <span class="fu">nrow</span>(apipop)<span class="sc">/</span><span class="dv">200</span></span>
+<span id="cb292-9"><a href="c10-sample-designs-replicate-weights.html#cb292-9" tabindex="-1"></a> )</span>
+<span id="cb292-10"><a href="c10-sample-designs-replicate-weights.html#cb292-10" tabindex="-1"></a></span>
+<span id="cb292-11"><a href="c10-sample-designs-replicate-weights.html#cb292-11" tabindex="-1"></a><span class="fu">head</span>(apisrswr)</span></code></pre></div>
 <pre><code>## # A tibble: 6 × 6
 ##   cds             dnum  snum dname                    sname       weight
 ##   &lt;chr&gt;          &lt;int&gt; &lt;dbl&gt; &lt;chr&gt;                    &lt;chr&gt;        &lt;dbl&gt;
@@ -744,11 +744,11 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-3" class="
 ## 4 07617056003719   346   377 Knightsen Elementary     Knightsen …   31.0
 ## 5 19650606023022   744  2351 Torrance Unified         Carr (Evel…   31.0
 ## 6 01611196090120     6    13 Alameda City Unified     Paden (Wil…   31.0</code></pre>
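+<p>As a check on the weight shown above, and assuming the population file <code>apipop</code> contains 6,194 schools (a count that is not printed in this output), the weight formula used in the code gives every sampled case the same value:</p>
+<p><span class="math display">\[w=\frac{N}{n}=\frac{6194}{200}=30.97\approx 31.0\]</span></p>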
-<p>Because this is a SRS design <em>with replacement</em>, there will be duplicates in the data. It is important to keep the duplicates in the data for proper estimation, but for reference we can view the duplicates in the example data we just created.</p>
-<div class="sourceCode" id="cb293"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb293-1"><a href="c10-sample-designs-replicate-weights.html#cb293-1" tabindex="-1"></a>apisrswr <span class="sc">%&gt;%</span></span>
-<span id="cb293-2"><a href="c10-sample-designs-replicate-weights.html#cb293-2" tabindex="-1"></a> <span class="fu">group_by</span>(cds) <span class="sc">%&gt;%</span></span>
-<span id="cb293-3"><a href="c10-sample-designs-replicate-weights.html#cb293-3" tabindex="-1"></a> <span class="fu">filter</span>(<span class="fu">n</span>()<span class="sc">&gt;</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb293-4"><a href="c10-sample-designs-replicate-weights.html#cb293-4" tabindex="-1"></a> <span class="fu">arrange</span>(cds)</span></code></pre></div>
+<p>Because this is an SRS design <em>with replacement</em>, there may be duplicates in the data. It is important to keep the duplicates in the data for proper estimation, but for reference, we can view the duplicates in the example data we just created.</p>
+<div class="sourceCode" id="cb294"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb294-1"><a href="c10-sample-designs-replicate-weights.html#cb294-1" tabindex="-1"></a>apisrswr <span class="sc">%&gt;%</span></span>
+<span id="cb294-2"><a href="c10-sample-designs-replicate-weights.html#cb294-2" tabindex="-1"></a> <span class="fu">group_by</span>(cds) <span class="sc">%&gt;%</span></span>
+<span id="cb294-3"><a href="c10-sample-designs-replicate-weights.html#cb294-3" tabindex="-1"></a> <span class="fu">filter</span>(<span class="fu">n</span>()<span class="sc">&gt;</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb294-4"><a href="c10-sample-designs-replicate-weights.html#cb294-4" tabindex="-1"></a> <span class="fu">arrange</span>(cds)</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 6
 ## # Groups:   cds [2]
 ##   cds             dnum  snum dname                 sname          weight
@@ -757,11 +757,11 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-3" class="
 ## 2 15633216008841    41   869 Bakersfield City Elem Chipman Junio…   31.0
 ## 3 39686766042782   716  4880 Stockton City Unified Tyler Skills …   31.0
 ## 4 39686766042782   716  4880 Stockton City Unified Tyler Skills …   31.0</code></pre>
-<p>We created a weight variable in this example data, which is the inverse of the probability of selection. To specify the sampling design for <code>apisrswr</code>, the following syntax should be used:</p>
-<div class="sourceCode" id="cb295"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb295-1"><a href="c10-sample-designs-replicate-weights.html#cb295-1" tabindex="-1"></a>apisrswr_des <span class="ot">&lt;-</span> apisrswr <span class="sc">%&gt;%</span></span>
-<span id="cb295-2"><a href="c10-sample-designs-replicate-weights.html#cb295-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> weight)</span>
-<span id="cb295-3"><a href="c10-sample-designs-replicate-weights.html#cb295-3" tabindex="-1"></a></span>
-<span id="cb295-4"><a href="c10-sample-designs-replicate-weights.html#cb295-4" tabindex="-1"></a>apisrswr_des</span></code></pre></div>
+<p>We created a weight variable in this example data, which is the inverse of the probability of selection. We specify the sampling design for <code>apisrswr</code> as:</p>
+<div class="sourceCode" id="cb296"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb296-1"><a href="c10-sample-designs-replicate-weights.html#cb296-1" tabindex="-1"></a>apisrswr_des <span class="ot">&lt;-</span> apisrswr <span class="sc">%&gt;%</span></span>
+<span id="cb296-2"><a href="c10-sample-designs-replicate-weights.html#cb296-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> weight)</span>
+<span id="cb296-3"><a href="c10-sample-designs-replicate-weights.html#cb296-3" tabindex="-1"></a></span>
+<span id="cb296-4"><a href="c10-sample-designs-replicate-weights.html#cb296-4" tabindex="-1"></a>apisrswr_des</span></code></pre></div>
 <pre><code>## Independent Sampling design (with replacement)
 ## Called via srvyr
 ## Sampling variables:
@@ -770,7 +770,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-3" class="
 ## Data variables: 
 ##   - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), weight
 ##     (dbl)</code></pre>
-<div class="sourceCode" id="cb297"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb297-1"><a href="c10-sample-designs-replicate-weights.html#cb297-1" tabindex="-1"></a><span class="fu">summary</span>(apisrswr_des)</span></code></pre></div>
+<div class="sourceCode" id="cb298"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb298-1"><a href="c10-sample-designs-replicate-weights.html#cb298-1" tabindex="-1"></a><span class="fu">summary</span>(apisrswr_des)</span></code></pre></div>
 <pre><code>## Independent Sampling design (with replacement)
 ## Called via srvyr
 ## Probabilities:
@@ -778,14 +778,14 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-3" class="
 ##  0.0323  0.0323  0.0323  0.0323  0.0323  0.0323 
 ## Data variables:
 ## [1] &quot;cds&quot;    &quot;dnum&quot;   &quot;snum&quot;   &quot;dname&quot;  &quot;sname&quot;  &quot;weight&quot;</code></pre>
-<p>In the output above, the design object and the object summary are shown. Both note that the sampling is done “with replacement” because no FPC was specified. The probabilities, which are derived from the weights, are summarized in the summary.</p>
+<p>In the output above, the design object and the object summary are shown. Both note that the sampling is done “with replacement” because no FPC was specified. The probabilities, which are derived from the weights, are summarized in the summary function output.</p>
 </div>
 </div>
 <div id="stratified-sampling" class="section level3 hasAnchor" number="10.2.3">
 <h3><span class="header-section-number">10.2.3</span> Stratified sampling<a href="c10-sample-designs-replicate-weights.html#stratified-sampling" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Stratified sampling occurs when a population is divided into mutually exclusive subpopulations (strata), and then samples are selected independently within each stratum.</p>
 <ul>
-<li><strong>Requirements</strong>: The sampling frame must include the information to divide the population into groups for every unit.</li>
+<li><strong>Requirements</strong>: The sampling frame must include the information to divide the population into strata for every unit.</li>
 <li><strong>Advantages</strong>:
 <ul>
 <li>This design ensures sample representation in all subpopulations.</li>
@@ -793,61 +793,61 @@ <h3><span class="header-section-number">10.2.3</span> Stratified sampling<a href
 </li>
 <li>This results in a more efficient design.</li>
 </ul></li>
-<li><strong>Disadvantages</strong>: Auxiliary data may not exist to divide the sampling frame into groups, or the data may be outdated.</li>
+<li><strong>Disadvantages</strong>: Auxiliary data may not exist to divide the sampling frame into strata, or the data may be outdated.</li>
 <li><strong>Examples</strong>:
 <ul>
-<li><strong>Example 1</strong>: A population of North Carolina residents could be separated (stratified) into urban and rural areas, and then a SRS of residents from both rural and urban areas is selected independently. This ensures there are residents from both areas in the sample.</li>
-<li><strong>Example 2</strong>: Law enforcement agencies could be separated (stratified) into the three primary general-purpose categories in the US: local police, sheriff’s departments, and state police. A SRS of agencies from each of the three types is then selected independently to ensure all three types of agencies are represented.</li>
+<li><strong>Example 1</strong>: A population of North Carolina residents could be stratified into urban and rural areas, and then an SRS of residents from both rural and urban areas is selected independently. This ensures there are residents from both areas in the sample.</li>
+<li><strong>Example 2</strong>: Law enforcement agencies could be stratified into the three primary general-purpose categories in the U.S.: local police, sheriff’s departments, and state police. An SRS of agencies from each of the three types is then selected independently to ensure all three types of agencies are represented.</li>
 </ul></li>
 </ul>
 <div id="the-math-2" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-2" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Let <span class="math inline">\(\bar{y}_h\)</span> be the sample mean for stratum <span class="math inline">\(h\)</span>, <span class="math inline">\(N_h\)</span> be the population size of stratum <span class="math inline">\(h\)</span>, and <span class="math inline">\(n_h\)</span> be the sample size of stratum <span class="math inline">\(h\)</span>. Then the estimate for the population mean under stratified SRS sampling is:</p>
+<p>Let <span class="math inline">\(\bar{y}_h\)</span> be the sample mean for stratum <span class="math inline">\(h\)</span>, <span class="math inline">\(N_h\)</span> be the population size of stratum <span class="math inline">\(h\)</span>, <span class="math inline">\(n_h\)</span> be the sample size of stratum <span class="math inline">\(h\)</span>, and <span class="math inline">\(H\)</span> be the total number of strata. Then, the estimate for the population mean under stratified SRS is:</p>
 <p><span class="math display">\[\bar{y}=\frac{1}{N}\sum_{h=1}^H N_h\bar{y}_h\]</span>
 and the estimate of the standard error of <span class="math inline">\(\bar{y}\)</span> is:</p>
 <p><span class="math display">\[se(\bar{y})=\sqrt{\frac{1}{N^2} \sum_{h=1}^H N_h^2 \frac{s_h^2}{n_h}\left(1-\frac{n_h}{N_h}\right)} \]</span></p>
 <p>where
-<span class="math display">\[s_h^2=\frac{1}{n_h-1}\sum_{i=1}^{n_h}\left(y_{i,h}-\bar{y}_h\right)^2.\]</span></p>
-<p>For estimates of proportions, let <span class="math inline">\(\hat{p}_h\)</span> be the estimated proportion in stratum <span class="math inline">\(h\)</span>. Then the population proportion estimate is:</p>
+<span class="math display">\[s_h^2=\frac{1}{n_h-1}\sum_{i=1}^{n_h}\left(y_{i,h}-\bar{y}_h\right)^2\]</span></p>
+<p>For estimates of proportions, let <span class="math inline">\(\hat{p}_h\)</span> be the estimated proportion in stratum <span class="math inline">\(h\)</span>. Then, the population proportion estimate is:</p>
 <p><span class="math display">\[\hat{p}= \frac{1}{N}\sum_{h=1}^H N_h \hat{p}_h\]</span></p>
-<p>where <span class="math inline">\(H\)</span> is the total number of strata. The standard error of the proportion is:</p>
+<p>The standard error of the proportion is:</p>
 <p><span class="math display">\[se(\hat{p}) = \frac{1}{N} \sqrt{ \sum_{h=1}^H N_h^2 \frac{\hat{p}_h(1-\hat{p}_h)}{n_h-1} \left(1-\frac{n_h}{N_h}\right)}\]</span></p>
 </div>
 <div id="the-syntax-2" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-2" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>In addition to the <code>fpc</code> and <code>weights</code> arguments discussed in the types above, stratified designs requires the addition of the <code>strata</code> argument. For example, to specify a stratified SRS design in {srvyr} when using the FPC, that is, where the population sizes of the strata are not too large and are known, specify the design as:</p>
-<div class="sourceCode" id="cb299"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb299-1"><a href="c10-sample-designs-replicate-weights.html#cb299-1" tabindex="-1"></a>stsrs1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb299-2"><a href="c10-sample-designs-replicate-weights.html#cb299-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">fpc =</span> fpcvar, </span>
-<span id="cb299-3"><a href="c10-sample-designs-replicate-weights.html#cb299-3" tabindex="-1"></a>                  <span class="at">strata =</span> stratvar)</span></code></pre></div>
-<p>where <code>fpcvar</code> is a variable on our data that indicates <span class="math inline">\(N_h\)</span> for each row, and <code>stratavar</code> is a variable indicating the stratum for each row. You can omit the FPC if it is not applicable. Additionally, we can indicate the weight variable if it is present where <code>wtvar</code> is a variable on our data with a numeric weight.</p>
-<div class="sourceCode" id="cb300"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb300-1"><a href="c10-sample-designs-replicate-weights.html#cb300-1" tabindex="-1"></a>stsrs2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb300-2"><a href="c10-sample-designs-replicate-weights.html#cb300-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
+<p>In addition to the <code>fpc</code> and <code>weights</code> arguments discussed in the types above, stratified designs require the addition of the <code>strata</code> argument. For example, to specify a stratified SRS design in {srvyr} when using the FPC, that is, where the population sizes of the strata are not too large and are known, we specify the design as:</p>
+<div class="sourceCode" id="cb300"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb300-1"><a href="c10-sample-designs-replicate-weights.html#cb300-1" tabindex="-1"></a>stsrs1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb300-2"><a href="c10-sample-designs-replicate-weights.html#cb300-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">fpc =</span> fpcvar, </span>
 <span id="cb300-3"><a href="c10-sample-designs-replicate-weights.html#cb300-3" tabindex="-1"></a>                  <span class="at">strata =</span> stratvar)</span></code></pre></div>
+<p>where <code>fpcvar</code> is a variable on our data that indicates <span class="math inline">\(N_h\)</span> for each row, and <code>stratvar</code> is a variable indicating the stratum for each row. We can omit the FPC if it is not applicable. Additionally, we can indicate the weight variable if it is present, where <code>wtvar</code> is a variable on our data with a numeric weight.</p>
+<div class="sourceCode" id="cb301"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb301-1"><a href="c10-sample-designs-replicate-weights.html#cb301-1" tabindex="-1"></a>stsrs2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb301-2"><a href="c10-sample-designs-replicate-weights.html#cb301-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
+<span id="cb301-3"><a href="c10-sample-designs-replicate-weights.html#cb301-3" tabindex="-1"></a>                  <span class="at">strata =</span> stratvar)</span></code></pre></div>
 </div>
 <div id="example-4" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-4" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>In the example API data, <code>apistrat</code> is a stratified random sample, stratified by school type (<code>stype</code>) with three levels: <code>E</code> for elementary school, <code>M</code> for middle school, and <code>H</code> for high school. As with the SRS example above, we sort and select specific variables for use in printing. The data are illustrated below, including a count of the number of cases per stratum:</p>
-<div class="sourceCode" id="cb301"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb301-1"><a href="c10-sample-designs-replicate-weights.html#cb301-1" tabindex="-1"></a>apistrat_slim <span class="ot">&lt;-</span></span>
-<span id="cb301-2"><a href="c10-sample-designs-replicate-weights.html#cb301-2" tabindex="-1"></a> apistrat <span class="sc">%&gt;%</span></span>
-<span id="cb301-3"><a href="c10-sample-designs-replicate-weights.html#cb301-3" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb301-4"><a href="c10-sample-designs-replicate-weights.html#cb301-4" tabindex="-1"></a> <span class="fu">arrange</span>(dnum, snum) <span class="sc">%&gt;%</span></span>
-<span id="cb301-5"><a href="c10-sample-designs-replicate-weights.html#cb301-5" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname, stype, fpc, pw)</span>
-<span id="cb301-6"><a href="c10-sample-designs-replicate-weights.html#cb301-6" tabindex="-1"></a></span>
-<span id="cb301-7"><a href="c10-sample-designs-replicate-weights.html#cb301-7" tabindex="-1"></a>apistrat_slim <span class="sc">%&gt;%</span></span>
-<span id="cb301-8"><a href="c10-sample-designs-replicate-weights.html#cb301-8" tabindex="-1"></a> <span class="fu">count</span>(stype, fpc)</span></code></pre></div>
+<div class="sourceCode" id="cb302"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb302-1"><a href="c10-sample-designs-replicate-weights.html#cb302-1" tabindex="-1"></a>apistrat_slim <span class="ot">&lt;-</span></span>
+<span id="cb302-2"><a href="c10-sample-designs-replicate-weights.html#cb302-2" tabindex="-1"></a> apistrat <span class="sc">%&gt;%</span></span>
+<span id="cb302-3"><a href="c10-sample-designs-replicate-weights.html#cb302-3" tabindex="-1"></a> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb302-4"><a href="c10-sample-designs-replicate-weights.html#cb302-4" tabindex="-1"></a> <span class="fu">arrange</span>(dnum, snum) <span class="sc">%&gt;%</span></span>
+<span id="cb302-5"><a href="c10-sample-designs-replicate-weights.html#cb302-5" tabindex="-1"></a> <span class="fu">select</span>(cds, dnum, snum, dname, sname, stype, fpc, pw)</span>
+<span id="cb302-6"><a href="c10-sample-designs-replicate-weights.html#cb302-6" tabindex="-1"></a></span>
+<span id="cb302-7"><a href="c10-sample-designs-replicate-weights.html#cb302-7" tabindex="-1"></a>apistrat_slim <span class="sc">%&gt;%</span></span>
+<span id="cb302-8"><a href="c10-sample-designs-replicate-weights.html#cb302-8" tabindex="-1"></a> <span class="fu">count</span>(stype, fpc)</span></code></pre></div>
 <pre><code>## # A tibble: 3 × 3
 ##   stype   fpc     n
 ##   &lt;fct&gt; &lt;dbl&gt; &lt;int&gt;
 ## 1 E      4421   100
 ## 2 H       755    50
 ## 3 M      1018    50</code></pre>
-<p>The FPC is the same for each case within each stratum. This output also shows that 100 elementary schools, 50 middle schools, and 50 high schools were sampled. It is often common for the number of units sampled from each strata to be different based on the goals of the project, or to mirror the size of each strata in the population. This design should be specified as follows:</p>
-<div class="sourceCode" id="cb303"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb303-1"><a href="c10-sample-designs-replicate-weights.html#cb303-1" tabindex="-1"></a>apistrat_des <span class="ot">&lt;-</span> apistrat_slim <span class="sc">%&gt;%</span></span>
-<span id="cb303-2"><a href="c10-sample-designs-replicate-weights.html#cb303-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">strata =</span> stype,</span>
-<span id="cb303-3"><a href="c10-sample-designs-replicate-weights.html#cb303-3" tabindex="-1"></a>                   <span class="at">weights =</span> pw,</span>
-<span id="cb303-4"><a href="c10-sample-designs-replicate-weights.html#cb303-4" tabindex="-1"></a>                   <span class="at">fpc =</span> fpc)</span>
-<span id="cb303-5"><a href="c10-sample-designs-replicate-weights.html#cb303-5" tabindex="-1"></a></span>
-<span id="cb303-6"><a href="c10-sample-designs-replicate-weights.html#cb303-6" tabindex="-1"></a>apistrat_des</span></code></pre></div>
+<p>The FPC is the same for each case within each stratum. This output also shows that 100 elementary schools, 50 middle schools, and 50 high schools were sampled. It is common for the number of units sampled from each stratum to differ based on the goals of the project, or to mirror the size of each stratum in the population. We specify the design as:</p>
+<div class="sourceCode" id="cb304"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb304-1"><a href="c10-sample-designs-replicate-weights.html#cb304-1" tabindex="-1"></a>apistrat_des <span class="ot">&lt;-</span> apistrat_slim <span class="sc">%&gt;%</span></span>
+<span id="cb304-2"><a href="c10-sample-designs-replicate-weights.html#cb304-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">strata =</span> stype,</span>
+<span id="cb304-3"><a href="c10-sample-designs-replicate-weights.html#cb304-3" tabindex="-1"></a>                   <span class="at">weights =</span> pw,</span>
+<span id="cb304-4"><a href="c10-sample-designs-replicate-weights.html#cb304-4" tabindex="-1"></a>                   <span class="at">fpc =</span> fpc)</span>
+<span id="cb304-5"><a href="c10-sample-designs-replicate-weights.html#cb304-5" tabindex="-1"></a></span>
+<span id="cb304-6"><a href="c10-sample-designs-replicate-weights.html#cb304-6" tabindex="-1"></a>apistrat_des</span></code></pre></div>
 <pre><code>## Stratified Independent Sampling design
 ## Called via srvyr
 ## Sampling variables:
@@ -858,7 +858,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-4" class="
 ## Data variables: 
 ##   - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), stype
 ##     (fct), fpc (dbl), pw (dbl)</code></pre>
-<div class="sourceCode" id="cb305"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb305-1"><a href="c10-sample-designs-replicate-weights.html#cb305-1" tabindex="-1"></a><span class="fu">summary</span>(apistrat_des)</span></code></pre></div>
+<div class="sourceCode" id="cb306"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb306-1"><a href="c10-sample-designs-replicate-weights.html#cb306-1" tabindex="-1"></a><span class="fu">summary</span>(apistrat_des)</span></code></pre></div>
 <pre><code>## Stratified Independent Sampling design
 ## Called via srvyr
 ## Probabilities:
@@ -874,29 +874,29 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-4" class="
 ## 4421  755 1018 
 ## Data variables:
 ## [1] &quot;cds&quot;   &quot;dnum&quot;  &quot;snum&quot;  &quot;dname&quot; &quot;sname&quot; &quot;stype&quot; &quot;fpc&quot;   &quot;pw&quot;</code></pre>
-<p>When printing the object, it is specified as a “Stratified Independent Sampling design,” also known as a stratified SRS, and the strata variable is included. Printing the summary we see a distribution of probabilities, as we saw with SRS, but we also see the sample and populations sizes by stratum.</p>
+<p>When printing the object, it is specified as a “Stratified Independent Sampling design,” also known as a stratified SRS, and the strata variable is included. Printing the summary, we see a distribution of probabilities, as we saw with SRS, but we also see the sample and population sizes by stratum.</p>
 </div>
 </div>
 <div id="samp-cluster" class="section level3 hasAnchor" number="10.2.4">
 <h3><span class="header-section-number">10.2.4</span> Clustered sampling<a href="c10-sample-designs-replicate-weights.html#samp-cluster" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Clustered sampling occurs when a population is divided into mutually exclusive subgroups called clusters or primary sampling units (PSUs). A random selection of PSUs is sampled, and then another level of sampling is done within these clusters. There can be multiple levels of this selection. Clustered sampling is often used when a list of the entire population is not available, or data collection involves interviewers needing direct contact with respondents.</p>
+<p>Clustered sampling occurs when a population is divided into mutually exclusive subgroups called clusters or primary sampling units (PSUs). A random selection of PSUs is sampled, and then another level of sampling is done within these clusters. There can be multiple levels of this selection. Clustered sampling is often used when a list of the entire population is not available or data collection involves interviewers needing direct contact with respondents.</p>
 <ul>
-<li><strong>Requirements</strong>: There must be a way to divide the population into clusters. Clusters are commonly structural such as institutions (e.g., schools, prisons) or geography (e.g., states, counties).</li>
+<li><strong>Requirements</strong>: There must be a way to divide the population into clusters. Clusters are commonly structural, such as institutions (e.g., schools, prisons) or geography (e.g., states, counties).</li>
 <li><strong>Advantages</strong>:
 <ul>
 <li>Clustered sampling is advantageous when data collection is done in person, so interviewers are sent to specific sampled areas rather than completely at random across a country.</li>
-<li>With clustered sampling, a list of the entire population is not necessary. For example, if sampling students, we do not need a list of all students but only a list of all schools. Once the schools are sampled, lists of students can be obtained within the sampled schools.</li>
+<li>With clustered sampling, a list of the entire population is not necessary. For example, if sampling students, we do not need a list of all students, but only a list of all schools. Once the schools are sampled, lists of students can be obtained within the sampled schools.</li>
 </ul></li>
 <li><strong>Disadvantages</strong>: Compared to a simple random sample for the same sample size, clustered samples generally have larger standard errors of estimates.</li>
 <li><strong>Examples</strong>:
 <ul>
-<li><strong>Example 1</strong>: Consider a study needing a sample of 6th-grade students in the United States, no list likely exists of all these students. However, it is more likely to obtain a list of schools that have 6th graders, so a study design could select a random sample of schools that have 6th graders. The selected schools can then provide a list of students to do a second stage of sampling where 6th-grade students are randomly sampled within each of the sampled schools. This is a one-stage sample design (the one representing the number of clusters) and will be the type of design we will discuss in the formulas below.</li>
-<li><strong>Example 2</strong>: Consider a study sending interviewers to households for a survey. This is a more complicated example that requires two levels of clustering (two-stage sample design) to efficiently use interviewers in geographic clusters. First, in the U.S., counties could be selected as the PSU, then Census block groups within counties could be selected as the secondary sampling unit (SSU). Households could then be randomly sampled within the block groups. This type of design is popular for in-person surveys as it reduces the travel necessary for interviewers.</li>
+<li><strong>Example 1</strong>: Consider a study needing a sample of 6th-grade students in the United States. No list likely exists of all these students. However, a list of schools that enroll 6th graders is more likely to be available, so a study design could select a random sample of schools that enroll 6th graders. The selected schools can then provide a list of students to do a second stage of sampling where 6th-grade students are randomly sampled within each of the sampled schools. This is a one-stage sample design (the one representing the number of clusters) and is the type of design we discuss in the formulas below.</li>
+<li><strong>Example 2</strong>: Consider a study sending interviewers to households for a survey. This is a more complicated example that requires two levels of clustering (two-stage sample design) to efficiently use interviewers in geographic clusters. First, in the U.S., counties could be selected as the PSU, and then census block groups within counties could be selected as the secondary sampling unit (SSU). Households could then be randomly sampled within the block groups. This type of design is popular for in-person surveys as it reduces the travel necessary for interviewers.</li>
 </ul></li>
 </ul>
 <div id="the-math-3" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-3" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Consider a survey where a sample of <span class="math inline">\(a\)</span> clusters are sampled from a population of <span class="math inline">\(A\)</span> clusters via SRS. Units within each sampled cluster are sampled via SRS as well. Within each sampled cluster, <span class="math inline">\(i\)</span>, there are <span class="math inline">\(B_i\)</span> units and <span class="math inline">\(b_i\)</span> units are sampled via SRS. Let <span class="math inline">\(\bar{y}_{i}\)</span> be the sample mean of cluster <span class="math inline">\(i\)</span>. Then, a ratio estimator of the population mean is:</p>
+<p>Consider a survey where <span class="math inline">\(a\)</span> clusters are sampled from a population of <span class="math inline">\(A\)</span> clusters via SRS. Units within each sampled cluster are sampled via SRS as well. Within each sampled cluster, <span class="math inline">\(i\)</span>, there are <span class="math inline">\(B_i\)</span> units in the population and <span class="math inline">\(b_i\)</span> units are sampled via SRS. Let <span class="math inline">\(\bar{y}_{i}\)</span> be the sample mean of cluster <span class="math inline">\(i\)</span>. Then, a ratio estimator of the population mean is:</p>
 <p><span class="math display">\[\bar{y}=\frac{\sum_{i=1}^a B_i \bar{y}_{i}}{ \sum_{i=1}^a B_i}\]</span>
 Note this is a consistent but biased estimator. Often the population size is not known, so this is a method to estimate a mean without knowing the population size. The estimated standard error of the mean is:</p>
 <p><span class="math display">\[se(\bar{y})= \frac{1}{\hat{N}}\sqrt{\left(1-\frac{a}{A}\right)\frac{s_a^2}{a} + \frac{A}{a} \sum_{i=1}^a \left(1-\frac{b_i}{B_i}\right) \frac{s_i^2}{b_i} }\]</span>
@@ -905,37 +905,37 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-3" class
 <p><span class="math display">\[s_a^2=\frac{1}{a-1}\sum_{i=1}^a \left( \hat{y}_i - \frac{\sum_{i=1}^a \hat{y}_{i} }{a}\right)^2\]</span>
 where <span class="math inline">\(\hat{y}_i =B_i\bar{y_i}\)</span> .</p>
 <p>The formula for the within-cluster variance (<span class="math inline">\(s_i^2\)</span>) is:</p>
-<p><span class="math display">\[s_b^2=\frac{1}{a(b_i-1)} \sum_{j=1}^{b_i} \left(y_{ij}-\bar{y}_i\right)^2\]</span>
+<p><span class="math display">\[s_i^2=\frac{1}{a(b_i-1)} \sum_{j=1}^{b_i} \left(y_{ij}-\bar{y}_i\right)^2\]</span>
 where <span class="math inline">\(y_{ij}\)</span> is the outcome for sampled unit <span class="math inline">\(j\)</span> within cluster <span class="math inline">\(i\)</span>.</p>
 </div>
 <div id="the-syntax-3" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-3" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Clustered sampling designs require the addition of the <code>ids</code> argument which specifies what variables are the cluster levels. To specify a two-stage clustered design without replacement, use the following syntax:</p>
-<div class="sourceCode" id="cb307"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb307-1"><a href="c10-sample-designs-replicate-weights.html#cb307-1" tabindex="-1"></a>clus2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb307-2"><a href="c10-sample-designs-replicate-weights.html#cb307-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
-<span id="cb307-3"><a href="c10-sample-designs-replicate-weights.html#cb307-3" tabindex="-1"></a>                  <span class="at">ids =</span> <span class="fu">c</span>(PSU, SSU), </span>
-<span id="cb307-4"><a href="c10-sample-designs-replicate-weights.html#cb307-4" tabindex="-1"></a>                  <span class="at">fpc =</span> <span class="fu">c</span>(A, B))</span></code></pre></div>
-<p>where <code>PSU</code> and <code>SSU</code> are the variables indicating the PSU and SSU identifiers, and <code>A</code> and <code>B</code> are the variables indicating the population sizes for each level (i.e., <code>A</code> is the number of clusters, and <code>B</code> is the number of units within each cluster). Note that <code>A</code> will be the same for all records (within a strata), and <code>B</code> will be the same for all records within the same cluster.</p>
-<p>If clusters were sampled with replacement or from a very large population, a FPC is unnecessary. Additionally, only the first stage of selection is necessary regardless of whether the units were selected with replacement at any stage. The subsequent stages of selection are ignored in computation as their contribution to the variance is overpowered by the first stage (see <span class="citation">Särndal, Swensson, and Wretman (<a href="#ref-sarndal2003model">2003</a>)</span> or <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span> for a more in-depth discussion). Therefore, the syntax below will yield the same estimates in the end:</p>
-<div class="sourceCode" id="cb308"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb308-1"><a href="c10-sample-designs-replicate-weights.html#cb308-1" tabindex="-1"></a>clus2wra_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<p>Clustered sampling designs require the addition of the <code>ids</code> argument, which specifies which variables identify the cluster levels. To specify a two-stage clustered design without replacement, we specify the design as:</p>
+<div class="sourceCode" id="cb308"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb308-1"><a href="c10-sample-designs-replicate-weights.html#cb308-1" tabindex="-1"></a>clus2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
 <span id="cb308-2"><a href="c10-sample-designs-replicate-weights.html#cb308-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
-<span id="cb308-3"><a href="c10-sample-designs-replicate-weights.html#cb308-3" tabindex="-1"></a>                  <span class="at">ids =</span> <span class="fu">c</span>(PSU, SSU))</span>
-<span id="cb308-4"><a href="c10-sample-designs-replicate-weights.html#cb308-4" tabindex="-1"></a></span>
-<span id="cb308-5"><a href="c10-sample-designs-replicate-weights.html#cb308-5" tabindex="-1"></a>clus2wrb_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb308-6"><a href="c10-sample-designs-replicate-weights.html#cb308-6" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
-<span id="cb308-7"><a href="c10-sample-designs-replicate-weights.html#cb308-7" tabindex="-1"></a>                  <span class="at">ids =</span> PSU)</span></code></pre></div>
-<p>Note that there is one additional argument that is sometimes necessary which is <code>nest = TRUE</code>. This option relabels cluster IDs to enforce nesting within strata. Sometimes, as an example, there may be a cluster <code>1</code> and a cluster <code>2</code> within each stratum but these are actually different clusters. This option indicates that the repeated use of numbering does not mean it is the same cluster. If this option is not used and there are repeated cluster IDs across different strata, an error will be generated.</p>
+<span id="cb308-3"><a href="c10-sample-designs-replicate-weights.html#cb308-3" tabindex="-1"></a>                  <span class="at">ids =</span> <span class="fu">c</span>(PSU, SSU), </span>
+<span id="cb308-4"><a href="c10-sample-designs-replicate-weights.html#cb308-4" tabindex="-1"></a>                  <span class="at">fpc =</span> <span class="fu">c</span>(A, B))</span></code></pre></div>
+<p>where <code>PSU</code> and <code>SSU</code> are the variables indicating the PSU and SSU identifiers, and <code>A</code> and <code>B</code> are the variables indicating the population sizes for each level (i.e., <code>A</code> is the number of clusters, and <code>B</code> is the number of units within each cluster). Note that <code>A</code> is the same for all records, and <code>B</code> is the same for all records within the same cluster.</p>
+<p>If clusters were sampled with replacement or from a very large population, the FPC is unnecessary. Additionally, only the first stage of selection is necessary regardless of whether the units were selected with replacement at any stage. The subsequent stages of selection are ignored in computation as their contribution to the variance is overpowered by the first stage (see <span class="citation">Särndal, Swensson, and Wretman (<a href="#ref-sarndal2003model">2003</a>)</span> or <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span> for a more in-depth discussion). Therefore, the two design objects specified below yield the same estimates in the end:</p>
+<div class="sourceCode" id="cb309"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb309-1"><a href="c10-sample-designs-replicate-weights.html#cb309-1" tabindex="-1"></a>clus2ex1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb309-2"><a href="c10-sample-designs-replicate-weights.html#cb309-2" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
+<span id="cb309-3"><a href="c10-sample-designs-replicate-weights.html#cb309-3" tabindex="-1"></a>                  <span class="at">ids =</span> <span class="fu">c</span>(PSU, SSU))</span>
+<span id="cb309-4"><a href="c10-sample-designs-replicate-weights.html#cb309-4" tabindex="-1"></a></span>
+<span id="cb309-5"><a href="c10-sample-designs-replicate-weights.html#cb309-5" tabindex="-1"></a>clus2ex2_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb309-6"><a href="c10-sample-designs-replicate-weights.html#cb309-6" tabindex="-1"></a> <span class="fu">as_survey_design</span>(<span class="at">weights =</span> wtvar, </span>
+<span id="cb309-7"><a href="c10-sample-designs-replicate-weights.html#cb309-7" tabindex="-1"></a>                  <span class="at">ids =</span> PSU)</span></code></pre></div>
+<p>Note that one additional argument is sometimes necessary: <code>nest = TRUE</code>. This option relabels cluster IDs to enforce nesting within strata. For example, there may be a cluster <code>1</code> within each stratum, but cluster <code>1</code> in stratum <code>1</code> is a different cluster than cluster <code>1</code> in stratum <code>2</code>. This option indicates that repeated numbering does not mean it is the same cluster. If this option is not used and there are repeated cluster IDs across different strata, an error is generated.</p>
 </div>
 <div id="example-5" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-5" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The <code>survey</code> package includes a two-stage cluster sample data, <code>apiclus2</code>, in which school districts were sampled, and then a random sample of five schools was selected within each district. For districts with fewer than five schools, all schools were sampled. School districts are identified by <code>dnum</code>, and schools are identified by <code>snum</code>. The variable <code>fpc1</code> indicates how many districts there are in California (<code>A</code>), and <code>fpc2</code> indicates how many schools were in a given district with at least 100 students (<code>B</code>). The data has a row for each school. In the data printed below, there are 757 school districts, as indicated by <code>fpc1</code>, and there are nine schools in District 731, one school in District 742, two schools in District 768, and so on as indicated by <code>fpc2</code>. For illustration purposes, the object <code>apiclus2_slim</code> has been created from <code>apiclus2</code>, which subsets the data to only the necessary columns and sorts data.</p>
-<div class="sourceCode" id="cb309"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb309-1"><a href="c10-sample-designs-replicate-weights.html#cb309-1" tabindex="-1"></a>apiclus2_slim <span class="ot">&lt;-</span></span>
-<span id="cb309-2"><a href="c10-sample-designs-replicate-weights.html#cb309-2" tabindex="-1"></a>  apiclus2 <span class="sc">%&gt;%</span></span>
-<span id="cb309-3"><a href="c10-sample-designs-replicate-weights.html#cb309-3" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb309-4"><a href="c10-sample-designs-replicate-weights.html#cb309-4" tabindex="-1"></a>  <span class="fu">arrange</span>(<span class="fu">desc</span>(dnum), snum) <span class="sc">%&gt;%</span></span>
-<span id="cb309-5"><a href="c10-sample-designs-replicate-weights.html#cb309-5" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, snum, fpc1, fpc2, pw)</span>
-<span id="cb309-6"><a href="c10-sample-designs-replicate-weights.html#cb309-6" tabindex="-1"></a></span>
-<span id="cb309-7"><a href="c10-sample-designs-replicate-weights.html#cb309-7" tabindex="-1"></a>apiclus2_slim</span></code></pre></div>
+<p>The <code>survey</code> package includes a two-stage cluster sample dataset, <code>apiclus2</code>, in which school districts were sampled, and then a random sample of five schools was selected within each district. For districts with fewer than five schools, all schools were sampled. School districts are identified by <code>dnum</code>, and schools are identified by <code>snum</code>. The variable <code>fpc1</code> indicates how many districts there are in California (the total number of PSUs or <code>A</code>), and <code>fpc2</code> indicates how many schools were in a given district with at least 100 students (the total number of SSUs or <code>B</code>). The data include a row for each school. In the data printed below, there are 757 school districts, as indicated by <code>fpc1</code>, and nine schools in District 731, one school in District 742, two schools in District 768, and so on, as indicated by <code>fpc2</code>. For illustration purposes, the object <code>apiclus2_slim</code> has been created from <code>apiclus2</code>, which subsets the data to only the necessary columns and sorts the data.</p>
+<div class="sourceCode" id="cb310"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb310-1"><a href="c10-sample-designs-replicate-weights.html#cb310-1" tabindex="-1"></a>apiclus2_slim <span class="ot">&lt;-</span></span>
+<span id="cb310-2"><a href="c10-sample-designs-replicate-weights.html#cb310-2" tabindex="-1"></a>  apiclus2 <span class="sc">%&gt;%</span></span>
+<span id="cb310-3"><a href="c10-sample-designs-replicate-weights.html#cb310-3" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb310-4"><a href="c10-sample-designs-replicate-weights.html#cb310-4" tabindex="-1"></a>  <span class="fu">arrange</span>(<span class="fu">desc</span>(dnum), snum) <span class="sc">%&gt;%</span></span>
+<span id="cb310-5"><a href="c10-sample-designs-replicate-weights.html#cb310-5" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, snum, fpc1, fpc2, pw)</span>
+<span id="cb310-6"><a href="c10-sample-designs-replicate-weights.html#cb310-6" tabindex="-1"></a></span>
+<span id="cb310-7"><a href="c10-sample-designs-replicate-weights.html#cb310-7" tabindex="-1"></a>apiclus2_slim</span></code></pre></div>
 <pre><code>## # A tibble: 126 × 6
 ##    cds             dnum  snum  fpc1      fpc2    pw
 ##    &lt;chr&gt;          &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int[1d]&gt; &lt;dbl&gt;
@@ -950,13 +950,15 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-5" class="
 ##  9 54722076054423   742  5898   757         1  18.9
 ## 10 50712906053086   731  5781   757         9  34.1
 ## # ℹ 116 more rows</code></pre>
-<p>To specify this design in R, the following syntax should be used:</p>
-<div class="sourceCode" id="cb311"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb311-1"><a href="c10-sample-designs-replicate-weights.html#cb311-1" tabindex="-1"></a>apiclus2_des <span class="ot">&lt;-</span> apiclus2_slim <span class="sc">%&gt;%</span></span>
-<span id="cb311-2"><a href="c10-sample-designs-replicate-weights.html#cb311-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> <span class="fu">c</span>(dnum, snum),</span>
-<span id="cb311-3"><a href="c10-sample-designs-replicate-weights.html#cb311-3" tabindex="-1"></a>                   <span class="at">fpc =</span> <span class="fu">c</span>(fpc1, fpc2),</span>
-<span id="cb311-4"><a href="c10-sample-designs-replicate-weights.html#cb311-4" tabindex="-1"></a>                   <span class="at">weights =</span> pw)</span>
-<span id="cb311-5"><a href="c10-sample-designs-replicate-weights.html#cb311-5" tabindex="-1"></a></span>
-<span id="cb311-6"><a href="c10-sample-designs-replicate-weights.html#cb311-6" tabindex="-1"></a>apiclus2_des</span></code></pre></div>
+<p>To specify this design in R, we use the following:</p>
+<div class="sourceCode" id="cb312"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb312-1"><a href="c10-sample-designs-replicate-weights.html#cb312-1" tabindex="-1"></a>apiclus2_des <span class="ot">&lt;-</span> apiclus2_slim <span class="sc">%&gt;%</span></span>
+<span id="cb312-2"><a href="c10-sample-designs-replicate-weights.html#cb312-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb312-3"><a href="c10-sample-designs-replicate-weights.html#cb312-3" tabindex="-1"></a>    <span class="at">ids =</span> <span class="fu">c</span>(dnum, snum),</span>
+<span id="cb312-4"><a href="c10-sample-designs-replicate-weights.html#cb312-4" tabindex="-1"></a>    <span class="at">fpc =</span> <span class="fu">c</span>(fpc1, fpc2),</span>
+<span id="cb312-5"><a href="c10-sample-designs-replicate-weights.html#cb312-5" tabindex="-1"></a>    <span class="at">weights =</span> pw</span>
+<span id="cb312-6"><a href="c10-sample-designs-replicate-weights.html#cb312-6" tabindex="-1"></a>  )</span>
+<span id="cb312-7"><a href="c10-sample-designs-replicate-weights.html#cb312-7" tabindex="-1"></a></span>
+<span id="cb312-8"><a href="c10-sample-designs-replicate-weights.html#cb312-8" tabindex="-1"></a>apiclus2_des</span></code></pre></div>
 <pre><code>## 2 - level Cluster Sampling design
 ## With (40, 126) clusters.
 ## Called via srvyr
@@ -967,7 +969,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-5" class="
 ## Data variables: 
 ##   - cds (chr), dnum (int), snum (dbl), fpc1 (dbl), fpc2 (int[1d]), pw
 ##     (dbl)</code></pre>
-<div class="sourceCode" id="cb313"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb313-1"><a href="c10-sample-designs-replicate-weights.html#cb313-1" tabindex="-1"></a><span class="fu">summary</span>(apiclus2_des)</span></code></pre></div>
+<div class="sourceCode" id="cb314"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb314-1"><a href="c10-sample-designs-replicate-weights.html#cb314-1" tabindex="-1"></a><span class="fu">summary</span>(apiclus2_des)</span></code></pre></div>
 <pre><code>## 2 - level Cluster Sampling design
 ## With (40, 126) clusters.
 ## Called via srvyr
@@ -977,107 +979,109 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-5" class="
 ## Population size (PSUs): 757 
 ## Data variables:
 ## [1] &quot;cds&quot;  &quot;dnum&quot; &quot;snum&quot; &quot;fpc1&quot; &quot;fpc2&quot; &quot;pw&quot;</code></pre>
-<p>The design objects are described as “2 - level Cluster Sampling design” and include the ids (cluster), FPC, and weight variables. The summary notes that the sample includes 40 first-level clusters (PSUs), which are school districts, and 126 second-level clusters (SSUs), which are schools. Additionally, the summary includes a numeric summary of the probabilities of selection and the population size (number of PSUs) as 757.</p>
+<p>The design object is described as “2 - level Cluster Sampling design” and includes the ids (cluster), FPC, and weight variables. The summary notes that the sample includes 40 first-level clusters (PSUs), which are school districts, and 126 second-level clusters (SSUs), which are schools. Additionally, the summary includes a numeric summary of the probabilities of selection and the population size (number of PSUs) as 757.</p>
 </div>
 </div>
 </div>
 <div id="samp-combo" class="section level2 hasAnchor" number="10.3">
 <h2><span class="header-section-number">10.3</span> Combining sampling methods<a href="c10-sample-designs-replicate-weights.html#samp-combo" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>SRS, stratified, and clustered designs are the backbone of sampling designs, and the features are often combined in one design. Additionally, rather than using SRS for selection, other sampling mechanisms are commonly used, such as probability proportional to size (PPS), systematic sampling, or selection with unequal probabilities, which are briefly described here. In PPS sampling, a size measure is constructed for each unit (e.g., the population of the PSU or the number of occupied housing units) and then units with larger size measures are more likely to be sampled. Systematic sampling is commonly used to ensure representation across a population. Units are sorted by a feature and then every <span class="math inline">\(k\)</span> units are selected from a random start point so the sample is spread across the population. In addition to PPS, other unequal probabilities of selection may be used. For example, in a study of establishments (e.g., businesses or public institutions) that conducts a survey every year, an establishment that recently participated (e.g., participated last year) may have a reduced chance of selection in a subsequent round to reduce the burden on the establishment. To learn more about sampling designs, refer to <span class="citation">Valliant, Dever, and Kreuter (<a href="#ref-valliant2013practical">2013</a>)</span>, <span class="citation">Cox et al. (<a href="#ref-cox2011business">2011</a>)</span>, <span class="citation">Cochran (<a href="#ref-cochran1977sampling">1977</a>)</span>, and <span class="citation">Deming (<a href="#ref-deming1991sample">1991</a>)</span>.</p>
-<p>A common method of sampling is to stratify PSUs, select PSUs within the stratum using PPS selection, and then select units within the PSUs either with SRS or PPS. Reading survey documentation is an important first step in survey analysis to understand the design of the survey we are using and variables necessary to specify the design. Good documentation will highlight the variables necessary to specify the design. This is often found in User’s Guides, methodology, analysis guides, or technical documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more details).</p>
+<p>SRS, stratified, and clustered designs are the backbone of sampling designs, and the features are often combined in one design. Additionally, rather than using SRS for selection, other sampling mechanisms are commonly used, such as probability proportional to size (PPS), systematic sampling, or selection with unequal probabilities, which are briefly described here. In PPS sampling, a size measure is constructed for each unit (e.g., the population of the PSU or the number of occupied housing units), and units with larger size measures are more likely to be sampled. Systematic sampling is commonly used to ensure representation across a population. Units are sorted by a feature, and then every <span class="math inline">\(k\)</span>th unit is selected from a random start point so the sample is spread across the population. In addition to PPS, other unequal probabilities of selection may be used. For example, in a study of establishments (e.g., businesses or public institutions) that conducts a survey every year, an establishment that recently participated (e.g., participated last year) may have a reduced chance of selection in a subsequent round to reduce the burden on the establishment. To learn more about sampling designs, refer to <span class="citation">Valliant, Dever, and Kreuter (<a href="#ref-valliant2013practical">2013</a>)</span>, <span class="citation">Cox et al. (<a href="#ref-cox2011business">2011</a>)</span>, <span class="citation">Cochran (<a href="#ref-cochran1977sampling">1977</a>)</span>, and <span class="citation">Deming (<a href="#ref-deming1991sample">1991</a>)</span>.</p>
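<p>As a rough illustration of two of these mechanisms, the base R sketch below draws a systematic sample and computes selection probabilities proportional to size. The frame <code>frame_ex</code>, its <code>size</code> column, the interval <code>k</code>, and the sample size <code>n</code> are all hypothetical, and the PPS step only computes the intended probabilities rather than performing a PPS draw.</p>

```r
set.seed(2024)

# Hypothetical sampling frame of 100 units with a size measure
frame_ex <- data.frame(
  unit_id = 1:100,
  size = sample(50:500, 100, replace = TRUE)
)

# Systematic sampling: sort by a feature, then take every k-th unit
# starting from a random position between 1 and k
k <- 10
frame_sorted <- frame_ex[order(frame_ex$size), ]
start <- sample(k, 1)
sys_rows <- seq(from = start, to = nrow(frame_sorted), by = k)
sys_sample <- frame_sorted[sys_rows, ]

# PPS: selection probability proportional to each unit's size measure,
# scaled so the probabilities sum to the intended sample size n
n <- 10
frame_ex$prob <- n * frame_ex$size / sum(frame_ex$size)

head(sys_sample)
```

<p>Because the frame is sorted by <code>size</code> before selection, the systematic sample spans small and large units alike, while the <code>prob</code> column shows how larger units receive a higher chance of selection under PPS.</p>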
+<p>A common method of sampling is to stratify PSUs, select PSUs within the stratum using PPS selection, and then select units within the PSUs either with SRS or PPS. Reading survey documentation is an important first step in survey analysis to understand the design of the survey we are using and the variables necessary to specify the design. Good documentation highlights these variables. This is often found in the user guide, methodology report, analysis guide, or technical documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more details).</p>
 <div id="example-6" class="section level3 unnumbered hasAnchor">
 <h3>Example<a href="c10-sample-designs-replicate-weights.html#example-6" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>For example, the (2017-2019 National Survey of Family Growth)[ <a href="https://www.cdc.gov/nchs/data/nsfg/NSFG-2017-2019-Sample-Design-Documentation-508.pdf" class="uri">https://www.cdc.gov/nchs/data/nsfg/NSFG-2017-2019-Sample-Design-Documentation-508.pdf</a>] (NSFG) had a stratified multi-stage area probability sample:
-1. In the first stage, PSUs are counties or collections of counties and are stratified by Census region/division, size (population), and MSA status. Within each stratum, PSUs were selected via PPS.
-2. In the second stage, neighborhoods were selected within the sampled PSUs using PPS selection.
-3. In the third stage, housing units were selected within the sampled neighborhoods.
-4. In the fourth stage, a person was randomly chosen within the selected housing units among eligible persons using unequal probabilities based on the person’s age and sex.</p>
-<p>The public use file does not include all these levels of selection and instead has pseudo-strata and pseudo-clusters, which are the variables used in R to specify the design. As specified on page 4 of the documentation, the stratum variable is <code>SEST</code>, the cluster variable is <code>SECU</code>, and the weight variable is <code>WGT2017_2019</code>. Thus, to specify this design in R, use the following syntax:</p>
-<div class="sourceCode" id="cb315"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb315-1"><a href="c10-sample-designs-replicate-weights.html#cb315-1" tabindex="-1"></a>nsfg_des <span class="ot">&lt;-</span> nsfgdata <span class="sc">%&gt;%</span></span>
-<span id="cb315-2"><a href="c10-sample-designs-replicate-weights.html#cb315-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> SECU,</span>
-<span id="cb315-3"><a href="c10-sample-designs-replicate-weights.html#cb315-3" tabindex="-1"></a>                   <span class="at">strata =</span> SEST,</span>
-<span id="cb315-4"><a href="c10-sample-designs-replicate-weights.html#cb315-4" tabindex="-1"></a>                   <span class="at">weights =</span> WGT2017_2019)</span></code></pre></div>
+<p>For example, the <a href="https://www.cdc.gov/nchs/data/nsfg/NSFG-2017-2019-Sample-Design-Documentation-508.pdf">2017-2019 National Survey of Family Growth</a> had a stratified multi-stage area probability sample:</p>
+<ol style="list-style-type: decimal">
+<li>In the first stage, PSUs are counties or collections of counties and are stratified by Census region/division, size (population), and MSA status. Within each stratum, PSUs were selected via PPS.</li>
+<li>In the second stage, neighborhoods were selected within the sampled PSUs using PPS selection.</li>
+<li>In the third stage, housing units were selected within the sampled neighborhoods.</li>
+<li>In the fourth stage, a person was randomly chosen among eligible persons within the selected housing units using unequal probabilities based on the person’s age and sex.</li>
+</ol>
+<p>The public use file does not include all these levels of selection and instead has pseudo-strata and pseudo-clusters, which are the variables used in R to specify the design. As specified on page 4 of the documentation, the stratum variable is <code>SEST</code>, the cluster variable is <code>SECU</code>, and the weight variable is <code>WGT2017_2019</code>. Thus, to specify this design in R, we use the following syntax:</p>
+<div class="sourceCode" id="cb316"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb316-1"><a href="c10-sample-designs-replicate-weights.html#cb316-1" tabindex="-1"></a>nsfg_des <span class="ot">&lt;-</span> nsfgdata <span class="sc">%&gt;%</span></span>
+<span id="cb316-2"><a href="c10-sample-designs-replicate-weights.html#cb316-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> SECU,</span>
+<span id="cb316-3"><a href="c10-sample-designs-replicate-weights.html#cb316-3" tabindex="-1"></a>                   <span class="at">strata =</span> SEST,</span>
+<span id="cb316-4"><a href="c10-sample-designs-replicate-weights.html#cb316-4" tabindex="-1"></a>                   <span class="at">weights =</span> WGT2017_2019)</span></code></pre></div>
 </div>
 </div>
 <div id="replicate-weights" class="section level2 hasAnchor" number="10.4">
 <h2><span class="header-section-number">10.4</span> Replicate weights<a href="c10-sample-designs-replicate-weights.html#replicate-weights" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Replicate weights are often included on analysis files instead of, or in addition to, the design variables (strata and PSUs). Replicate weights are used as another method to estimate variability. Often researchers choose to use replicate weights to avoid publishing design variables (strata or clustering variables) as a measure to reduce the risk of disclosure. There are several types of replicate weights, including balanced repeated replication (BRR), Fay’s BRR, jackknife, and bootstrap methods. An overview of the process for using replicate weights is as follows:</p>
+<p>Replicate weights are often included on analysis files instead of, or in addition to, the design variables (strata and PSUs). Replicate weights are used as another method to estimate variability. Often, researchers choose to use replicate weights to avoid publishing design variables (strata or clustering variables) as a measure to reduce the risk of disclosure. There are several types of replicate weights, including balanced repeated replication (BRR), Fay’s BRR, jackknife, and bootstrap methods. An overview of the process for using replicate weights is as follows:</p>
 <ol style="list-style-type: decimal">
 <li>Divide the sample into subsample replicates that mirror the design of the sample</li>
 <li>Calculate weights for each replicate using the same procedures for the full-sample weight (i.e., nonresponse and post-stratification)</li>
 <li>Calculate estimates for each replicate using the same method as the full-sample estimate</li>
-<li>Calculate the estimated variance, which will be proportional to the variance of the replicate estimates</li>
+<li>Calculate the estimated variance, which is proportional to the variance of the replicate estimates</li>
 </ol>
-<p>The different types of replicate weights largely differ between step 1 (how the sample is divided into subsamples) and step 4 (which multiplication factors (scales) are used to multiply the variance). The general format for the standard error is:</p>
+<p>The different types of replicate weights largely differ in step 1 (how the sample is divided into subsamples) and step 4 (which multiplication factors, or scales, are used to multiply the variance). The general format for the standard error is:</p>
 <p><span class="math display">\[ \sqrt{\alpha \sum_{r=1}^R \alpha_r (\hat{\theta}_r - \hat{\theta})^2 }\]</span></p>
 <p>where <span class="math inline">\(R\)</span> is the number of replicates, <span class="math inline">\(\alpha\)</span> is a constant that depends on the replication method, <span class="math inline">\(\alpha_r\)</span> is a factor associated with each replicate, <span class="math inline">\(\hat{\theta}\)</span> is the weighted estimate based on the full sample, and <span class="math inline">\(\hat{\theta}_r\)</span> is the weighted estimate of <span class="math inline">\(\theta\)</span> based on the <span class="math inline">\(r^{\text{th}}\)</span> replicate.</p>
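<p>The general formula can be written as a small base R helper. This is a sketch only: the function name <code>rep_se()</code> and the numbers are hypothetical, and the choice of <span class="math inline">\(\alpha = 1/R\)</span> with every <span class="math inline">\(\alpha_r = 1\)</span> is used here purely as an illustrative setting.</p>

```r
# Generic replicate standard error:
#   sqrt(alpha * sum(alpha_r * (theta_r - theta_hat)^2))
rep_se <- function(theta_r, theta_hat, alpha,
                   alpha_r = rep(1, length(theta_r))) {
  stopifnot(length(alpha_r) == length(theta_r))
  sqrt(alpha * sum(alpha_r * (theta_r - theta_hat)^2))
}

# Hypothetical full-sample estimate and four replicate estimates
theta_hat <- 50
theta_r <- c(48, 51, 53, 49)

# Illustrative constants: alpha = 1 / R and alpha_r = 1 for every replicate
se_ex <- rep_se(theta_r, theta_hat, alpha = 1 / length(theta_r))
se_ex
```

<p>Changing <code>alpha</code> and <code>alpha_r</code> is what distinguishes one replication method from another; the replicate estimates themselves enter the calculation in the same way.</p>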
-<p>To create the design object for surveys with replicate weights, we use <code>as_survey_rep()</code> instead of <code>as_survey_design()</code> that we use for the common sampling designs in the sections above.</p>
+<p>To create the design object for surveys with replicate weights, we use <code>as_survey_rep()</code> instead of <code>as_survey_design()</code>, which we use for the common sampling designs in the sections above.</p>
 <div id="balanced-repeated-replication-brr-method" class="section level3 hasAnchor" number="10.4.1">
 <h3><span class="header-section-number">10.4.1</span> Balanced Repeated Replication (BRR) method<a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>The BRR method requires a stratified sample design with two PSUs in each stratum. Each replicate is constructed by deleting one PSU per stratum using a Hadamard matrix. For the PSU that is included, the weight is generally multiplied by two but may have other adjustments, such as post-stratification. A Hadamard matrix is a special square matrix with entries of +1 or -1 with mutually orthogonal rows. Hadamard matrices must have one row, two rows, or a multiple of four rows. The size of the Hadamard matrix is determined by the first multiple of 4 greater than or equal to the number of strata. For example, if a survey had 7 strata, the Hadamard matrix would be an <span class="math inline">\(8\times8\)</span> matrix. Additionally, a survey with 8 strata would also have an <span class="math inline">\(8\times8\)</span> Hadamard matrix. The columns in the matrix specify the strata and the rows specify the replicate. In each replicate (row), a +1 means to use the first PSU and a -1 means to use the second PSU in the estimate. For example, here is a <span class="math inline">\(4\times4\)</span> Hadamard matrix:</p>
+<p>The BRR method requires a stratified sample design with two PSUs in each stratum. Each replicate is constructed by deleting one PSU per stratum using a Hadamard matrix. For the PSU that is included, the weight is generally multiplied by two but may have other adjustments, such as post-stratification. A Hadamard matrix is a special square matrix with entries of +1 or -1 with mutually orthogonal rows. Hadamard matrices must have one row, two rows, or a multiple of four rows. The size of the Hadamard matrix is determined by the first multiple of 4 greater than or equal to the number of strata. For example, if a survey had seven strata, the Hadamard matrix would be an <span class="math inline">\(8\times8\)</span> matrix. Additionally, a survey with eight strata would also have an <span class="math inline">\(8\times8\)</span> Hadamard matrix. The columns in the matrix specify the strata and the rows specify the replicate. In each replicate (row), a +1 means to use the first PSU and a -1 means to use the second PSU in the estimate. For example, here is a <span class="math inline">\(4\times4\)</span> Hadamard matrix:</p>
 <p><span class="math display">\[ \begin{array}{rrrr} +1 &amp;+1 &amp;+1 &amp;+1\\ +1&amp;-1&amp;+1&amp;-1\\ +1&amp;+1&amp;-1&amp;-1\\ +1 &amp;-1&amp;-1&amp;+1 \end{array} \]</span>
-In the first replicate (row), all the values are +1, so in each stratum, the first PSU would be used in the estimate. In the second replicate, the first PSU would be used in stratum 1 and 3, while the second PSU would be used in stratum 2 and 4. In the third replicate, the first PSU would be used in stratum 1 and 2, while the second PSU would be used in strata 3 and 4. Finally, in the fourth replicate, the first PSU would be used in strata 1 and 4, while the second PSU would be used in strata 2 and 3. For more information about Hadamard matrices see <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span>. Note that supplied BRR weights from a data provider will already incorporate this adjustment, and the {survey} package generates the Hadamard matrix, if necessary for calculating BRR weights so an analyst will not need to provide the matrix.</p>
+In the first replicate (row), all the values are +1, so in each stratum, the first PSU would be used in the estimate. In the second replicate, the first PSU would be used in strata 1 and 3, while the second PSU would be used in strata 2 and 4. In the third replicate, the first PSU would be used in strata 1 and 2, while the second PSU would be used in strata 3 and 4. Finally, in the fourth replicate, the first PSU would be used in strata 1 and 4, while the second PSU would be used in strata 2 and 3. For more information about Hadamard matrices, see <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span>. Note that supplied BRR weights from a data provider already incorporate this adjustment, and the {survey} package generates the Hadamard matrix, if necessary, for calculating BRR weights, so an analyst does not need to create or provide the matrix.</p>
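<p>The properties described above can be checked directly in base R. The sketch below enters the <span class="math inline">\(4\times4\)</span> matrix by hand, confirms that its rows are mutually orthogonal, and translates each entry into the PSU that a replicate keeps. The helper <code>hadamard_size()</code> is a hypothetical function that simply encodes the sizing rule stated in the text.</p>

```r
# The 4 x 4 Hadamard matrix from the text: rows = replicates, columns = strata
H <- matrix(
  c(1,  1,  1,  1,
    1, -1,  1, -1,
    1,  1, -1, -1,
    1, -1, -1,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("rep", 1:4), paste0("stratum", 1:4))
)

# Rows are mutually orthogonal, so H %*% t(H) is 4 times the identity matrix
gram <- H %*% t(H)

# Translate +1 / -1 into the PSU each replicate keeps in each stratum
psu_used <- ifelse(H == 1, "PSU 1", "PSU 2")
psu_used

# Sizing rule from the text: first multiple of 4 >= the number of strata
hadamard_size <- function(n_strata) 4 * ceiling(n_strata / 4)
hadamard_size(c(7, 8))
```

<p>Reading across the <code>rep2</code> row of <code>psu_used</code> reproduces the description of the second replicate: the first PSU in strata 1 and 3, and the second PSU in strata 2 and 4.</p>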
 <div id="the-math-4" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-4" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>A weighted estimate for the full sample is calculated as <span class="math inline">\(\hat{\theta}\)</span>, and then a weighted estimate for each replicate is calculated as <span class="math inline">\(\hat{\theta}_r\)</span> for <span class="math inline">\(R\)</span> replicates. Using the generic notation above, <span class="math inline">\(\alpha=\frac{1}{R}\)</span> and <span class="math inline">\(\alpha_r=1\)</span> for each <span class="math inline">\(r\)</span>. The standard error of the estimate is calculated as follows:</p>
 <p><span class="math display">\[se(\hat{\theta})=\sqrt{\frac{1}{R} \sum_{r=1}^R \left( \hat{\theta}_r-\hat{\theta}\right)^2}\]</span></p>
-<p>Specifying replicate weights in R requires specifying the type of replicate weights, the main weight variable, the replicate weight variables, and other options. One of the key options is for the mean squared error (MSE). If <code>mse=TRUE</code>, variances are computed around the point estimate <span class="math inline">\((\hat{\theta})\)</span>, whereas if <code>mse=FALSE</code>, variances are computed around the mean of the replicates <span class="math inline">\((\bar{\theta})\)</span> instead which looks like this:</p>
+<p>Specifying replicate weights in R requires specifying the type of replicate weights, the main weight variable, the replicate weight variables, and other options. One of the key options is for the mean squared error (MSE). If <code>mse=TRUE</code>, variances are computed around the point estimate <span class="math inline">\((\hat{\theta})\)</span>, whereas if <code>mse=FALSE</code>, variances are computed around the mean of the replicates <span class="math inline">\((\bar{\theta})\)</span> instead, which looks like this:</p>
 <p><span class="math display">\[se(\hat{\theta})=\sqrt{\frac{1}{R} \sum_{r=1}^R \left( \hat{\theta}_r-\bar{\theta}\right)^2}\]</span> where <span class="math display">\[\bar{\theta}=\frac{1}{R}\sum_{r=1}^R \hat{\theta}_r\]</span></p>
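<p>To see how the two centering choices differ numerically, the base R sketch below computes both BRR standard errors by hand. The replicate estimates are made-up numbers for illustration, not output from any survey.</p>

```r
# Hypothetical full-sample estimate and replicate estimates
theta_hat <- 10.0
theta_r <- c(9.6, 10.3, 10.1, 9.8)
R <- length(theta_r)

# mse = TRUE: squared deviations around the full-sample estimate
se_mse_true <- sqrt(sum((theta_r - theta_hat)^2) / R)

# mse = FALSE: squared deviations around the mean of the replicates
theta_bar <- mean(theta_r)
se_mse_false <- sqrt(sum((theta_r - theta_bar)^2) / R)

c(mse_true = se_mse_true, mse_false = se_mse_false)
```

<p>Because the sum of squared deviations is smallest when taken around the mean, the <code>mse = FALSE</code> standard error is never larger than the <code>mse = TRUE</code> version, and the two coincide only when <span class="math inline">\(\bar{\theta} = \hat{\theta}\)</span>.</p>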
-<p>The default option for <code>mse</code> is to use the global option of “survey.replicates.mse” which is set to <code>FALSE</code> initially unless a user changes it. To determine if <code>mse</code> should be set to <code>TRUE</code> or <code>FALSE</code>, read the survey documentation. If there is no indication in the survey documentation, for BRR, we recommend setting <code>mse</code> to <code>TRUE</code> as this is the default in other software (e.g., SAS, SUDAAN).</p>
+<p>The default option for <code>mse</code> is to use the global option of “survey.replicates.mse”, which is set to <code>FALSE</code> initially unless a user changes it. To determine if <code>mse</code> should be set to <code>TRUE</code> or <code>FALSE</code>, read the survey documentation. If there is no indication in the survey documentation, for BRR, we recommend setting <code>mse</code> to <code>TRUE</code> as this is the default in other software (e.g., SAS, SUDAAN).</p>
 </div>
 <div id="the-syntax-4" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-4" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Replicate weights generally come in groups and are sequentially numbered, such as PWGTP1, PWGTP2, …, PWGTP80 for the person weights in the American Community Survey (ACS) <span class="citation">(<a href="#ref-acs-pums-2021">U.S. Census Bureau 2021</a>)</span> or BRRWT1, BRRWT2, …, BRRWT96 in the 2015 Residential Energy Consumption Survey (RECS) <span class="citation">(<a href="#ref-recs-2015-micro">U.S. Energy Information Administration 2017</a>)</span>. This makes it easy to use some of the (tidy selection)[<a href="https://dplyr.tidyverse.org/reference/dplyr_tidy_select.html" class="uri">https://dplyr.tidyverse.org/reference/dplyr_tidy_select.html</a>] functions in R.</p>
-<p>To specify a BRR design, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights is BRR (<code>type = BRR</code>), and whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>). For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated WT1, WT2, …, WT20, we can use the following syntax (both are equivalent):</p>
-<div class="sourceCode" id="cb316"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb316-1"><a href="c10-sample-designs-replicate-weights.html#cb316-1" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb316-2"><a href="c10-sample-designs-replicate-weights.html#cb316-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
-<span id="cb316-3"><a href="c10-sample-designs-replicate-weights.html#cb316-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">all_of</span>(<span class="fu">str_c</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)), </span>
-<span id="cb316-4"><a href="c10-sample-designs-replicate-weights.html#cb316-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
-<span id="cb316-5"><a href="c10-sample-designs-replicate-weights.html#cb316-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb316-6"><a href="c10-sample-designs-replicate-weights.html#cb316-6" tabindex="-1"></a></span>
-<span id="cb316-7"><a href="c10-sample-designs-replicate-weights.html#cb316-7" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb316-8"><a href="c10-sample-designs-replicate-weights.html#cb316-8" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
-<span id="cb316-9"><a href="c10-sample-designs-replicate-weights.html#cb316-9" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
-<span id="cb316-10"><a href="c10-sample-designs-replicate-weights.html#cb316-10" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
-<span id="cb316-11"><a href="c10-sample-designs-replicate-weights.html#cb316-11" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
-<p>If a dataset had WT for the main weight and had 20 BRR weights indicated REPWT1, REPWT2, …, REPWT20, the following syntax could be used (both are equivalent):</p>
+<p>Replicate weights generally come in groups and are sequentially numbered, such as PWGTP1, PWGTP2, …, PWGTP80 for the person weights in the American Community Survey (ACS) <span class="citation">(<a href="#ref-acs-pums-2021">U.S. Census Bureau 2021</a>)</span> or BRRWT1, BRRWT2, …, BRRWT96 in the 2015 Residential Energy Consumption Survey (RECS) <span class="citation">(<a href="#ref-recs-2015-micro">U.S. Energy Information Administration 2017</a>)</span>. This makes it easy to use some of the <a href="https://dplyr.tidyverse.org/reference/dplyr_tidy_select.html">tidy selection</a> functions in R.</p>
+<p>To specify a BRR design, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as BRR (<code>type = &quot;BRR&quot;</code>), and whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>.) For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, we can use the following syntax (both are equivalent):</p>
 <div class="sourceCode" id="cb317"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb317-1"><a href="c10-sample-designs-replicate-weights.html#cb317-1" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb317-2"><a href="c10-sample-designs-replicate-weights.html#cb317-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
-<span id="cb317-3"><a href="c10-sample-designs-replicate-weights.html#cb317-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">all_of</span>(<span class="fu">str_c</span>(<span class="st">&quot;REPWT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)),</span>
+<span id="cb317-2"><a href="c10-sample-designs-replicate-weights.html#cb317-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
+<span id="cb317-3"><a href="c10-sample-designs-replicate-weights.html#cb317-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">all_of</span>(<span class="fu">str_c</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)), </span>
 <span id="cb317-4"><a href="c10-sample-designs-replicate-weights.html#cb317-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
 <span id="cb317-5"><a href="c10-sample-designs-replicate-weights.html#cb317-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span>
 <span id="cb317-6"><a href="c10-sample-designs-replicate-weights.html#cb317-6" tabindex="-1"></a></span>
 <span id="cb317-7"><a href="c10-sample-designs-replicate-weights.html#cb317-7" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb317-8"><a href="c10-sample-designs-replicate-weights.html#cb317-8" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
-<span id="cb317-9"><a href="c10-sample-designs-replicate-weights.html#cb317-9" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;REPWT&quot;</span>),</span>
+<span id="cb317-8"><a href="c10-sample-designs-replicate-weights.html#cb317-8" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
+<span id="cb317-9"><a href="c10-sample-designs-replicate-weights.html#cb317-9" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
 <span id="cb317-10"><a href="c10-sample-designs-replicate-weights.html#cb317-10" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
 <span id="cb317-11"><a href="c10-sample-designs-replicate-weights.html#cb317-11" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
-<p>If the replicate weight variables are in the file consecutively, the following syntax can also be used:</p>
+<p>If a dataset had WT for the main weight and had 20 BRR weights indicated as REPWT1, REPWT2, …, REPWT20, we can use the following syntax (both are equivalent):</p>
 <div class="sourceCode" id="cb318"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb318-1"><a href="c10-sample-designs-replicate-weights.html#cb318-1" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
 <span id="cb318-2"><a href="c10-sample-designs-replicate-weights.html#cb318-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
-<span id="cb318-3"><a href="c10-sample-designs-replicate-weights.html#cb318-3" tabindex="-1"></a>                <span class="at">repweights =</span> REPWT1<span class="sc">:</span>REPWT20,</span>
+<span id="cb318-3"><a href="c10-sample-designs-replicate-weights.html#cb318-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">all_of</span>(<span class="fu">str_c</span>(<span class="st">&quot;REPWT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>)),</span>
 <span id="cb318-4"><a href="c10-sample-designs-replicate-weights.html#cb318-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
-<span id="cb318-5"><a href="c10-sample-designs-replicate-weights.html#cb318-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
-<p>Typically, each replicate weight sums to a value similar to the main weight, as both the replicate weights and the main weight are supposed to provide population estimates. Rarely, an alternative method will be used where the replicate weights have values of 0 or 2 in the case of BRR weights. This would be indicated in the documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more information on how to understand the provided documentation). In this case, the replicate weights are not combined, and the option <code>combined_weights = FALSE</code> should be indicated, as the default value for this argument is TRUE. This specific syntax is shown below:</p>
+<span id="cb318-5"><a href="c10-sample-designs-replicate-weights.html#cb318-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb318-6"><a href="c10-sample-designs-replicate-weights.html#cb318-6" tabindex="-1"></a></span>
+<span id="cb318-7"><a href="c10-sample-designs-replicate-weights.html#cb318-7" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb318-8"><a href="c10-sample-designs-replicate-weights.html#cb318-8" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
+<span id="cb318-9"><a href="c10-sample-designs-replicate-weights.html#cb318-9" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;REPWT&quot;</span>),</span>
+<span id="cb318-10"><a href="c10-sample-designs-replicate-weights.html#cb318-10" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
+<span id="cb318-11"><a href="c10-sample-designs-replicate-weights.html#cb318-11" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
+<p>If the replicate weight variables are in the file consecutively, we can also use the following syntax:</p>
 <div class="sourceCode" id="cb319"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb319-1"><a href="c10-sample-designs-replicate-weights.html#cb319-1" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
 <span id="cb319-2"><a href="c10-sample-designs-replicate-weights.html#cb319-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
-<span id="cb319-3"><a href="c10-sample-designs-replicate-weights.html#cb319-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;REPWT&quot;</span>),</span>
+<span id="cb319-3"><a href="c10-sample-designs-replicate-weights.html#cb319-3" tabindex="-1"></a>                <span class="at">repweights =</span> REPWT1<span class="sc">:</span>REPWT20,</span>
 <span id="cb319-4"><a href="c10-sample-designs-replicate-weights.html#cb319-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
-<span id="cb319-5"><a href="c10-sample-designs-replicate-weights.html#cb319-5" tabindex="-1"></a>                <span class="at">combined_weights =</span> <span class="cn">FALSE</span>,</span>
-<span id="cb319-6"><a href="c10-sample-designs-replicate-weights.html#cb319-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
+<span id="cb319-5"><a href="c10-sample-designs-replicate-weights.html#cb319-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
+<p>Typically, each replicate weight sums to a value similar to the main weight, as both the replicate weights and the main weight are supposed to provide population estimates. Rarely, an alternative method is used where the replicate weights have values of 0 or 2 in the case of BRR weights. This would be indicated in the documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more information on reading documentation.) In this case, the replicate weights are not combined, and the option <code>combined_weights = FALSE</code> should be indicated, as the default value for this argument is <code>TRUE</code>. This specific syntax is shown below:</p>
+<div class="sourceCode" id="cb320"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb320-1"><a href="c10-sample-designs-replicate-weights.html#cb320-1" tabindex="-1"></a>brr_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb320-2"><a href="c10-sample-designs-replicate-weights.html#cb320-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT,</span>
+<span id="cb320-3"><a href="c10-sample-designs-replicate-weights.html#cb320-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;REPWT&quot;</span>),</span>
+<span id="cb320-4"><a href="c10-sample-designs-replicate-weights.html#cb320-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
+<span id="cb320-5"><a href="c10-sample-designs-replicate-weights.html#cb320-5" tabindex="-1"></a>                <span class="at">combined_weights =</span> <span class="cn">FALSE</span>,</span>
+<span id="cb320-6"><a href="c10-sample-designs-replicate-weights.html#cb320-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
 </div>
 <div id="example-7" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-7" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The {survey} package includes a data example from Section 12.2 of <span class="citation">Levy and Lemeshow (<a href="#ref-levy2013sampling">2013</a>)</span>. In this fictional data, two out of five ambulance stations were sampled from each of three emergency service areas (ESAs), thus BRR weights are appropriate with 2 PSUs (stations) sampled in each stratum (ESA). In the code below, BRR weights are created as was done by <span class="citation">Levy and Lemeshow (<a href="#ref-levy2013sampling">2013</a>)</span>.</p>
-<div class="sourceCode" id="cb320"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb320-1"><a href="c10-sample-designs-replicate-weights.html#cb320-1" tabindex="-1"></a>scdbrr <span class="ot">&lt;-</span> scd <span class="sc">%&gt;%</span></span>
-<span id="cb320-2"><a href="c10-sample-designs-replicate-weights.html#cb320-2" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb320-3"><a href="c10-sample-designs-replicate-weights.html#cb320-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">wt =</span> <span class="dv">5</span> <span class="sc">/</span> <span class="dv">2</span>,</span>
-<span id="cb320-4"><a href="c10-sample-designs-replicate-weights.html#cb320-4" tabindex="-1"></a>         <span class="at">rep1 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>),</span>
-<span id="cb320-5"><a href="c10-sample-designs-replicate-weights.html#cb320-5" tabindex="-1"></a>         <span class="at">rep2 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>),</span>
-<span id="cb320-6"><a href="c10-sample-designs-replicate-weights.html#cb320-6" tabindex="-1"></a>         <span class="at">rep3 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">1</span>),</span>
-<span id="cb320-7"><a href="c10-sample-designs-replicate-weights.html#cb320-7" tabindex="-1"></a>         <span class="at">rep4 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">0</span>))</span>
-<span id="cb320-8"><a href="c10-sample-designs-replicate-weights.html#cb320-8" tabindex="-1"></a></span>
-<span id="cb320-9"><a href="c10-sample-designs-replicate-weights.html#cb320-9" tabindex="-1"></a>scdbrr</span></code></pre></div>
+<p>The {survey} package includes a data example from Section 12.2 of <span class="citation">Levy and Lemeshow (<a href="#ref-levy2013sampling">2013</a>)</span>. In this fictional data, two out of five ambulance stations were sampled from each of three emergency service areas (ESAs), thus BRR weights are appropriate with 2 PSUs (stations) sampled in each stratum (ESA.) In the code below, we create BRR weights as was done by <span class="citation">Levy and Lemeshow (<a href="#ref-levy2013sampling">2013</a>)</span>.</p>
+<div class="sourceCode" id="cb321"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb321-1"><a href="c10-sample-designs-replicate-weights.html#cb321-1" tabindex="-1"></a>scdbrr <span class="ot">&lt;-</span> scd <span class="sc">%&gt;%</span></span>
+<span id="cb321-2"><a href="c10-sample-designs-replicate-weights.html#cb321-2" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb321-3"><a href="c10-sample-designs-replicate-weights.html#cb321-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">wt =</span> <span class="dv">5</span> <span class="sc">/</span> <span class="dv">2</span>,</span>
+<span id="cb321-4"><a href="c10-sample-designs-replicate-weights.html#cb321-4" tabindex="-1"></a>         <span class="at">rep1 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>),</span>
+<span id="cb321-5"><a href="c10-sample-designs-replicate-weights.html#cb321-5" tabindex="-1"></a>         <span class="at">rep2 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>),</span>
+<span id="cb321-6"><a href="c10-sample-designs-replicate-weights.html#cb321-6" tabindex="-1"></a>         <span class="at">rep3 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">0</span>, <span class="dv">1</span>),</span>
+<span id="cb321-7"><a href="c10-sample-designs-replicate-weights.html#cb321-7" tabindex="-1"></a>         <span class="at">rep4 =</span> <span class="dv">2</span> <span class="sc">*</span> <span class="fu">c</span>(<span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">0</span>, <span class="dv">1</span>, <span class="dv">1</span>, <span class="dv">0</span>))</span>
+<span id="cb321-8"><a href="c10-sample-designs-replicate-weights.html#cb321-8" tabindex="-1"></a></span>
+<span id="cb321-9"><a href="c10-sample-designs-replicate-weights.html#cb321-9" tabindex="-1"></a>scdbrr</span></code></pre></div>
 <pre><code>## # A tibble: 6 × 9
 ##     ESA ambulance arrests alive    wt  rep1  rep2  rep3  rep4
 ##   &lt;int&gt;     &lt;int&gt;   &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
@@ -1087,14 +1091,14 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-7" class="
 ## 4     2         2     228    49   2.5     0     2     0     2
 ## 5     3         1     670    80   2.5     2     0     0     2
 ## 6     3         2     530    70   2.5     0     2     2     0</code></pre>
-<p>To specify the BRR weights, the following syntax is used:</p>
-<div class="sourceCode" id="cb322"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb322-1"><a href="c10-sample-designs-replicate-weights.html#cb322-1" tabindex="-1"></a>scdbrr_des <span class="ot">&lt;-</span> scdbrr <span class="sc">%&gt;%</span></span>
-<span id="cb322-2"><a href="c10-sample-designs-replicate-weights.html#cb322-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
-<span id="cb322-3"><a href="c10-sample-designs-replicate-weights.html#cb322-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;rep&quot;</span>),</span>
-<span id="cb322-4"><a href="c10-sample-designs-replicate-weights.html#cb322-4" tabindex="-1"></a>                <span class="at">combined_weights =</span> <span class="cn">FALSE</span>,  </span>
-<span id="cb322-5"><a href="c10-sample-designs-replicate-weights.html#cb322-5" tabindex="-1"></a>                <span class="at">weight =</span> wt)</span>
-<span id="cb322-6"><a href="c10-sample-designs-replicate-weights.html#cb322-6" tabindex="-1"></a></span>
-<span id="cb322-7"><a href="c10-sample-designs-replicate-weights.html#cb322-7" tabindex="-1"></a>scdbrr_des</span></code></pre></div>
+<p>To specify the BRR weights, we use the following syntax:</p>
+<div class="sourceCode" id="cb323"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb323-1"><a href="c10-sample-designs-replicate-weights.html#cb323-1" tabindex="-1"></a>scdbrr_des <span class="ot">&lt;-</span> scdbrr <span class="sc">%&gt;%</span></span>
+<span id="cb323-2"><a href="c10-sample-designs-replicate-weights.html#cb323-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">type =</span> <span class="st">&quot;BRR&quot;</span>,</span>
+<span id="cb323-3"><a href="c10-sample-designs-replicate-weights.html#cb323-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">starts_with</span>(<span class="st">&quot;rep&quot;</span>),</span>
+<span id="cb323-4"><a href="c10-sample-designs-replicate-weights.html#cb323-4" tabindex="-1"></a>                <span class="at">combined_weights =</span> <span class="cn">FALSE</span>,  </span>
+<span id="cb323-5"><a href="c10-sample-designs-replicate-weights.html#cb323-5" tabindex="-1"></a>                <span class="at">weight =</span> wt)</span>
+<span id="cb323-6"><a href="c10-sample-designs-replicate-weights.html#cb323-6" tabindex="-1"></a></span>
+<span id="cb323-7"><a href="c10-sample-designs-replicate-weights.html#cb323-7" tabindex="-1"></a>scdbrr_des</span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Balanced Repeated Replicates with 4 replicates.
 ## Sampling variables:
@@ -1103,7 +1107,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-7" class="
 ## Data variables: 
 ##   - ESA (int), ambulance (int), arrests (dbl), alive (dbl), wt (dbl),
 ##     rep1 (dbl), rep2 (dbl), rep3 (dbl), rep4 (dbl)</code></pre>
-<div class="sourceCode" id="cb324"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb324-1"><a href="c10-sample-designs-replicate-weights.html#cb324-1" tabindex="-1"></a><span class="fu">summary</span>(scdbrr_des)</span></code></pre></div>
+<div class="sourceCode" id="cb325"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb325-1"><a href="c10-sample-designs-replicate-weights.html#cb325-1" tabindex="-1"></a><span class="fu">summary</span>(scdbrr_des)</span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Balanced Repeated Replicates with 4 replicates.
 ## Sampling variables:
@@ -1115,12 +1119,12 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-7" class="
 ## Variables: 
 ## [1] &quot;ESA&quot;       &quot;ambulance&quot; &quot;arrests&quot;   &quot;alive&quot;     &quot;wt&quot;       
 ## [6] &quot;rep1&quot;      &quot;rep2&quot;      &quot;rep3&quot;      &quot;rep4&quot;</code></pre>
-<p>Note that <code>combined_weights</code> was specified as <code>FALSE</code> because these weights are simply specified as 0 and 2 and do not incorporate the overall weight. When printing the object, the type of replication is noted as Balanced Repeated Replicates, and the replicate weights and the weight variable are specified. Additionally, the summary lists the variables included.</p>
+<p>Note that <code>combined_weights</code> was specified as <code>FALSE</code> because these weights are simply specified as 0 and 2 and do not incorporate the overall weight. When printing the object, the type of replication is noted as Balanced Repeated Replicates, and the replicate weights and the weight variable are specified. Additionally, the summary lists the variables included in the data and design object.</p>
 </div>
 </div>
 <div id="fays-brr-method" class="section level3 hasAnchor" number="10.4.2">
 <h3><span class="header-section-number">10.4.2</span> Fay’s BRR method<a href="c10-sample-designs-replicate-weights.html#fays-brr-method" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Fay’s BRR method for replicate weights is similar to the BRR method in that it uses a Hadamard matrix to construct replicate weights. However, rather than deleting PSUs for each replicate, with Fay’s BRR half of the PSUs have a replicate weight which is the main weight multiplied by <span class="math inline">\(\rho\)</span>, and the other half have the main weight multiplied by <span class="math inline">\((2-\rho)\)</span> where <span class="math inline">\(0 \le \rho &lt; 1\)</span>. Note that when <span class="math inline">\(\rho=0\)</span>, this is equivalent to the standard BRR weights, and as <span class="math inline">\(\rho\)</span> becomes closer to 1, this method is more similar to jackknife discussed in the next section. To obtain the value of <span class="math inline">\(\rho\)</span>, it is necessary to read the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>).</p>
+<p>Fay’s BRR method for replicate weights is similar to the BRR method in that it uses a Hadamard matrix to construct replicate weights. However, rather than deleting PSUs for each replicate, with Fay’s BRR, half of the PSUs have a replicate weight, which is the main weight multiplied by <span class="math inline">\(\rho\)</span>, and the other half have the main weight multiplied by <span class="math inline">\((2-\rho)\)</span>, where <span class="math inline">\(0 \le \rho &lt; 1\)</span>. Note that when <span class="math inline">\(\rho=0\)</span>, this is equivalent to the standard BRR weights, and as <span class="math inline">\(\rho\)</span> becomes closer to 1, this method becomes more similar to the jackknife method discussed in Section <a href="c10-sample-designs-replicate-weights.html#samp-jackknife">10.4.3</a>. To obtain the value of <span class="math inline">\(\rho\)</span>, it is necessary to read the survey documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>.)</p>
 <div id="the-math-5" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-5" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>The standard error estimate for <span class="math inline">\(\hat{\theta}\)</span> is slightly different than the BRR, due to the addition of the multiplier of <span class="math inline">\(\rho\)</span>. Using the generic notation above, <span class="math inline">\(\alpha=\frac{1}{R \left(1-\rho\right)^2}\)</span> and <span class="math inline">\(\alpha_r=1 \text{ for all } r\)</span>. The standard error is calculated as:</p>
@@ -1128,26 +1132,26 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-5" class
 </div>
 <div id="the-syntax-5" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-5" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The syntax is very similar for BRR and Fay’s BRR. To specify a Fay’s BRR design, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights is Fay’s BRR (<code>type = Fay</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and Fay’s multiplier (<code>rho</code>). For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, and Fay’s multiplier is 0.3, use the following syntax:</p>
-<div class="sourceCode" id="cb326"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb326-1"><a href="c10-sample-designs-replicate-weights.html#cb326-1" tabindex="-1"></a>fay_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb326-2"><a href="c10-sample-designs-replicate-weights.html#cb326-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
-<span id="cb326-3"><a href="c10-sample-designs-replicate-weights.html#cb326-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
-<span id="cb326-4"><a href="c10-sample-designs-replicate-weights.html#cb326-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;Fay&quot;</span>,</span>
-<span id="cb326-5"><a href="c10-sample-designs-replicate-weights.html#cb326-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb326-6"><a href="c10-sample-designs-replicate-weights.html#cb326-6" tabindex="-1"></a>                <span class="at">rho =</span> <span class="fl">0.3</span>)</span></code></pre></div>
+<p>The syntax is very similar for BRR and Fay’s BRR. To specify a Fay’s BRR design, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as Fay’s BRR (<code>type = &quot;Fay&quot;</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and Fay’s multiplier (<code>rho</code>.) For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, and Fay’s multiplier is 0.3, we use the following syntax:</p>
+<div class="sourceCode" id="cb327"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb327-1"><a href="c10-sample-designs-replicate-weights.html#cb327-1" tabindex="-1"></a>fay_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb327-2"><a href="c10-sample-designs-replicate-weights.html#cb327-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0,</span>
+<span id="cb327-3"><a href="c10-sample-designs-replicate-weights.html#cb327-3" tabindex="-1"></a>                <span class="at">repweights =</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
+<span id="cb327-4"><a href="c10-sample-designs-replicate-weights.html#cb327-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;Fay&quot;</span>,</span>
+<span id="cb327-5"><a href="c10-sample-designs-replicate-weights.html#cb327-5" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb327-6"><a href="c10-sample-designs-replicate-weights.html#cb327-6" tabindex="-1"></a>                <span class="at">rho =</span> <span class="fl">0.3</span>)</span></code></pre></div>
 </div>
 <div id="example-8" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-8" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The 2015 RECS <span class="citation">(<a href="#ref-recs-2015-micro">U.S. Energy Information Administration 2017</a>)</span> uses Fay’s BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 - BRRWT96 and the documentation specifies a Fay’s multiplier of 0.5. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGOINC is the Census region. We have already pulled in the 2015 RECS data from the {srvyrexploR} package that provides data for this book. To specify the design for the <code>recs_2015</code> data, use the following syntax:</p>
-<div class="sourceCode" id="cb327"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb327-1"><a href="c10-sample-designs-replicate-weights.html#cb327-1" tabindex="-1"></a>recs_2015_des <span class="ot">&lt;-</span> recs_2015 <span class="sc">%&gt;%</span></span>
-<span id="cb327-2"><a href="c10-sample-designs-replicate-weights.html#cb327-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb327-3"><a href="c10-sample-designs-replicate-weights.html#cb327-3" tabindex="-1"></a>                <span class="at">repweights =</span> BRRWT1<span class="sc">:</span>BRRWT96,</span>
-<span id="cb327-4"><a href="c10-sample-designs-replicate-weights.html#cb327-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;Fay&quot;</span>,</span>
-<span id="cb327-5"><a href="c10-sample-designs-replicate-weights.html#cb327-5" tabindex="-1"></a>                <span class="at">rho =</span> <span class="fl">0.5</span>,</span>
-<span id="cb327-6"><a href="c10-sample-designs-replicate-weights.html#cb327-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb327-7"><a href="c10-sample-designs-replicate-weights.html#cb327-7" tabindex="-1"></a>                <span class="at">variables =</span> <span class="fu">c</span>(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC))</span>
-<span id="cb327-8"><a href="c10-sample-designs-replicate-weights.html#cb327-8" tabindex="-1"></a></span>
-<span id="cb327-9"><a href="c10-sample-designs-replicate-weights.html#cb327-9" tabindex="-1"></a>recs_2015_des </span></code></pre></div>
+<p>The 2015 RECS <span class="citation">(<a href="#ref-recs-2015-micro">U.S. Energy Information Administration 2017</a>)</span> uses Fay’s BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 - BRRWT96, and the documentation specifies a Fay’s multiplier of 0.5. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total energy cost, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We use the 2015 RECS data from the {srvyrexploR} package that provides data for this book (see the prerequisites box at the beginning of this chapter). To specify the design for the <code>recs_2015</code> data, we use the following syntax:</p>
+<div class="sourceCode" id="cb328"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb328-1"><a href="c10-sample-designs-replicate-weights.html#cb328-1" tabindex="-1"></a>recs_2015_des <span class="ot">&lt;-</span> recs_2015 <span class="sc">%&gt;%</span></span>
+<span id="cb328-2"><a href="c10-sample-designs-replicate-weights.html#cb328-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb328-3"><a href="c10-sample-designs-replicate-weights.html#cb328-3" tabindex="-1"></a>                <span class="at">repweights =</span> BRRWT1<span class="sc">:</span>BRRWT96,</span>
+<span id="cb328-4"><a href="c10-sample-designs-replicate-weights.html#cb328-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;Fay&quot;</span>,</span>
+<span id="cb328-5"><a href="c10-sample-designs-replicate-weights.html#cb328-5" tabindex="-1"></a>                <span class="at">rho =</span> <span class="fl">0.5</span>,</span>
+<span id="cb328-6"><a href="c10-sample-designs-replicate-weights.html#cb328-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb328-7"><a href="c10-sample-designs-replicate-weights.html#cb328-7" tabindex="-1"></a>                <span class="at">variables =</span> <span class="fu">c</span>(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC))</span>
+<span id="cb328-8"><a href="c10-sample-designs-replicate-weights.html#cb328-8" tabindex="-1"></a></span>
+<span id="cb328-9"><a href="c10-sample-designs-replicate-weights.html#cb328-9" tabindex="-1"></a>recs_2015_des </span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances.
 ## Sampling variables:
@@ -1169,7 +1173,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-8" class="
 ##   - weights: NWEIGHT 
 ## Data variables: 
 ##   - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl)</code></pre>
-<div class="sourceCode" id="cb329"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb329-1"><a href="c10-sample-designs-replicate-weights.html#cb329-1" tabindex="-1"></a><span class="fu">summary</span>(recs_2015_des) </span></code></pre></div>
+<div class="sourceCode" id="cb330"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb330-1"><a href="c10-sample-designs-replicate-weights.html#cb330-1" tabindex="-1"></a><span class="fu">summary</span>(recs_2015_des) </span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances.
 ## Sampling variables:
@@ -1193,13 +1197,13 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-8" class="
 ##   - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl)
 ## Variables: 
 ## [1] &quot;DOEID&quot;      &quot;TOTALDOL&quot;   &quot;TOTSQFT_EN&quot; &quot;REGIONC&quot;</code></pre>
-<p>In specifying the design, the <code>variables</code> option was also used to include which variables might be used in analyses. This is optional but can make our object smaller and easier to work with. When printing the design object or looking at the summary, the replicate weight type is re-iterated as <code>Fay's variance method (rho= 0.5) with 96 replicates and MSE variances</code>, and the variables are included. No weight or probability summary is included in this output as we have seen in some other design objects.</p>
+<p>In specifying the design, we also used the <code>variables</code> option to select the variables that might be used in analyses. This is optional but can make our object smaller and easier to work with. When printing the design object or looking at the summary, the replicate weight type is reiterated as <code>Fay's variance method (rho= 0.5) with 96 replicates and MSE variances</code>, and the variables are included. Unlike some other design objects we have seen, this output includes no weight or probability summary.</p>
 </div>
 </div>
-<div id="jackknife-method" class="section level3 hasAnchor" number="10.4.3">
-<h3><span class="header-section-number">10.4.3</span> Jackknife method<a href="c10-sample-designs-replicate-weights.html#jackknife-method" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>There are three jackknife estimators implemented in {srvyr} - jackknife 1 (JK1), jackknife n (JKn), and jackknife 2 (JK2). The JK1 method can be used for unstratified designs, and replicates are created by removing one PSU at a time so the number of replicates is the same as the number of PSUs. If there is no clustering, then the PSU is the ultimate sampling unit (e.g., unit).</p>
-<p>The JKn method is used for stratified designs and requires two or more PSUs per stratum. In this case, each replicate is created by deleting one PSU from a single stratum, so the number of replicates is the number of total PSUs across all strata. The JK2 method is a special case of JKn when there are exactly 2 PSUs sampled per stratum. For variance estimation, scaling constants must also be specified.</p>
+<div id="samp-jackknife" class="section level3 hasAnchor" number="10.4.3">
+<h3><span class="header-section-number">10.4.3</span> Jackknife method<a href="c10-sample-designs-replicate-weights.html#samp-jackknife" class="anchor-section" aria-label="Anchor link to header"></a></h3>
+<p>There are three jackknife estimators implemented in {srvyr}: jackknife 1 (JK1), jackknife n (JKn), and jackknife 2 (JK2). The JK1 method can be used for unstratified designs, and replicates are created by removing one PSU at a time, so the number of replicates is the same as the number of PSUs. If there is no clustering, then the PSU is the ultimate sampling unit (e.g., students).</p>
+<p>The JKn method is used for stratified designs and requires two or more PSUs per stratum. In this case, each replicate is created by deleting one PSU from a single stratum, so the number of replicates is the number of total PSUs across all strata. The JK2 method is a special case of JKn when there are exactly 2 PSUs sampled per stratum. For variance estimation, we also need to specify the scaling constants.</p>
 <div id="the-math-6" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-6" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>Using the generic notation above, <span class="math inline">\(\alpha=\frac{R-1}{R}\)</span> and <span class="math inline">\(\alpha_r=1 \text{ for all } r\)</span>. For the JK1 method, the standard error estimate for <span class="math inline">\(\hat{\theta}\)</span> is calculated as:</p>
@@ -1209,36 +1213,36 @@ <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-6" class
 </div>
 <div id="the-syntax-6" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-6" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>To specify the jackknife method, we use the survey documentation to understand the type of jackknife (1, n, or 2) and the multiplier. In the syntax we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as jackknife 1 (<code>type = "JK1"</code>), n (<code>type = "JKN"</code>), or 2 (<code>type = "JK2"</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and the multiplier (<code>scale</code>). For example, if the survey is a jackknife 1 method with a multiplier of <span class="math inline">\(\alpha_r=(R-1)/R=19/20=0.95\)</span>, the dataset has WT0 for the main weight and 20 replicate weights indicated as WT1, WT2, …, WT20, use the following syntax:</p>
-<div class="sourceCode" id="cb331"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb331-1"><a href="c10-sample-designs-replicate-weights.html#cb331-1" tabindex="-1"></a>jk1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb331-2"><a href="c10-sample-designs-replicate-weights.html#cb331-2" tabindex="-1"></a> <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0, </span>
-<span id="cb331-3"><a href="c10-sample-designs-replicate-weights.html#cb331-3" tabindex="-1"></a>               <span class="at">repweights=</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
-<span id="cb331-4"><a href="c10-sample-designs-replicate-weights.html#cb331-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;JK1&quot;</span>, </span>
-<span id="cb331-5"><a href="c10-sample-designs-replicate-weights.html#cb331-5" tabindex="-1"></a>               <span class="at">mse=</span><span class="cn">TRUE</span>, </span>
-<span id="cb331-6"><a href="c10-sample-designs-replicate-weights.html#cb331-6" tabindex="-1"></a>               <span class="at">scale=</span><span class="fl">0.95</span>)</span></code></pre></div>
-<p>For a jackknife n method, we need to specify the multiplier for all replicates. In this case we use the <code>rscales</code> argument to specify each one. The documentation will provide details on what the multipliers (<span class="math inline">\(\alpha_r\)</span>) are, and they may be the same for all replicates. For example, consider a case where <span class="math inline">\(\alpha_r=0.1\)</span> for all replicates and the dataset had WT0 for the main weight and had 20 replicate weights indicated as WT1, WT2, …, WT20. We specify the type as <code>type = "JKN"</code>, and the multiplier as <code>rscales=rep(0.1,20)</code>:</p>
-<div class="sourceCode" id="cb332"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb332-1"><a href="c10-sample-designs-replicate-weights.html#cb332-1" tabindex="-1"></a>jkn_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<p>To specify the jackknife method, we use the survey documentation to understand the type of jackknife (1, n, or 2) and the multiplier. In the syntax, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as jackknife 1 (<code>type = "JK1"</code>), n (<code>type = "JKN"</code>), or 2 (<code>type = "JK2"</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and the multiplier (<code>scale</code>). For example, if the survey uses a jackknife 1 method with a multiplier of <span class="math inline">\(\alpha=(R-1)/R=19/20=0.95\)</span>, and the dataset has WT0 for the main weight and 20 replicate weights indicated as WT1, WT2, …, WT20, we use the following syntax:</p>
+<div class="sourceCode" id="cb332"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb332-1"><a href="c10-sample-designs-replicate-weights.html#cb332-1" tabindex="-1"></a>jk1_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
 <span id="cb332-2"><a href="c10-sample-designs-replicate-weights.html#cb332-2" tabindex="-1"></a> <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0, </span>
 <span id="cb332-3"><a href="c10-sample-designs-replicate-weights.html#cb332-3" tabindex="-1"></a>               <span class="at">repweights=</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
-<span id="cb332-4"><a href="c10-sample-designs-replicate-weights.html#cb332-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;JKN&quot;</span>, </span>
+<span id="cb332-4"><a href="c10-sample-designs-replicate-weights.html#cb332-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;JK1&quot;</span>, </span>
 <span id="cb332-5"><a href="c10-sample-designs-replicate-weights.html#cb332-5" tabindex="-1"></a>               <span class="at">mse=</span><span class="cn">TRUE</span>, </span>
-<span id="cb332-6"><a href="c10-sample-designs-replicate-weights.html#cb332-6" tabindex="-1"></a>               <span class="at">rscales=</span><span class="fu">rep</span>(<span class="fl">0.1</span>, <span class="dv">20</span>))</span></code></pre></div>
+<span id="cb332-6"><a href="c10-sample-designs-replicate-weights.html#cb332-6" tabindex="-1"></a>               <span class="at">scale=</span><span class="fl">0.95</span>)</span></code></pre></div>
+<p>For a jackknife n method, we need to specify the multiplier for all replicates. In this case, we use the <code>rscales</code> argument to specify each one. The documentation provides details on what the multipliers (<span class="math inline">\(\alpha_r\)</span>) are, and they may be the same for all replicates. For example, consider a case where <span class="math inline">\(\alpha_r=0.1\)</span> for all replicates, and the dataset had WT0 for the main weight and had 20 replicate weights indicated as WT1, WT2, …, WT20. We specify the type as <code>type = "JKN"</code>, and the multiplier as <code>rscales=rep(0.1,20)</code>:</p>
+<div class="sourceCode" id="cb333"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb333-1"><a href="c10-sample-designs-replicate-weights.html#cb333-1" tabindex="-1"></a>jkn_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb333-2"><a href="c10-sample-designs-replicate-weights.html#cb333-2" tabindex="-1"></a> <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0, </span>
+<span id="cb333-3"><a href="c10-sample-designs-replicate-weights.html#cb333-3" tabindex="-1"></a>               <span class="at">repweights=</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
+<span id="cb333-4"><a href="c10-sample-designs-replicate-weights.html#cb333-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;JKN&quot;</span>, </span>
+<span id="cb333-5"><a href="c10-sample-designs-replicate-weights.html#cb333-5" tabindex="-1"></a>               <span class="at">mse=</span><span class="cn">TRUE</span>, </span>
+<span id="cb333-6"><a href="c10-sample-designs-replicate-weights.html#cb333-6" tabindex="-1"></a>               <span class="at">rscales=</span><span class="fu">rep</span>(<span class="fl">0.1</span>, <span class="dv">20</span>))</span></code></pre></div>
 </div>
 <div id="example-9" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-9" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>The 2020 RECS <span class="citation">(<a href="#ref-recs-2020-micro">U.S. Energy Information Administration 2023c</a>)</span> uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 - NWEIGHT60 with a scale of <span class="math inline">\((R-1)/R=59/60\)</span>. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGOINC is the Census region. We have already read in the RECS data and created a dataset called <code>recs_2020</code> above in the prerequisites.</p>
-<p>To specify this design, use the following syntax:</p>
-<div class="sourceCode" id="cb333"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb333-1"><a href="c10-sample-designs-replicate-weights.html#cb333-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb333-2"><a href="c10-sample-designs-replicate-weights.html#cb333-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb333-3"><a href="c10-sample-designs-replicate-weights.html#cb333-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb333-4"><a href="c10-sample-designs-replicate-weights.html#cb333-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb333-5"><a href="c10-sample-designs-replicate-weights.html#cb333-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb333-6"><a href="c10-sample-designs-replicate-weights.html#cb333-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
-<span id="cb333-7"><a href="c10-sample-designs-replicate-weights.html#cb333-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb333-8"><a href="c10-sample-designs-replicate-weights.html#cb333-8" tabindex="-1"></a>    <span class="at">variables =</span> <span class="fu">c</span>(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC)</span>
-<span id="cb333-9"><a href="c10-sample-designs-replicate-weights.html#cb333-9" tabindex="-1"></a>  )</span>
-<span id="cb333-10"><a href="c10-sample-designs-replicate-weights.html#cb333-10" tabindex="-1"></a></span>
-<span id="cb333-11"><a href="c10-sample-designs-replicate-weights.html#cb333-11" tabindex="-1"></a>recs_des</span></code></pre></div>
+<p>The 2020 RECS <span class="citation">(<a href="#ref-recs-2020-micro">U.S. Energy Information Administration 2023c</a>)</span> uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 - NWEIGHT60 with a scale of <span class="math inline">\((R-1)/R=59/60\)</span>. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We use the 2020 RECS data from the {srvyrexploR} package that provides data for this book (see the prerequisites box at the beginning of this chapter).</p>
+<p>To specify this design, we use the following syntax:</p>
+<div class="sourceCode" id="cb334"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb334-1"><a href="c10-sample-designs-replicate-weights.html#cb334-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb334-2"><a href="c10-sample-designs-replicate-weights.html#cb334-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb334-3"><a href="c10-sample-designs-replicate-weights.html#cb334-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb334-4"><a href="c10-sample-designs-replicate-weights.html#cb334-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb334-5"><a href="c10-sample-designs-replicate-weights.html#cb334-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb334-6"><a href="c10-sample-designs-replicate-weights.html#cb334-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb334-7"><a href="c10-sample-designs-replicate-weights.html#cb334-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb334-8"><a href="c10-sample-designs-replicate-weights.html#cb334-8" tabindex="-1"></a>    <span class="at">variables =</span> <span class="fu">c</span>(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC)</span>
+<span id="cb334-9"><a href="c10-sample-designs-replicate-weights.html#cb334-9" tabindex="-1"></a>  )</span>
+<span id="cb334-10"><a href="c10-sample-designs-replicate-weights.html#cb334-10" tabindex="-1"></a></span>
+<span id="cb334-11"><a href="c10-sample-designs-replicate-weights.html#cb334-11" tabindex="-1"></a>recs_des</span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances.
 ## Sampling variables:
@@ -1257,7 +1261,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-9" class="
 ##   - weights: NWEIGHT 
 ## Data variables: 
 ##   - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (chr)</code></pre>
-<div class="sourceCode" id="cb335"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb335-1"><a href="c10-sample-designs-replicate-weights.html#cb335-1" tabindex="-1"></a><span class="fu">summary</span>(recs_des)</span></code></pre></div>
+<div class="sourceCode" id="cb336"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb336-1"><a href="c10-sample-designs-replicate-weights.html#cb336-1" tabindex="-1"></a><span class="fu">summary</span>(recs_des)</span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances.
 ## Sampling variables:
@@ -1283,50 +1287,50 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-9" class="
 </div>
 <div id="bootstrap-method" class="section level3 hasAnchor" number="10.4.4">
 <h3><span class="header-section-number">10.4.4</span> Bootstrap method<a href="c10-sample-designs-replicate-weights.html#bootstrap-method" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>In bootstrap resampling, replicates are created by selecting random samples of the PSUs with replacement (SRSWR). If there are <span class="math inline">\(M\)</span> PSUs in the sample, then each replicate will be created by selecting a random sample of <span class="math inline">\(M\)</span> PSUs with replacement. Each replicate is created independently, and the weights for each replicate are adjusted to reflect the population, generally using the same method as how the analysis weight was adjusted.</p>
+<p>In bootstrap resampling, replicates are created by selecting random samples of the PSUs with replacement (SRSWR). If there are <span class="math inline">\(A\)</span> PSUs in the sample, then each replicate is created by selecting a random sample of <span class="math inline">\(A\)</span> PSUs with replacement. Each replicate is created independently, and the weights for each replicate are adjusted to reflect the population, generally using the same method used to adjust the analysis weight.</p>
 <div id="the-math-7" class="section level4 unnumbered hasAnchor">
 <h4>The math<a href="c10-sample-designs-replicate-weights.html#the-math-7" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>A weighted estimate for the full sample is calculated as <span class="math inline">\(\hat{\theta}\)</span>, and then a weighted estimate for each replicate is calculated as <span class="math inline">\(\hat{\theta}_r\)</span> for <span class="math inline">\(R\)</span> replicates. Then the standard error of the estimate is calculated as follows:</p>
 <p><span class="math display">\[se(\hat{\theta})=\sqrt{\alpha \sum_{r=1}^R \left( \hat{\theta}_r-\hat{\theta}\right)^2}\]</span></p>
-<p>where <span class="math inline">\(\alpha\)</span> is the scaling constant. Note that the scaling constant (<span class="math inline">\(\alpha\)</span>) is provided in the survey documentation as there are many types of bootstrap methods which generate custom scaling constants.</p>
+<p>where <span class="math inline">\(\alpha\)</span> is the scaling constant. Note that the scaling constant (<span class="math inline">\(\alpha\)</span>) is provided in the survey documentation, as there are many types of bootstrap methods that generate custom scaling constants.</p>
 </div>
 <div id="the-syntax-7" class="section level4 unnumbered hasAnchor">
 <h4>The syntax<a href="c10-sample-designs-replicate-weights.html#the-syntax-7" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>To specify a bootstrap method, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as bootstrap (<code>type = "bootstrap"</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and the multiplier (<code>scale</code>). For example, if a dataset had WT0 for the main weight, 20 bootstrap weights indicated WT1, WT2, …, WT20, and a multiplier of <span class="math inline">\(\alpha=.02\)</span>, use the following syntax:</p>
-<div class="sourceCode" id="cb337"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb337-1"><a href="c10-sample-designs-replicate-weights.html#cb337-1" tabindex="-1"></a>bs_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
-<span id="cb337-2"><a href="c10-sample-designs-replicate-weights.html#cb337-2" tabindex="-1"></a> <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0, </span>
-<span id="cb337-3"><a href="c10-sample-designs-replicate-weights.html#cb337-3" tabindex="-1"></a>               <span class="at">repweights=</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
-<span id="cb337-4"><a href="c10-sample-designs-replicate-weights.html#cb337-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;bootstrap&quot;</span>, </span>
-<span id="cb337-5"><a href="c10-sample-designs-replicate-weights.html#cb337-5" tabindex="-1"></a>               <span class="at">mse=</span><span class="cn">TRUE</span>, </span>
-<span id="cb337-6"><a href="c10-sample-designs-replicate-weights.html#cb337-6" tabindex="-1"></a>               <span class="at">scale=</span>.<span class="dv">02</span>)</span></code></pre></div>
+<p>To specify a bootstrap method, we need to specify the weight variable (<code>weights</code>), the replicate weight variables (<code>repweights</code>), the type of replicate weights as bootstrap (<code>type = "bootstrap"</code>), whether the mean squared error should be used (<code>mse = TRUE</code>) or not (<code>mse = FALSE</code>), and the multiplier (<code>scale</code>). For example, if a dataset had WT0 for the main weight, 20 bootstrap weights indicated as WT1, WT2, …, WT20, and a multiplier of <span class="math inline">\(\alpha=.02\)</span>, we use the following syntax:</p>
+<div class="sourceCode" id="cb338"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb338-1"><a href="c10-sample-designs-replicate-weights.html#cb338-1" tabindex="-1"></a>bs_des <span class="ot">&lt;-</span> dat <span class="sc">%&gt;%</span></span>
+<span id="cb338-2"><a href="c10-sample-designs-replicate-weights.html#cb338-2" tabindex="-1"></a> <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> WT0, </span>
+<span id="cb338-3"><a href="c10-sample-designs-replicate-weights.html#cb338-3" tabindex="-1"></a>               <span class="at">repweights=</span> <span class="fu">num_range</span>(<span class="st">&quot;WT&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">20</span>),</span>
+<span id="cb338-4"><a href="c10-sample-designs-replicate-weights.html#cb338-4" tabindex="-1"></a>               <span class="at">type=</span><span class="st">&quot;bootstrap&quot;</span>, </span>
+<span id="cb338-5"><a href="c10-sample-designs-replicate-weights.html#cb338-5" tabindex="-1"></a>               <span class="at">mse=</span><span class="cn">TRUE</span>, </span>
+<span id="cb338-6"><a href="c10-sample-designs-replicate-weights.html#cb338-6" tabindex="-1"></a>               <span class="at">scale=</span>.<span class="dv">02</span>)</span></code></pre></div>
 </div>
 <div id="example-10" class="section level4 unnumbered hasAnchor">
 <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-10" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Returning to the api example, we are going to create a dataset with bootstrap weights to use as an example. In this example, we construct a one-cluster design with fifty replicate weights.<a href="#fn26" class="footnote-ref" id="fnref26"><sup>26</sup></a></p>
-<div class="sourceCode" id="cb338"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb338-1"><a href="c10-sample-designs-replicate-weights.html#cb338-1" tabindex="-1"></a>apiclus1_slim <span class="ot">&lt;-</span></span>
-<span id="cb338-2"><a href="c10-sample-designs-replicate-weights.html#cb338-2" tabindex="-1"></a>  apiclus1 <span class="sc">%&gt;%</span></span>
-<span id="cb338-3"><a href="c10-sample-designs-replicate-weights.html#cb338-3" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb338-4"><a href="c10-sample-designs-replicate-weights.html#cb338-4" tabindex="-1"></a>  <span class="fu">arrange</span>(dnum) <span class="sc">%&gt;%</span></span>
-<span id="cb338-5"><a href="c10-sample-designs-replicate-weights.html#cb338-5" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, fpc, pw)</span>
-<span id="cb338-6"><a href="c10-sample-designs-replicate-weights.html#cb338-6" tabindex="-1"></a></span>
-<span id="cb338-7"><a href="c10-sample-designs-replicate-weights.html#cb338-7" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">662152</span>)</span>
-<span id="cb338-8"><a href="c10-sample-designs-replicate-weights.html#cb338-8" tabindex="-1"></a>apibw <span class="ot">&lt;-</span></span>
-<span id="cb338-9"><a href="c10-sample-designs-replicate-weights.html#cb338-9" tabindex="-1"></a>  <span class="fu">bootweights</span>(<span class="at">psu =</span> apiclus1_slim<span class="sc">$</span>dnum,</span>
-<span id="cb338-10"><a href="c10-sample-designs-replicate-weights.html#cb338-10" tabindex="-1"></a>              <span class="at">strata =</span> <span class="fu">rep</span>(<span class="dv">1</span>, <span class="fu">nrow</span>(apiclus1_slim)),</span>
-<span id="cb338-11"><a href="c10-sample-designs-replicate-weights.html#cb338-11" tabindex="-1"></a>              <span class="at">fpc =</span> apiclus1_slim<span class="sc">$</span>fpc,</span>
-<span id="cb338-12"><a href="c10-sample-designs-replicate-weights.html#cb338-12" tabindex="-1"></a>              <span class="at">replicates =</span> <span class="dv">50</span>)</span>
-<span id="cb338-13"><a href="c10-sample-designs-replicate-weights.html#cb338-13" tabindex="-1"></a></span>
-<span id="cb338-14"><a href="c10-sample-designs-replicate-weights.html#cb338-14" tabindex="-1"></a>bwmata <span class="ot">&lt;-</span></span>
-<span id="cb338-15"><a href="c10-sample-designs-replicate-weights.html#cb338-15" tabindex="-1"></a>  apibw<span class="sc">$</span>repweights<span class="sc">$</span>weights[apibw<span class="sc">$</span>repweights<span class="sc">$</span>index,] <span class="sc">*</span> apiclus1_slim<span class="sc">$</span>pw</span>
-<span id="cb338-16"><a href="c10-sample-designs-replicate-weights.html#cb338-16" tabindex="-1"></a></span>
-<span id="cb338-17"><a href="c10-sample-designs-replicate-weights.html#cb338-17" tabindex="-1"></a>apiclus1_slim <span class="ot">&lt;-</span> bwmata <span class="sc">%&gt;%</span></span>
-<span id="cb338-18"><a href="c10-sample-designs-replicate-weights.html#cb338-18" tabindex="-1"></a>  <span class="fu">as.data.frame</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb338-19"><a href="c10-sample-designs-replicate-weights.html#cb338-19" tabindex="-1"></a>  <span class="fu">set_names</span>(<span class="fu">str_c</span>(<span class="st">&quot;pw&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">50</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb338-20"><a href="c10-sample-designs-replicate-weights.html#cb338-20" tabindex="-1"></a>  <span class="fu">cbind</span>(apiclus1_slim) <span class="sc">%&gt;%</span></span>
-<span id="cb338-21"><a href="c10-sample-designs-replicate-weights.html#cb338-21" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb338-22"><a href="c10-sample-designs-replicate-weights.html#cb338-22" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, fpc, pw, <span class="fu">everything</span>())</span>
-<span id="cb338-23"><a href="c10-sample-designs-replicate-weights.html#cb338-23" tabindex="-1"></a></span>
-<span id="cb338-24"><a href="c10-sample-designs-replicate-weights.html#cb338-24" tabindex="-1"></a>apiclus1_slim</span></code></pre></div>
+<p>Returning to the api example, we are going to create a dataset with bootstrap weights to use as an example. In this example, we construct a one-cluster design with fifty replicate weights.<a href="#fn27" class="footnote-ref" id="fnref27"><sup>27</sup></a></p>
+<div class="sourceCode" id="cb339"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb339-1"><a href="c10-sample-designs-replicate-weights.html#cb339-1" tabindex="-1"></a>apiclus1_slim <span class="ot">&lt;-</span></span>
+<span id="cb339-2"><a href="c10-sample-designs-replicate-weights.html#cb339-2" tabindex="-1"></a>  apiclus1 <span class="sc">%&gt;%</span></span>
+<span id="cb339-3"><a href="c10-sample-designs-replicate-weights.html#cb339-3" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb339-4"><a href="c10-sample-designs-replicate-weights.html#cb339-4" tabindex="-1"></a>  <span class="fu">arrange</span>(dnum) <span class="sc">%&gt;%</span></span>
+<span id="cb339-5"><a href="c10-sample-designs-replicate-weights.html#cb339-5" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, fpc, pw)</span>
+<span id="cb339-6"><a href="c10-sample-designs-replicate-weights.html#cb339-6" tabindex="-1"></a></span>
+<span id="cb339-7"><a href="c10-sample-designs-replicate-weights.html#cb339-7" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">662152</span>)</span>
+<span id="cb339-8"><a href="c10-sample-designs-replicate-weights.html#cb339-8" tabindex="-1"></a>apibw <span class="ot">&lt;-</span></span>
+<span id="cb339-9"><a href="c10-sample-designs-replicate-weights.html#cb339-9" tabindex="-1"></a>  <span class="fu">bootweights</span>(<span class="at">psu =</span> apiclus1_slim<span class="sc">$</span>dnum,</span>
+<span id="cb339-10"><a href="c10-sample-designs-replicate-weights.html#cb339-10" tabindex="-1"></a>              <span class="at">strata =</span> <span class="fu">rep</span>(<span class="dv">1</span>, <span class="fu">nrow</span>(apiclus1_slim)),</span>
+<span id="cb339-11"><a href="c10-sample-designs-replicate-weights.html#cb339-11" tabindex="-1"></a>              <span class="at">fpc =</span> apiclus1_slim<span class="sc">$</span>fpc,</span>
+<span id="cb339-12"><a href="c10-sample-designs-replicate-weights.html#cb339-12" tabindex="-1"></a>              <span class="at">replicates =</span> <span class="dv">50</span>)</span>
+<span id="cb339-13"><a href="c10-sample-designs-replicate-weights.html#cb339-13" tabindex="-1"></a></span>
+<span id="cb339-14"><a href="c10-sample-designs-replicate-weights.html#cb339-14" tabindex="-1"></a>bwmata <span class="ot">&lt;-</span></span>
+<span id="cb339-15"><a href="c10-sample-designs-replicate-weights.html#cb339-15" tabindex="-1"></a>  apibw<span class="sc">$</span>repweights<span class="sc">$</span>weights[apibw<span class="sc">$</span>repweights<span class="sc">$</span>index,] <span class="sc">*</span> apiclus1_slim<span class="sc">$</span>pw</span>
+<span id="cb339-16"><a href="c10-sample-designs-replicate-weights.html#cb339-16" tabindex="-1"></a></span>
+<span id="cb339-17"><a href="c10-sample-designs-replicate-weights.html#cb339-17" tabindex="-1"></a>apiclus1_slim <span class="ot">&lt;-</span> bwmata <span class="sc">%&gt;%</span></span>
+<span id="cb339-18"><a href="c10-sample-designs-replicate-weights.html#cb339-18" tabindex="-1"></a>  <span class="fu">as.data.frame</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb339-19"><a href="c10-sample-designs-replicate-weights.html#cb339-19" tabindex="-1"></a>  <span class="fu">set_names</span>(<span class="fu">str_c</span>(<span class="st">&quot;pw&quot;</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">50</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb339-20"><a href="c10-sample-designs-replicate-weights.html#cb339-20" tabindex="-1"></a>  <span class="fu">cbind</span>(apiclus1_slim) <span class="sc">%&gt;%</span></span>
+<span id="cb339-21"><a href="c10-sample-designs-replicate-weights.html#cb339-21" tabindex="-1"></a>  <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb339-22"><a href="c10-sample-designs-replicate-weights.html#cb339-22" tabindex="-1"></a>  <span class="fu">select</span>(cds, dnum, fpc, pw, <span class="fu">everything</span>())</span>
+<span id="cb339-23"><a href="c10-sample-designs-replicate-weights.html#cb339-23" tabindex="-1"></a></span>
+<span id="cb339-24"><a href="c10-sample-designs-replicate-weights.html#cb339-24" tabindex="-1"></a>apiclus1_slim</span></code></pre></div>
 <pre><code>## # A tibble: 183 × 54
 ##    cds        dnum   fpc    pw   pw1   pw2   pw3   pw4   pw5   pw6   pw7
 ##    &lt;chr&gt;     &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
@@ -1347,17 +1351,17 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-10" class=
 ## #   pw22 &lt;dbl&gt;, pw23 &lt;dbl&gt;, pw24 &lt;dbl&gt;, pw25 &lt;dbl&gt;, pw26 &lt;dbl&gt;,
 ## #   pw27 &lt;dbl&gt;, pw28 &lt;dbl&gt;, pw29 &lt;dbl&gt;, pw30 &lt;dbl&gt;, pw31 &lt;dbl&gt;,
 ## #   pw32 &lt;dbl&gt;, pw33 &lt;dbl&gt;, pw34 &lt;dbl&gt;, pw35 &lt;dbl&gt;, pw36 &lt;dbl&gt;, …</code></pre>
-<p>The output of <code>apiclus1_slim</code> includes the same variables we have seen in other api examples (see Table <a href="c10-sample-designs-replicate-weights.html#tab:apidata">10.1</a>), but now additionally includes bootstrap weights <code>pw1</code>, …, <code>pw50</code>. When creating the survey design object, we use the bootstrap weights as the replicate weights. Additionally, with replicate weights we need to include the scale (<span class="math inline">\(\alpha\)</span>). For this example we created,</p>
-<p><span class="math display">\[\alpha=\frac{M}{(M-1)(R-1)}=\frac{15}{(15-1)*(50-1)}=0.02186589\]</span>
-where <span class="math inline">\(M\)</span> is the average number of PSUs per strata and <span class="math inline">\(R\)</span> is the number of replicates. There is only 1 stratum and the number of clusters/PSUs is 15 so <span class="math inline">\(M=15\)</span>.</p>
-<div class="sourceCode" id="cb340"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb340-1"><a href="c10-sample-designs-replicate-weights.html#cb340-1" tabindex="-1"></a>api1_bs_des <span class="ot">&lt;-</span> apiclus1_slim <span class="sc">%&gt;%</span></span>
-<span id="cb340-2"><a href="c10-sample-designs-replicate-weights.html#cb340-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> pw,</span>
-<span id="cb340-3"><a href="c10-sample-designs-replicate-weights.html#cb340-3" tabindex="-1"></a>                <span class="at">repweights =</span> pw1<span class="sc">:</span>pw50,</span>
-<span id="cb340-4"><a href="c10-sample-designs-replicate-weights.html#cb340-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;bootstrap&quot;</span>,</span>
-<span id="cb340-5"><a href="c10-sample-designs-replicate-weights.html#cb340-5" tabindex="-1"></a>                <span class="at">scale =</span> <span class="fl">0.02186589</span>,</span>
-<span id="cb340-6"><a href="c10-sample-designs-replicate-weights.html#cb340-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb340-7"><a href="c10-sample-designs-replicate-weights.html#cb340-7" tabindex="-1"></a></span>
-<span id="cb340-8"><a href="c10-sample-designs-replicate-weights.html#cb340-8" tabindex="-1"></a>api1_bs_des </span></code></pre></div>
+<p>The output of <code>apiclus1_slim</code> includes the same variables we have seen in other api examples (see Table <a href="c10-sample-designs-replicate-weights.html#tab:apidata">10.1</a>), but now additionally includes bootstrap weights <code>pw1</code>, …, <code>pw50</code>. When creating the survey design object, we use the bootstrap weights as the replicate weights. Additionally, with replicate weights, we need to include the scale (<span class="math inline">\(\alpha\)</span>). For this example, the scale is:</p>
+<p><span class="math display">\[\alpha=\frac{A}{(A-1)(R-1)}=\frac{15}{(15-1)(50-1)}=0.02186589\]</span>
+where <span class="math inline">\(A\)</span> is the average number of PSUs per stratum and <span class="math inline">\(R\)</span> is the number of replicates. There is only 1 stratum and the number of clusters/PSUs is 15, so <span class="math inline">\(A=15\)</span>. Using this information, we specify the design object as:</p>
+<div class="sourceCode" id="cb341"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb341-1"><a href="c10-sample-designs-replicate-weights.html#cb341-1" tabindex="-1"></a>api1_bs_des <span class="ot">&lt;-</span> apiclus1_slim <span class="sc">%&gt;%</span></span>
+<span id="cb341-2"><a href="c10-sample-designs-replicate-weights.html#cb341-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(<span class="at">weights =</span> pw,</span>
+<span id="cb341-3"><a href="c10-sample-designs-replicate-weights.html#cb341-3" tabindex="-1"></a>                <span class="at">repweights =</span> pw1<span class="sc">:</span>pw50,</span>
+<span id="cb341-4"><a href="c10-sample-designs-replicate-weights.html#cb341-4" tabindex="-1"></a>                <span class="at">type =</span> <span class="st">&quot;bootstrap&quot;</span>,</span>
+<span id="cb341-5"><a href="c10-sample-designs-replicate-weights.html#cb341-5" tabindex="-1"></a>                <span class="at">scale =</span> <span class="fl">0.02186589</span>,</span>
+<span id="cb341-6"><a href="c10-sample-designs-replicate-weights.html#cb341-6" tabindex="-1"></a>                <span class="at">mse =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb341-7"><a href="c10-sample-designs-replicate-weights.html#cb341-7" tabindex="-1"></a></span>
+<span id="cb341-8"><a href="c10-sample-designs-replicate-weights.html#cb341-8" tabindex="-1"></a>api1_bs_des </span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Survey bootstrap with 50 replicates and MSE variances.
 ## Sampling variables:
@@ -1379,7 +1383,7 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-10" class=
 ##     (dbl), pw37 (dbl), pw38 (dbl), pw39 (dbl), pw40 (dbl), pw41 (dbl),
 ##     pw42 (dbl), pw43 (dbl), pw44 (dbl), pw45 (dbl), pw46 (dbl), pw47
 ##     (dbl), pw48 (dbl), pw49 (dbl), pw50 (dbl)</code></pre>
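+<p>The scale value passed to <code>as_survey_rep()</code> above can also be computed instead of typed by hand. The following is only a minimal sketch: <code>bootstrap_scale()</code> is a hypothetical helper written for illustration (it is not part of {survey} or {srvyr}), and it assumes a single stratum with 15 PSUs and 50 replicates, as in this example.</p>

```r
# Hypothetical helper (not from {survey} or {srvyr}): computes the bootstrap
# scale alpha = A / ((A - 1) * (R - 1)), where A is the average number of
# PSUs per stratum and R is the number of replicates.
bootstrap_scale <- function(n_psu, n_reps) {
  stopifnot(n_psu > 1, n_reps > 1)
  n_psu / ((n_psu - 1) * (n_reps - 1))
}

# Values assumed from the example: 1 stratum, 15 PSUs, 50 replicates
alpha <- bootstrap_scale(n_psu = 15, n_reps = 50)
round(alpha, 8)
```

+<p>Under these assumptions, the rounded result matches the value 0.02186589 supplied to <code>scale</code> in the design object.</p>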
-<div class="sourceCode" id="cb342"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb342-1"><a href="c10-sample-designs-replicate-weights.html#cb342-1" tabindex="-1"></a><span class="fu">summary</span>(api1_bs_des) </span></code></pre></div>
+<div class="sourceCode" id="cb343"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb343-1"><a href="c10-sample-designs-replicate-weights.html#cb343-1" tabindex="-1"></a><span class="fu">summary</span>(api1_bs_des) </span></code></pre></div>
 <pre><code>## Call: Called via srvyr
 ## Survey bootstrap with 50 replicates and MSE variances.
 ## Sampling variables:
@@ -1416,8 +1420,8 @@ <h4>Example<a href="c10-sample-designs-replicate-weights.html#example-10" class=
 <h2><span class="header-section-number">10.5</span> Exercises<a href="c10-sample-designs-replicate-weights.html#exercises-2" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>For this chapter, the exercises entail reading public documentation to determine how to specify the survey design. While reading the documentation, be on the lookout for description of the weights and the survey design variables or replicate weights.</p>
 <ol style="list-style-type: decimal">
-<li><p>The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description <span class="citation">(<a href="#ref-nhis-svy-des">National Center for Health Statistics 2023</a>)</span>. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). You have imported the data and the variable containing the data is: <code>nhis_adult_data</code>. How would you specify the design using {srvyr} using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</p></li>
-<li><p>The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R <span class="citation">(<a href="#ref-gss-codebook">Davern et al. 2021</a>)</span>. You have imported the data and the variable containing the data is: <code>gss_data</code>. How would you specify the design in R using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</p></li>
+<li><p>The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults, including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description <span class="citation">(<a href="#ref-nhis-svy-des">National Center for Health Statistics 2023</a>)</span>. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). We have imported the data and stored it in the variable <code>nhis_adult_data</code>. How would we specify the design using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</p></li>
+<li><p>The General Social Survey (GSS) has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R <span class="citation">(<a href="#ref-gss-codebook">Davern et al. 2021</a>)</span>. We have imported the data and stored it in the variable <code>gss_data</code>. How would we specify the design in R using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</p></li>
 </ol>
 
 </div>
@@ -1437,7 +1441,7 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 Deming, W Edwards. 1991. <em>Sample Design in Business Research</em>. Vol. 23. John Wiley &amp; Sons.
 </div>
 <div id="ref-R-srvyr" class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div id="ref-fuller2011sampling" class="csl-entry">
 Fuller, Wayne A. 2011. <em>Sampling Statistics</em>. John Wiley &amp; Sons.
@@ -1457,9 +1461,6 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-sarndal2003model" class="csl-entry">
 Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 2003. <em>Model Assisted Survey Sampling</em>. Springer Science &amp; Business Media.
 </div>
-<div id="ref-R-srvyrexploR" class="csl-entry">
-Stephanie, Zimmer, Powell Rebecca, and Velásquez Isabella. 2024. <em><span class="nocase">srvyrexploR</span>: Data Supplement for Exploring Complex Survey Data Analysis in <span>R</span></em>.
-</div>
 <div id="ref-acs-pums-2021" class="csl-entry">
 U.S. Census Bureau. 2021. <span>“<span class="nocase">Understanding and Using the American Community Survey Public Use Microdata Sample Files What Data Users Need to Know</span>.”</span> U.S. Government Printing Office; <a href="https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf" class="uri">https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf</a>.
 </div>
@@ -1475,11 +1476,14 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-wolter2007introduction" class="csl-entry">
 Wolter, Kirk M. 2007. <em>Introduction to Variance Estimation</em>. Vol. 53. Springer.
 </div>
+<div id="ref-R-srvyrexploR" class="csl-entry">
+Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. <em><span class="nocase">srvyrexploR</span>: Data Supplement for Exploring Complex Survey Data Analysis in <span>R</span></em>.
+</div>
 </div>
 <div class="footnotes">
 <hr />
-<ol start="26">
-<li id="fn26"><p>We provide the code here for you to replicate this example, but are not focusing on the creation of the weights as that is outside the scope of this book. We recommend you reference <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span> for more information on creating bootstrap weights.<a href="c10-sample-designs-replicate-weights.html#fnref26" class="footnote-back">↩︎</a></p></li>
+<ol start="27">
+<li id="fn27"><p>We provide the code here to replicate this example, but are not focusing on the creation of the weights as that is outside the scope of this book. We recommend referencing <span class="citation">Wolter (<a href="#ref-wolter2007introduction">2007</a>)</span> for more information on creating bootstrap weights.<a href="c10-sample-designs-replicate-weights.html#fnref27" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
             </section>
diff --git a/c11-missing-data.html b/c11-missing-data.html
index d7f94502..bce93827 100644
--- a/c11-missing-data.html
+++ b/c11-missing-data.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,62 +524,62 @@ <h3>Prerequisites<a href="c11-missing-data.html#prereq11" class="anchor-section"
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb344"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb344-1"><a href="c11-missing-data.html#cb344-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb344-2"><a href="c11-missing-data.html#cb344-2" tabindex="-1"></a><span class="fu">library</span>(survey) </span>
-<span id="cb344-3"><a href="c11-missing-data.html#cb344-3" tabindex="-1"></a><span class="fu">library</span>(srvyr) </span>
-<span id="cb344-4"><a href="c11-missing-data.html#cb344-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
-<span id="cb344-5"><a href="c11-missing-data.html#cb344-5" tabindex="-1"></a><span class="fu">library</span>(naniar)</span>
-<span id="cb344-6"><a href="c11-missing-data.html#cb344-6" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
-<span id="cb344-7"><a href="c11-missing-data.html#cb344-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span></code></pre></div>
-<p>We will be using data from ANES and RECS. Here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more information).</p>
-<div class="sourceCode" id="cb345"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb345-1"><a href="c11-missing-data.html#cb345-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
-<span id="cb345-2"><a href="c11-missing-data.html#cb345-2" tabindex="-1"></a></span>
-<span id="cb345-3"><a href="c11-missing-data.html#cb345-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb345-4"><a href="c11-missing-data.html#cb345-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
-<span id="cb345-5"><a href="c11-missing-data.html#cb345-5" tabindex="-1"></a></span>
-<span id="cb345-6"><a href="c11-missing-data.html#cb345-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
-<span id="cb345-7"><a href="c11-missing-data.html#cb345-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb345-8"><a href="c11-missing-data.html#cb345-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
-<span id="cb345-9"><a href="c11-missing-data.html#cb345-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
-<span id="cb345-10"><a href="c11-missing-data.html#cb345-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
-<span id="cb345-11"><a href="c11-missing-data.html#cb345-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb345-12"><a href="c11-missing-data.html#cb345-12" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb345"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb345-1"><a href="c11-missing-data.html#cb345-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb345-2"><a href="c11-missing-data.html#cb345-2" tabindex="-1"></a><span class="fu">library</span>(survey) </span>
+<span id="cb345-3"><a href="c11-missing-data.html#cb345-3" tabindex="-1"></a><span class="fu">library</span>(srvyr) </span>
+<span id="cb345-4"><a href="c11-missing-data.html#cb345-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
+<span id="cb345-5"><a href="c11-missing-data.html#cb345-5" tabindex="-1"></a><span class="fu">library</span>(naniar)</span>
+<span id="cb345-6"><a href="c11-missing-data.html#cb345-6" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
+<span id="cb345-7"><a href="c11-missing-data.html#cb345-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span></code></pre></div>
+<p>We are using data from ANES and RECS described in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> for more information.)</p>
+<div class="sourceCode" id="cb346"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb346-1"><a href="c11-missing-data.html#cb346-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
+<span id="cb346-2"><a href="c11-missing-data.html#cb346-2" tabindex="-1"></a></span>
+<span id="cb346-3"><a href="c11-missing-data.html#cb346-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb346-4"><a href="c11-missing-data.html#cb346-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
+<span id="cb346-5"><a href="c11-missing-data.html#cb346-5" tabindex="-1"></a></span>
+<span id="cb346-6"><a href="c11-missing-data.html#cb346-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
+<span id="cb346-7"><a href="c11-missing-data.html#cb346-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb346-8"><a href="c11-missing-data.html#cb346-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
+<span id="cb346-9"><a href="c11-missing-data.html#cb346-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
+<span id="cb346-10"><a href="c11-missing-data.html#cb346-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
+<span id="cb346-11"><a href="c11-missing-data.html#cb346-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb346-12"><a href="c11-missing-data.html#cb346-12" tabindex="-1"></a>  )</span></code></pre></div>
 <p>For RECS, details are included in the RECS documentation and Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>.</p>
-<div class="sourceCode" id="cb346"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb346-1"><a href="c11-missing-data.html#cb346-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb346-2"><a href="c11-missing-data.html#cb346-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb346-3"><a href="c11-missing-data.html#cb346-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb346-4"><a href="c11-missing-data.html#cb346-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb346-5"><a href="c11-missing-data.html#cb346-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb346-6"><a href="c11-missing-data.html#cb346-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
-<span id="cb346-7"><a href="c11-missing-data.html#cb346-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
-<span id="cb346-8"><a href="c11-missing-data.html#cb346-8" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb347"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb347-1"><a href="c11-missing-data.html#cb347-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb347-2"><a href="c11-missing-data.html#cb347-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb347-3"><a href="c11-missing-data.html#cb347-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb347-4"><a href="c11-missing-data.html#cb347-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb347-5"><a href="c11-missing-data.html#cb347-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb347-6"><a href="c11-missing-data.html#cb347-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb347-7"><a href="c11-missing-data.html#cb347-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
+<span id="cb347-8"><a href="c11-missing-data.html#cb347-8" tabindex="-1"></a>  )</span></code></pre></div>
 </div>
 <div id="introduction-9" class="section level2 hasAnchor" number="11.1">
 <h2><span class="header-section-number">11.1</span> Introduction<a href="c11-missing-data.html#introduction-9" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Missing data in surveys refers to situations where participants do not provide complete responses to survey questions. Respondents may not have seen a question by design. Or, they may not respond to a question for various other reasons, such as not wanting to answer a particular question, not understanding the question, or simply forgetting to answer. Missing data is important to consider and account for, as it can introduce bias and reduce the representativeness of the data. This chapter provides an overview of the types of missing data, how to assess missing data in surveys, and how to conduct analysis when missing data is present. Understanding this complex topic can help ensure accurate reporting of survey results and can provide insight into potential changes to the survey design for the future.</p>
+<p>Missing data in surveys refer to situations where participants do not provide complete responses to survey questions. Respondents may not have seen a question by design. Or, they may not respond to a question for various other reasons, such as not wanting to answer a particular question, not understanding the question, or simply forgetting to answer. Missing data are important to consider and account for, as they can introduce bias and reduce the representativeness of the data. This chapter provides an overview of the types of missing data, how to assess missing data in surveys, and how to conduct analysis when missing data are present. Understanding this complex topic can help ensure accurate reporting of survey results and provide insight into potential changes to the survey design for the future.</p>
 </div>
 <div id="missing-data-mechanisms" class="section level2 hasAnchor" number="11.2">
 <h2><span class="header-section-number">11.2</span> Missing data mechanisms<a href="c11-missing-data.html#missing-data-mechanisms" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>There are two main categories that missing data typically fall into: missing by design or unintentional missing data. Missing by design is part of the survey plan and can be more easily incorporated into weights and analyses. Unintentional missing data on the other hand, can lead to bias in survey estimates if not correctly accounted for. Below we provide more information on the types of missing data.</p>
+<p>There are two main categories that missing data typically fall into: missing by design and unintentional missing data. Missing by design is part of the survey plan and can be more easily incorporated into weights and analyses. Unintentional missing data, on the other hand, can lead to bias in survey estimates if not correctly accounted for. Below we provide more information on the types of missing data.</p>
 <ol style="list-style-type: decimal">
 <li><p><strong>Missing by design/questionnaire skip logic</strong>: This type of missingness occurs when certain respondents are intentionally directed to skip specific questions based on their previous responses or characteristics. For example, in a survey about employment, if a respondent indicates that they are not employed, they may be directed to skip questions related to their job responsibilities. Additionally, some surveys randomize questions or modules so that not all participants respond to all questions. In these instances, respondents would have missing data for the modules not randomly assigned to them.</p></li>
 <li><p><strong>Unintentional missing data</strong>: This type of missingness occurs when researchers do not intend for there to be missing data on a particular question, for example, if respondents did not finish the survey or refused to answer individual questions. There are three main types of unintentional missing data that each should be considered and handled differently <span class="citation">(<a href="#ref-mack">Mack, Su, and Westreich 2018</a>; <a href="#ref-Schafer2002">Schafer and Graham 2002</a>)</span>:</p>
 <ol style="list-style-type: lower-alpha">
-<li><p><strong>Missing completely at random (MCAR)</strong>: The missing data is unrelated to both observed and unobserved data, and the probability of being missing is the same across all cases. For example, if a respondent missed a question because they had to leave the survey early due to an emergency.</p></li>
-<li><p><strong>Missing at random (MAR)</strong>: The missing data is related to observed data but not unobserved data, and the probability of being missing is the same within groups. For example, if older respondents choose not to answer specific questions but younger respondents do answer them and we know the respondent’s age.</p></li>
-<li><p><strong>Missing not at random (MNAR)</strong>: The missing data is related to unobserved data, and the probability of being missing varies for reasons we are not measuring. For example, if respondents with depression do not answer a question about depression severity.</p></li>
+<li><p><strong>Missing completely at random (MCAR)</strong>: The missing data are unrelated to both observed and unobserved data, and the probability of being missing is the same across all cases. For example, if a respondent missed a question because they had to leave the survey early due to an emergency.</p></li>
+<li><p><strong>Missing at random (MAR)</strong>: The missing data are related to observed data but not unobserved data, and the probability of being missing is the same within groups. For example, if we know the respondents’ ages and older respondents choose not to answer specific questions but younger respondents do answer them.</p></li>
+<li><p><strong>Missing not at random (MNAR)</strong>: The missing data are related to unobserved data, and the probability of being missing varies for reasons we are not measuring. For example, if respondents with depression do not answer a question about depression severity.</p></li>
 </ol></li>
 </ol>
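+<p>To make the three unintentional mechanisms concrete, the following is a minimal simulation sketch rather than anything drawn from ANES or RECS. The variables (<code>age</code>, <code>income</code>), the missingness probabilities, and the seed are all hypothetical choices made for illustration. In this setup, the mean of the observed cases should stay close to the full-data mean under MCAR, while under MAR and MNAR it typically drifts away from it.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(tidyverse)
+
+set.seed(2024) # arbitrary seed so the sketch is reproducible
+n &lt;- 10000
+
+# Hypothetical population: income rises with age
+sim &lt;- tibble(
+  age = sample(18:90, n, replace = TRUE),
+  income = rnorm(n, mean = 50000 + 500 * (age - 18), sd = 15000)
+) %&gt;%
+  mutate(
+    # MCAR: every case has the same 20% chance of being missing
+    miss_mcar = runif(n) &lt; 0.2,
+    # MAR: missingness depends only on the observed variable age
+    miss_mar = runif(n) &lt; if_else(age &gt;= 65, 0.4, 0.1),
+    # MNAR: missingness depends on the income value itself
+    miss_mnar = runif(n) &lt; if_else(income &gt; 80000, 0.4, 0.1)
+  )
+
+# Compare the full-data mean with the mean of the cases that remain observed
+sim %&gt;%
+  summarize(
+    mean_full = mean(income),
+    mean_mcar = mean(income[!miss_mcar]),
+    mean_mar = mean(income[!miss_mar]),
+    mean_mnar = mean(income[!miss_mnar])
+  )</code></pre></div>
+<p>Because the MAR missingness here depends only on <code>age</code>, comparing means within age groups would be one way to see that the observed cases remain representative within groups. No such adjustment is available for the MNAR case without additional information.</p>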
 </div>
 <div id="assessing-missing-data" class="section level2 hasAnchor" number="11.3">
 <h2><span class="header-section-number">11.3</span> Assessing missing data<a href="c11-missing-data.html#assessing-missing-data" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Before beginning analysis, we should explore the data to determine if there is missing data and what types of missing data are present. Conducting this descriptive analysis can help with analysis and reporting of survey data (see Section <a href="c12-recommendations.html#c12-recommendations">12</a>), and can inform the survey design in future studies. For example, large amounts of unexpected missing data may indicate the questions were unclear or difficult to recall. There are several ways to explore missing data which we walk through below. When assessing the missing data, we recommend using a data.frame object and not the survey object as most of the analysis is about patterns of records and weights are not necessary.</p>
+<p>Before beginning an analysis, we should explore the data to determine whether there are missing data and what types of missing data are present. Conducting this descriptive analysis can help with the analysis and reporting of survey data (see Section <a href="c12-recommendations.html#c12-recommendations">12</a>) and can inform the survey design in future studies. For example, large amounts of unexpected missing data may indicate the questions were unclear or difficult to recall. There are several ways to explore missing data, which we walk through below. When assessing the missing data, we recommend using a data.frame object and not the survey object, as most of the analysis is about patterns of records, and weights are not necessary.</p>
 <div id="summarize-data" class="section level3 hasAnchor" number="11.3.1">
 <h3><span class="header-section-number">11.3.1</span> Summarize data<a href="c11-missing-data.html#summarize-data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>A very rudimentary first exploration is to use the <code>summary()</code> function to summarize the data which will illuminate <code>NA</code> values in the data. Let’s look at a few analytic variables on the ANES 2020 data using <code>summary()</code>:</p>
-<div class="sourceCode" id="cb347"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb347-1"><a href="c11-missing-data.html#cb347-1" tabindex="-1"></a>anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb347-2"><a href="c11-missing-data.html#cb347-2" tabindex="-1"></a>  <span class="fu">select</span>(V202051<span class="sc">:</span>EarlyVote2020) <span class="sc">%&gt;%</span></span>
-<span id="cb347-3"><a href="c11-missing-data.html#cb347-3" tabindex="-1"></a>  <span class="fu">summary</span>()</span></code></pre></div>
+<p>A very rudimentary first exploration is to use the <code>summary()</code> function to summarize the data, which illuminates <code>NA</code> values in the data. Let’s look at a few analytic variables on the ANES 2020 data using <code>summary()</code>:</p>
+<div class="sourceCode" id="cb348"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb348-1"><a href="c11-missing-data.html#cb348-1" tabindex="-1"></a>anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb348-2"><a href="c11-missing-data.html#cb348-2" tabindex="-1"></a>  <span class="fu">select</span>(V202051<span class="sc">:</span>EarlyVote2020) <span class="sc">%&gt;%</span></span>
+<span id="cb348-3"><a href="c11-missing-data.html#cb348-3" tabindex="-1"></a>  <span class="fu">summary</span>()</span></code></pre></div>
 <pre><code>##     V202051                Income7                  Income    
 ##  Min.   :-9.000   $125k or more:1468   Under $9,999    : 647  
 ##  1st Qu.:-1.000   Under $20k   :1076   $50,000-59,999  : 485  
@@ -684,9 +684,9 @@ <h3><span class="header-section-number">11.3.1</span> Summarize data<a href="c11
 ##               
 ##               
 ## </code></pre>
-<p>We see that there are NA values in several of the derived variables (those not beginning with “V”) and negative values in the original variables (those beginning with “V”). We can also use the <code>count()</code> function to get an understanding of the different types of missing data on the original variables. For example, let’s look at the count of data for <code>V202072</code>, which corresponds to our <code>VotedPres2020</code> variable.</p>
-<div class="sourceCode" id="cb349"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb349-1"><a href="c11-missing-data.html#cb349-1" tabindex="-1"></a>anes_2020 <span class="sc">%&gt;%</span> </span>
-<span id="cb349-2"><a href="c11-missing-data.html#cb349-2" tabindex="-1"></a>  <span class="fu">count</span>(VotedPres2020,V202072)</span></code></pre></div>
+<p>We see that there are NA values in several of the derived variables (those not beginning with “V”) and negative values in the original variables (those beginning with “V”.) We can also use the <code>count()</code> function to get an understanding of the different types of missing data on the original variables. For example, let’s look at the count of data for <code>V202072</code>, which corresponds to our <code>VotedPres2020</code> variable.</p>
+<div class="sourceCode" id="cb350"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb350-1"><a href="c11-missing-data.html#cb350-1" tabindex="-1"></a>anes_2020 <span class="sc">%&gt;%</span> </span>
+<span id="cb350-2"><a href="c11-missing-data.html#cb350-2" tabindex="-1"></a>  <span class="fu">count</span>(VotedPres2020,V202072)</span></code></pre></div>
 <pre><code>## # A tibble: 7 × 3
 ##   VotedPres2020 V202072                                   n
 ##   &lt;fct&gt;         &lt;dbl+lbl&gt;                             &lt;int&gt;
@@ -697,36 +697,36 @@ <h3><span class="header-section-number">11.3.1</span> Summarize data<a href="c11
 ## 5 &lt;NA&gt;          -9 [-9. Refused]                          2
 ## 6 &lt;NA&gt;          -6 [-6. No post-election interview]       4
 ## 7 &lt;NA&gt;          -1 [-1. Inapplicable]                  1047</code></pre>
-<p>Here we can see that there are three types of missing data, and that the majority of them fall under the “Inapplicable” category. This is usually a term associated with data missing due to skip patterns and is considered to be missing data by design. Based on the documentation from ANES <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span>, we can see that this question was only asked to respondents who voted in the election.</p>
+<p>Here, we can see that there are three types of missing data, and the majority of them fall under the “Inapplicable” category. This is usually a term associated with data missing due to skip patterns and is considered to be missing data by design. Based on the documentation from ANES <span class="citation">(<a href="#ref-debell">DeBell 2010</a>)</span>, we can see that this question was only asked of respondents who voted in the election.</p>
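+<p>As a rough sketch of how negative codes like these could be grouped before deciding how to treat them, the example below uses a small toy data set. The variable <code>q_vote</code> and its values are hypothetical and only mimic the coding pattern shown above. They are not the ANES data.</p>
+<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(tidyverse)
+
+# Toy data (hypothetical): negative values mimic ANES-style missing codes
+toy &lt;- tibble(q_vote = c(1, 2, -1, -9, 1, -1, -6, 2))
+
+toy_coded &lt;- toy %&gt;%
+  mutate(
+    # Label each record by the reason its value is (or is not) missing
+    missing_type = case_when(
+      q_vote == -9 ~ "Refused",
+      q_vote == -6 ~ "No post-election interview",
+      q_vote == -1 ~ "Inapplicable (skip pattern)",
+      TRUE ~ "Observed"
+    ),
+    # Keep an analysis version where every missing code becomes NA
+    q_vote_clean = if_else(q_vote &lt; 0, NA_real_, q_vote)
+  )
+
+toy_coded %&gt;%
+  count(missing_type)</code></pre></div>
+<p>Keeping the reason for missingness in its own column, as in this sketch, means the distinction between missing by design and unintentional missing data is not lost once the codes are converted to <code>NA</code>.</p>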
 </div>
 <div id="visualization-of-missing-data" class="section level3 hasAnchor" number="11.3.2">
 <h3><span class="header-section-number">11.3.2</span> Visualization of missing data<a href="c11-missing-data.html#visualization-of-missing-data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>It can be challenging to look at tables for every variable, and instead may be more efficient to view missing data in a graphical format to help narrow in on patterns or unique variables. The {naniar} package is very useful in exploring missing data visually. It provides quick graphics to explore the missingness patterns in the data. We can use the <code>vis_miss()</code> function available in both {visdat} and {naniar} packages to view the amount of missing data by variable <span class="citation">(<a href="#ref-visdat2017">Tierney 2017</a>; <a href="#ref-naniar2023">Tierney and Cook 2023</a>)</span>.</p>
-<div class="sourceCode" id="cb351"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb351-1"><a href="c11-missing-data.html#cb351-1" tabindex="-1"></a>anes_2020_derived<span class="ot">&lt;-</span>anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb351-2"><a href="c11-missing-data.html#cb351-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">!</span><span class="fu">starts_with</span>(<span class="st">&quot;V2&quot;</span>),<span class="sc">-</span>CaseID,<span class="sc">-</span>InterviewMode,<span class="sc">-</span>Weight,<span class="sc">-</span>Stratum,<span class="sc">-</span>VarUnit)</span>
-<span id="cb351-3"><a href="c11-missing-data.html#cb351-3" tabindex="-1"></a></span>
-<span id="cb351-4"><a href="c11-missing-data.html#cb351-4" tabindex="-1"></a>anes_2020_derived <span class="sc">%&gt;%</span></span>
-<span id="cb351-5"><a href="c11-missing-data.html#cb351-5" tabindex="-1"></a>  <span class="fu">vis_miss</span>(<span class="at">cluster=</span> <span class="cn">TRUE</span>, <span class="at">show_perc =</span> <span class="cn">FALSE</span>) <span class="sc">+</span></span>
-<span id="cb351-6"><a href="c11-missing-data.html#cb351-6" tabindex="-1"></a>  <span class="fu">scale_fill_manual</span>(<span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>,<span class="dv">1</span>)], </span>
-<span id="cb351-7"><a href="c11-missing-data.html#cb351-7" tabindex="-1"></a>                    <span class="at">labels =</span> <span class="fu">c</span>(<span class="st">&quot;Present&quot;</span>,<span class="st">&quot;Missing&quot;</span>),</span>
-<span id="cb351-8"><a href="c11-missing-data.html#cb351-8" tabindex="-1"></a>                    <span class="at">name =</span> <span class="st">&quot;&quot;</span>)</span></code></pre></div>
+<p>It can be challenging to look at tables for every variable; it may instead be more efficient to view missing data in a graphical format to help narrow in on patterns or unique variables. The {naniar} package is very useful in exploring missing data visually. We can use the <code>vis_miss()</code> function available in both {visdat} and {naniar} packages to view the amount of missing data by variable (see Figure <a href="c11-missing-data.html#fig:missing-anes-vismiss">11.1</a>) <span class="citation">(<a href="#ref-visdattierney">Tierney 2017</a>; <a href="#ref-naniar2023">Tierney and Cook 2023</a>)</span>.</p>
+<div class="sourceCode" id="cb352"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb352-1"><a href="c11-missing-data.html#cb352-1" tabindex="-1"></a>anes_2020_derived<span class="ot">&lt;-</span>anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb352-2"><a href="c11-missing-data.html#cb352-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">!</span><span class="fu">starts_with</span>(<span class="st">&quot;V2&quot;</span>),<span class="sc">-</span>CaseID,<span class="sc">-</span>InterviewMode,<span class="sc">-</span>Weight,<span class="sc">-</span>Stratum,<span class="sc">-</span>VarUnit)</span>
+<span id="cb352-3"><a href="c11-missing-data.html#cb352-3" tabindex="-1"></a></span>
+<span id="cb352-4"><a href="c11-missing-data.html#cb352-4" tabindex="-1"></a>anes_2020_derived <span class="sc">%&gt;%</span></span>
+<span id="cb352-5"><a href="c11-missing-data.html#cb352-5" tabindex="-1"></a>  <span class="fu">vis_miss</span>(<span class="at">cluster=</span> <span class="cn">TRUE</span>, <span class="at">show_perc =</span> <span class="cn">FALSE</span>) <span class="sc">+</span></span>
+<span id="cb352-6"><a href="c11-missing-data.html#cb352-6" tabindex="-1"></a>  <span class="fu">scale_fill_manual</span>(<span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>,<span class="dv">1</span>)], </span>
+<span id="cb352-7"><a href="c11-missing-data.html#cb352-7" tabindex="-1"></a>                    <span class="at">labels =</span> <span class="fu">c</span>(<span class="st">&quot;Present&quot;</span>,<span class="st">&quot;Missing&quot;</span>),</span>
+<span id="cb352-8"><a href="c11-missing-data.html#cb352-8" tabindex="-1"></a>                    <span class="at">name =</span> <span class="st">&quot;&quot;</span>)</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:missing-anes-vismiss"></span>
 <img src="bookdown_files/figure-html/missing-anes-vismiss-1.png" alt="This chart shows the missingness of the selected variables where missing is highlighted in a dark color. Each row of the plot is an observation and each column is a variable. There are some patterns observed such as a large block of missing for `VotedPres2016_selection` and many of the same respondents also having missing for `VotedPres2020_selection`." width="672" />
 <p class="caption">
 FIGURE 11.1: Visual depiction of missing data in the ANES 2020 data
 </p>
 </div>
-<p>From this visualization, we can start to get a picture of what questions may be related to each other in terms of missing data. Even if we did not have the informative variable names, we could be able to deduce that <code>VotedPres2020</code>, <code>VotedPres2020_selection</code>, and <code>EarlyVote2020</code> are likely related since their missing data patterns are similar.</p>
-<p>Additionally, we can also look at <code>VotedPres2016_selection</code> and see that there is a lot of missing data in that variable. Most likely this is due to a skip pattern, and we can look at further graphics to see how it might be related to other variables. The {naniar} package has multiple visualization functions that can help dive deeper such as the <code>gg_miss_fct()</code> function which looks at missing data for all variables by levels of another variable.</p>
-<div class="sourceCode" id="cb352"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb352-1"><a href="c11-missing-data.html#cb352-1" tabindex="-1"></a>anes_2020_derived <span class="sc">%&gt;%</span> </span>
-<span id="cb352-2"><a href="c11-missing-data.html#cb352-2" tabindex="-1"></a>  <span class="fu">gg_miss_fct</span>(VotedPres2016) <span class="sc">+</span></span>
-<span id="cb352-3"><a href="c11-missing-data.html#cb352-3" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
-<span id="cb352-4"><a href="c11-missing-data.html#cb352-4" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
-<span id="cb352-5"><a href="c11-missing-data.html#cb352-5" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;% Miss&quot;</span>,</span>
-<span id="cb352-6"><a href="c11-missing-data.html#cb352-6" tabindex="-1"></a>    <span class="at">colors =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>, <span class="dv">2</span>, <span class="dv">1</span>)] </span>
-<span id="cb352-7"><a href="c11-missing-data.html#cb352-7" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb352-8"><a href="c11-missing-data.html#cb352-8" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Variable&quot;</span>) <span class="sc">+</span></span>
-<span id="cb352-9"><a href="c11-missing-data.html#cb352-9" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Voted for President in 2016&quot;</span>)</span></code></pre></div>
+<p>From the visualization in Figure <a href="c11-missing-data.html#fig:missing-anes-vismiss">11.1</a>, we can start to get a picture of what questions may be connected to each other in terms of missing data. Even if we did not have the informative variable names, we could deduce that <code>VotedPres2020</code>, <code>VotedPres2020_selection</code>, and <code>EarlyVote2020</code> are likely connected since their missing data patterns are similar.</p>
+<p>Additionally, we can also look at <code>VotedPres2016_selection</code> and see that there is a lot of missing data in that variable. The missing data are likely due to a skip pattern, and we can look at other graphics to see how they relate to other variables. The {naniar} package has multiple visualization functions that can help dive deeper, such as the <code>gg_miss_fct()</code> function, which looks at missing data for all variables by levels of another variable (see Figure <a href="c11-missing-data.html#fig:missing-anes-ggmissfct">11.2</a>.)</p>
+<div class="sourceCode" id="cb353"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb353-1"><a href="c11-missing-data.html#cb353-1" tabindex="-1"></a>anes_2020_derived <span class="sc">%&gt;%</span> </span>
+<span id="cb353-2"><a href="c11-missing-data.html#cb353-2" tabindex="-1"></a>  <span class="fu">gg_miss_fct</span>(VotedPres2016) <span class="sc">+</span></span>
+<span id="cb353-3"><a href="c11-missing-data.html#cb353-3" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
+<span id="cb353-4"><a href="c11-missing-data.html#cb353-4" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
+<span id="cb353-5"><a href="c11-missing-data.html#cb353-5" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;% Miss&quot;</span>,</span>
+<span id="cb353-6"><a href="c11-missing-data.html#cb353-6" tabindex="-1"></a>    <span class="at">colors =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>, <span class="dv">2</span>, <span class="dv">1</span>)] </span>
+<span id="cb353-7"><a href="c11-missing-data.html#cb353-7" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb353-8"><a href="c11-missing-data.html#cb353-8" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Variable&quot;</span>) <span class="sc">+</span></span>
+<span id="cb353-9"><a href="c11-missing-data.html#cb353-9" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Voted for President in 2016&quot;</span>)</span></code></pre></div>
 <pre><code>## Scale for fill is already present.
 ## Adding another scale for fill, which will replace the existing scale.</code></pre>
 <div class="figure"><span style="display:block;" id="fig:missing-anes-ggmissfct"></span>
@@ -735,17 +735,17 @@ <h3><span class="header-section-number">11.3.2</span> Visualization of missing d
 FIGURE 11.2: Missingness in variables for each level of <code>VotedPres2016</code> in the ANES 2020 data
 </p>
 </div>
-<p>In this case, we can see that if they did not vote for president in 2016 or did not answer that question, then they were not asked about who they voted for in 2016 (the percentage of missing data if 100%). Additionally, we can see with this graphic, that there is more missing data across all questions if they did not provide an answer to <code>VotedPres2016</code>.</p>
-<p>There are other graphics that work well with numeric data. For example, in the RECS 2020 data we can plot two continuous variables and the missing data associated with it to see if there are any patterns to the missingness. To do this, we can use the <code>bind_shadow()</code> function from the {naniar} package. This creates a <strong>nabular</strong> (combination of “na” with “tabular”), which features the original columns followed by the same number of columns with a specific <code>NA</code> format. These <code>NA</code> columns are indicators of if the value in the original data is missing or not. The example printed below shows how most levels of <code>HeatingBehavior</code> are not missing <code>!NA</code> in the NA variable of <code>HeatingBehavior_NA</code>, but those missing in <code>HeatingBehavior</code> are also missing in <code>HeatingBehavior_NA</code>.</p>
-<div class="sourceCode" id="cb354"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb354-1"><a href="c11-missing-data.html#cb354-1" tabindex="-1"></a>recs_2020_shadow <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span> </span>
-<span id="cb354-2"><a href="c11-missing-data.html#cb354-2" tabindex="-1"></a>  <span class="fu">bind_shadow</span>()</span>
-<span id="cb354-3"><a href="c11-missing-data.html#cb354-3" tabindex="-1"></a></span>
-<span id="cb354-4"><a href="c11-missing-data.html#cb354-4" tabindex="-1"></a><span class="fu">ncol</span>(recs_2020)</span></code></pre></div>
+<p>In Figure <a href="c11-missing-data.html#fig:missing-anes-ggmissfct">11.2</a>, we can see that if respondents did not vote for president in 2016 or did not answer that question, then they were not asked about who they voted for in 2016 (the percentage of missing data is 100%). Additionally, Figure <a href="c11-missing-data.html#fig:missing-anes-ggmissfct">11.2</a> shows that there are more missing data across all questions among respondents who did not provide an answer to <code>VotedPres2016</code>.</p>
+<p>There are other visualizations that work well with numeric data. For example, in the RECS 2020 data, we can plot two continuous variables and the missing data associated with them to see if there are any patterns in the missingness. To do this, we can use the <code>bind_shadow()</code> function from the {naniar} package. This creates a <strong>nabular</strong> (combination of “na” with “tabular”), which features the original columns followed by the same number of shadow columns, each named with an <code>_NA</code> suffix. These shadow columns indicate whether the value in the original data is missing (<code>NA</code>) or not (<code>!NA</code>). The example printed below shows that every non-missing level of <code>HeatingBehavior</code> is marked <code>!NA</code> in the shadow variable <code>HeatingBehavior_NA</code>, while the rows missing <code>HeatingBehavior</code> are marked <code>NA</code>.</p>
+<div class="sourceCode" id="cb355"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb355-1"><a href="c11-missing-data.html#cb355-1" tabindex="-1"></a>recs_2020_shadow <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span> </span>
+<span id="cb355-2"><a href="c11-missing-data.html#cb355-2" tabindex="-1"></a>  <span class="fu">bind_shadow</span>()</span>
+<span id="cb355-3"><a href="c11-missing-data.html#cb355-3" tabindex="-1"></a></span>
+<span id="cb355-4"><a href="c11-missing-data.html#cb355-4" tabindex="-1"></a><span class="fu">ncol</span>(recs_2020)</span></code></pre></div>
 <pre><code>## [1] 118</code></pre>
-<div class="sourceCode" id="cb356"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb356-1"><a href="c11-missing-data.html#cb356-1" tabindex="-1"></a><span class="fu">ncol</span>(recs_2020_shadow)</span></code></pre></div>
+<div class="sourceCode" id="cb357"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb357-1"><a href="c11-missing-data.html#cb357-1" tabindex="-1"></a><span class="fu">ncol</span>(recs_2020_shadow)</span></code></pre></div>
 <pre><code>## [1] 236</code></pre>
-<div class="sourceCode" id="cb358"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb358-1"><a href="c11-missing-data.html#cb358-1" tabindex="-1"></a>recs_2020_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb358-2"><a href="c11-missing-data.html#cb358-2" tabindex="-1"></a>  <span class="fu">count</span>(HeatingBehavior,HeatingBehavior_NA)</span></code></pre></div>
+<div class="sourceCode" id="cb359"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb359-1"><a href="c11-missing-data.html#cb359-1" tabindex="-1"></a>recs_2020_shadow <span class="sc">%&gt;%</span></span>
+<span id="cb359-2"><a href="c11-missing-data.html#cb359-2" tabindex="-1"></a>  <span class="fu">count</span>(HeatingBehavior, HeatingBehavior_NA)</span></code></pre></div>
 <pre><code>## # A tibble: 7 × 3
 ##   HeatingBehavior                               HeatingBehavior_NA     n
 ##   &lt;fct&gt;                                         &lt;fct&gt;              &lt;int&gt;
@@ -756,18 +756,20 @@ <h3><span class="header-section-number">11.3.2</span> Visualization of missing d
 ## 5 No control                                    !NA                  438
 ## 6 Other                                         !NA                   46
 ## 7 &lt;NA&gt;                                          NA                   751</code></pre>
-<p>We can then use these new variables to plot the missing data along side the actual data. For example, let’s plot a histogram of the total electric bill grouped by those that are missing and not missing by heating behavior.</p>
-<div class="sourceCode" id="cb360"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb360-1"><a href="c11-missing-data.html#cb360-1" tabindex="-1"></a>recs_2020_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb360-2"><a href="c11-missing-data.html#cb360-2" tabindex="-1"></a>  <span class="fu">filter</span>(TOTALDOL <span class="sc">&lt;</span> <span class="dv">5000</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb360-3"><a href="c11-missing-data.html#cb360-3" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x=</span>TOTALDOL,<span class="at">fill=</span>HeatingBehavior_NA)) <span class="sc">+</span></span>
-<span id="cb360-4"><a href="c11-missing-data.html#cb360-4" tabindex="-1"></a>  <span class="fu">geom_histogram</span>() <span class="sc">+</span></span>
-<span id="cb360-5"><a href="c11-missing-data.html#cb360-5" tabindex="-1"></a>  <span class="fu">scale_fill_manual</span>(<span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>, <span class="dv">1</span>)],</span>
-<span id="cb360-6"><a href="c11-missing-data.html#cb360-6" tabindex="-1"></a>                    <span class="at">labels =</span> <span class="fu">c</span>(<span class="st">&quot;Present&quot;</span>, <span class="st">&quot;Missing&quot;</span>),</span>
-<span id="cb360-7"><a href="c11-missing-data.html#cb360-7" tabindex="-1"></a>                    <span class="at">name =</span> <span class="st">&quot;Heating Behavior&quot;</span>) <span class="sc">+</span></span>
-<span id="cb360-8"><a href="c11-missing-data.html#cb360-8" tabindex="-1"></a>  <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
-<span id="cb360-9"><a href="c11-missing-data.html#cb360-9" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Total Energy Cost (Truncated at $5000)&quot;</span>) <span class="sc">+</span></span>
-<span id="cb360-10"><a href="c11-missing-data.html#cb360-10" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Number of Households&quot;</span>) <span class="sc">+</span></span>
-<span id="cb360-11"><a href="c11-missing-data.html#cb360-11" tabindex="-1"></a>  <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">&quot;Histogram of Energy Cost by Heating Behavior Missing Data&quot;</span>)</span></code></pre></div>
+<p>We can then use these new variables to plot the missing data alongside the actual data. For example, let’s plot a histogram of the total energy cost, with households grouped by whether heating behavior is missing or not (see Figure <a href="c11-missing-data.html#fig:missing-recs-hist">11.3</a>).</p>
+<div class="sourceCode" id="cb361"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb361-1"><a href="c11-missing-data.html#cb361-1" tabindex="-1"></a>recs_2020_shadow <span class="sc">%&gt;%</span></span>
+<span id="cb361-2"><a href="c11-missing-data.html#cb361-2" tabindex="-1"></a>  <span class="fu">filter</span>(TOTALDOL <span class="sc">&lt;</span> <span class="dv">5000</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb361-3"><a href="c11-missing-data.html#cb361-3" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> TOTALDOL, <span class="at">fill =</span> HeatingBehavior_NA)) <span class="sc">+</span></span>
+<span id="cb361-4"><a href="c11-missing-data.html#cb361-4" tabindex="-1"></a>  <span class="fu">geom_histogram</span>() <span class="sc">+</span></span>
+<span id="cb361-5"><a href="c11-missing-data.html#cb361-5" tabindex="-1"></a>  <span class="fu">scale_fill_manual</span>(</span>
+<span id="cb361-6"><a href="c11-missing-data.html#cb361-6" tabindex="-1"></a>    <span class="at">values =</span> book_colors[<span class="fu">c</span>(<span class="dv">3</span>, <span class="dv">1</span>)],</span>
+<span id="cb361-7"><a href="c11-missing-data.html#cb361-7" tabindex="-1"></a>    <span class="at">labels =</span> <span class="fu">c</span>(<span class="st">&quot;Present&quot;</span>, <span class="st">&quot;Missing&quot;</span>),</span>
+<span id="cb361-8"><a href="c11-missing-data.html#cb361-8" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Heating Behavior&quot;</span></span>
+<span id="cb361-9"><a href="c11-missing-data.html#cb361-9" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb361-10"><a href="c11-missing-data.html#cb361-10" tabindex="-1"></a>  <span class="fu">theme_minimal</span>() <span class="sc">+</span></span>
+<span id="cb361-11"><a href="c11-missing-data.html#cb361-11" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Total Energy Cost (Truncated at $5000)&quot;</span>) <span class="sc">+</span></span>
+<span id="cb361-12"><a href="c11-missing-data.html#cb361-12" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Number of Households&quot;</span>) <span class="sc">+</span></span>
+<span id="cb361-13"><a href="c11-missing-data.html#cb361-13" tabindex="-1"></a>  <span class="fu">ggtitle</span>(<span class="st">&quot;Histogram of Energy Cost by Heating Behavior Missing Data&quot;</span>)</span></code></pre></div>
 <pre><code>## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.</code></pre>
 <div class="figure"><span style="display:block;" id="fig:missing-recs-hist"></span>
 <img src="bookdown_files/figure-html/missing-recs-hist-1.png" alt="This chart has title 'Histogram of Energy Cost by Heating Behavior Missing Data'. It has x-axis 'Total Energy Cost (Truncated at $5000)' with labels 0, 1000, 2000, 3000, 4000 and 5000. It has y-axis 'Number of Households' with labels 0, 500, 1000 and 1500. There is a legend indicating fill is used to show HeatingBehavior_NA, with 2 levels: !NA shown as very pale blue fill and  NA shown as dark blue fill. The chart is a bar chart with 30 vertical bars. These are stacked, as sorted by HeatingBehavior_NA." width="672" />
@@ -775,36 +777,38 @@ <h3><span class="header-section-number">11.3.2</span> Visualization of missing d
 FIGURE 11.3: Histogram of Energy Cost by Heating Behavior Missing Data
 </p>
 </div>
-<p>This plot indicates that respondents who did not provide a response for the heating behavior question may have a different distribution of total energy cost compared to respondents who did provide a response. This view of the raw data and missingness could indicate some bias in the data. Researchers take these different bias aspects into account when calculating weights and we need to make sure that the weights are incorporated when analyzing the data.</p>
+<p>Figure <a href="c11-missing-data.html#fig:missing-recs-hist">11.3</a> indicates that respondents who did not provide a response for the heating behavior question may have a different distribution of total energy cost compared to respondents who did provide a response. This view of the raw data and missingness could indicate some bias in the data. Researchers take these different bias aspects into account when calculating weights, and we need to make sure that we incorporate the weights when analyzing the data.</p>
 <p>There are many other visualizations that can be helpful in reviewing the data, and we recommend reviewing the {naniar} documentation for more information <span class="citation">(<a href="#ref-naniar2023">Tierney and Cook 2023</a>)</span>.</p>
 </div>
 </div>
 <div id="analysis-with-missing-data" class="section level2 hasAnchor" number="11.4">
 <h2><span class="header-section-number">11.4</span> Analysis with missing data<a href="c11-missing-data.html#analysis-with-missing-data" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Once we understand the types of missingness, we can begin the analysis of the data. Different missingness types may be handled in different ways. In most publicly available datasets, researchers will have already calculated weights and imputed missing values if deemed necessary. Those interested in learning more about how to calculate weights and impute data for different missing data mechanisms, we recommended <span class="citation">Kim and Shao (<a href="#ref-Kim2021">2021</a>)</span> and <span class="citation">Valliant and Dever (<a href="#ref-Valliant2018weights">2018</a>)</span>.</p>
-<p>Even with weights and imputation, missing data will still most likely exist in the data and need to be accounted for in analysis. This section provides an overview on how to recode missing data in R, and how to account for skip patterns in analysis.</p>
+<p>Once we understand the types of missingness, we can begin the analysis of the data. Different missingness types may be handled in different ways. In most publicly available datasets, researchers have already calculated weights and imputed missing values if necessary. For those interested in learning more about how to calculate weights and impute data for different missing data mechanisms, we recommend <span class="citation">Kim and Shao (<a href="#ref-Kim2021">2021</a>)</span> and <span class="citation">Valliant and Dever (<a href="#ref-Valliant2018weights">2018</a>)</span>.</p>
+<p>Even with weights and imputation, missing data are most likely still present and need to be accounted for in analysis. This section provides an overview of how to recode missing data in R and how to account for skip patterns in analysis.</p>
 <div id="recoding-missing-data" class="section level3 hasAnchor" number="11.4.1">
 <h3><span class="header-section-number">11.4.1</span> Recoding missing data<a href="c11-missing-data.html#recoding-missing-data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Even within a variable, there can be different reasons for missing data. In publicly released data negative values are often present to provide different meaning for values. For example, in the ANES 2020 data they have the following negative values to represent different types of missing data:
-* -9: Refused
-* -8: Don’t Know
-* -7: No post-election data, deleted due to incomplete interview
-* -6: No post-election interview
-* -5: Interview breakoff (sufficient partial IW)
-* -4: Technical error
-* -3: Restricted
-* -2: Other missing reason (question specific)
-* -1: Inapplicable</p>
-<p>When we created the derived variables for use in this book, we coded all negative values as <code>NA</code> and proceeded to analyze the data. For most cases this is an appropriate approach as long as you filter the data appropriately to account for skip patterns (see next section). However, the {naniar} package does have the option to code special missing values. For example, if we wanted to have two <code>NA</code> values, one that indicated the question was missing by design (e.g., due to skip patterns) and one for the other missing categories we can use the <code>nabular</code> format to incorporate these with the <code>recode_shadow()</code> function.</p>
-<div class="sourceCode" id="cb362"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb362-1"><a href="c11-missing-data.html#cb362-1" tabindex="-1"></a>anes_2020_shadow<span class="ot">&lt;-</span>anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb362-2"><a href="c11-missing-data.html#cb362-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="fu">starts_with</span>(<span class="st">&quot;V2&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb362-3"><a href="c11-missing-data.html#cb362-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="fu">across</span>(<span class="fu">everything</span>(),<span class="sc">~</span><span class="fu">case_when</span>(.x <span class="sc">&lt;</span> <span class="sc">-</span><span class="dv">1</span> <span class="sc">~</span> <span class="cn">NA</span>,</span>
-<span id="cb362-4"><a href="c11-missing-data.html#cb362-4" tabindex="-1"></a>                                        <span class="cn">TRUE</span><span class="sc">~</span>.x))) <span class="sc">%&gt;%</span> </span>
-<span id="cb362-5"><a href="c11-missing-data.html#cb362-5" tabindex="-1"></a>  <span class="fu">bind_shadow</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb362-6"><a href="c11-missing-data.html#cb362-6" tabindex="-1"></a>  <span class="fu">recode_shadow</span>(<span class="at">V201103 =</span> <span class="fu">.where</span>(V201103<span class="sc">==-</span><span class="dv">1</span><span class="sc">~</span><span class="st">&quot;skip&quot;</span>))</span>
-<span id="cb362-7"><a href="c11-missing-data.html#cb362-7" tabindex="-1"></a></span>
-<span id="cb362-8"><a href="c11-missing-data.html#cb362-8" tabindex="-1"></a>anes_2020_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb362-9"><a href="c11-missing-data.html#cb362-9" tabindex="-1"></a>  <span class="fu">count</span>(V201103,V201103_NA)</span></code></pre></div>
+<p>Even within a variable, there can be different reasons for missing data. In publicly released data, negative values are often used to encode the different reasons a value is missing. For example, the ANES 2020 data use the following negative values to represent different types of missing data:</p>
+<ul>
+<li>-9: Refused</li>
+<li>-8: Don’t Know</li>
+<li>-7: No post-election data, deleted due to incomplete interview</li>
+<li>-6: No post-election interview</li>
+<li>-5: Interview breakoff (sufficient partial IW)</li>
+<li>-4: Technical error</li>
+<li>-3: Restricted</li>
+<li>-2: Other missing reason (question specific)</li>
+<li>-1: Inapplicable</li>
+</ul>
+<p>When we created the derived variables for use in this book, we coded all negative values as <code>NA</code> and proceeded to analyze the data. For most cases, this is an appropriate approach as long as we filter the data appropriately to account for skip patterns (see Section <a href="c11-missing-data.html#missing-skip-patt">11.4.2</a>). However, the {naniar} package does have the option to code special missing values. For example, if we wanted to have two <code>NA</code> values, one indicating that the question was missing by design (e.g., due to skip patterns) and one for the other missing categories, we could use the <code>nabular</code> format to incorporate these with the <code>recode_shadow()</code> function.</p>
+<div class="sourceCode" id="cb363"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb363-1"><a href="c11-missing-data.html#cb363-1" tabindex="-1"></a>anes_2020_shadow <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb363-2"><a href="c11-missing-data.html#cb363-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="fu">starts_with</span>(<span class="st">&quot;V2&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb363-3"><a href="c11-missing-data.html#cb363-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="fu">across</span>(<span class="fu">everything</span>(), <span class="sc">~</span> <span class="fu">case_when</span>(.x <span class="sc">&lt;</span> <span class="sc">-</span><span class="dv">1</span> <span class="sc">~</span> <span class="cn">NA</span>,</span>
+<span id="cb363-4"><a href="c11-missing-data.html#cb363-4" tabindex="-1"></a>                                          <span class="cn">TRUE</span> <span class="sc">~</span> .x))) <span class="sc">%&gt;%</span></span>
+<span id="cb363-5"><a href="c11-missing-data.html#cb363-5" tabindex="-1"></a>  <span class="fu">bind_shadow</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb363-6"><a href="c11-missing-data.html#cb363-6" tabindex="-1"></a>  <span class="fu">recode_shadow</span>(<span class="at">V201103 =</span> <span class="fu">.where</span>(V201103 <span class="sc">==</span> <span class="sc">-</span><span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;skip&quot;</span>))</span>
+<span id="cb363-7"><a href="c11-missing-data.html#cb363-7" tabindex="-1"></a></span>
+<span id="cb363-8"><a href="c11-missing-data.html#cb363-8" tabindex="-1"></a>anes_2020_shadow <span class="sc">%&gt;%</span></span>
+<span id="cb363-9"><a href="c11-missing-data.html#cb363-9" tabindex="-1"></a>  <span class="fu">count</span>(V201103, V201103_NA)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   V201103                 V201103_NA     n
 ##   &lt;dbl+lbl&gt;               &lt;fct&gt;      &lt;int&gt;
@@ -813,20 +817,20 @@ <h3><span class="header-section-number">11.4.1</span> Recoding missing data<a hr
 ## 3  2 [2. Donald Trump]    !NA         2466
 ## 4  5 [5. Other {SPECIFY}] !NA          390
 ## 5 NA                      NA            43</code></pre>
-<p>However it is important to note that at the time of publication, there is no easy way to implement <code>recode_shadow()</code> to multiple variables at once (e.g., we cannot use the tidyverse feature of <code>across()</code>). The example code above only implements this for a single variable, so this would have to be done to all variables of interest manually or in a loop.</p>
+<p>However, it is important to note that at the time of publication, there is no easy way to apply <code>recode_shadow()</code> to multiple variables at once (e.g., we cannot use the tidyverse function <code>across()</code>). The example code above only recodes a single variable, so this would have to be done manually or in a loop for all variables of interest.</p>
 </div>
-<div id="accounting-for-skip-patterns" class="section level3 hasAnchor" number="11.4.2">
-<h3><span class="header-section-number">11.4.2</span> Accounting for skip patterns<a href="c11-missing-data.html#accounting-for-skip-patterns" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>When questions are skipped by design in a survey, it is meaningful that the data is later missing. For example the RECS survey asks people how they control the heat in their home in the winter (<code>HeatingBehavior</code>). This is only among those who have heat in their home (<code>SpaceHeatingUsed</code>). If no there is no heating equipment used, the value of <code>HeatingBehavior</code> is missing. One has several choices when analyzing this data which include 1) only including those with a valid value of <code>HeatingBehavior</code> and specifying the universe as those with heat or 2) including those who do not have heat. It is important to specify what population an analysis generalizes to.</p>
-<p>Here is example code where we only include those with a valid value of <code>HeatingBehavior</code> (choice 1). Note that we use the design object (<code>recs_des</code>) then filter to those that are not missing on <code>HeatingBehavior</code>.</p>
-<div class="sourceCode" id="cb364"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb364-1"><a href="c11-missing-data.html#cb364-1" tabindex="-1"></a>heat_cntl_1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb364-2"><a href="c11-missing-data.html#cb364-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(HeatingBehavior)) <span class="sc">%&gt;%</span></span>
-<span id="cb364-3"><a href="c11-missing-data.html#cb364-3" tabindex="-1"></a>  <span class="fu">group_by</span>(HeatingBehavior) <span class="sc">%&gt;%</span></span>
-<span id="cb364-4"><a href="c11-missing-data.html#cb364-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb364-5"><a href="c11-missing-data.html#cb364-5" tabindex="-1"></a>    <span class="at">p=</span><span class="fu">survey_prop</span>()</span>
-<span id="cb364-6"><a href="c11-missing-data.html#cb364-6" tabindex="-1"></a>  )</span>
-<span id="cb364-7"><a href="c11-missing-data.html#cb364-7" tabindex="-1"></a></span>
-<span id="cb364-8"><a href="c11-missing-data.html#cb364-8" tabindex="-1"></a>heat_cntl_1</span></code></pre></div>
+<div id="missing-skip-patt" class="section level3 hasAnchor" number="11.4.2">
+<h3><span class="header-section-number">11.4.2</span> Accounting for skip patterns<a href="c11-missing-data.html#missing-skip-patt" class="anchor-section" aria-label="Anchor link to header"></a></h3>
+<p>When questions are skipped by design in a survey, it is meaningful that the data are later missing. For example, the RECS survey asks people how they control the heat in their homes in the winter (<code>HeatingBehavior</code>). This question is only asked of those who have heat in their home (<code>SpaceHeatingUsed</code>). If no heating equipment is used, the value of <code>HeatingBehavior</code> is missing. We have several choices when analyzing these data, which include 1) only including those with a valid value of <code>HeatingBehavior</code> and specifying the universe as those with heat, and 2) including those who do not have heat. It is important to specify what population an analysis generalizes to.</p>
+<p>Here is an example where we only include those with a valid value of <code>HeatingBehavior</code> (choice 1). Note that we start from the design object (<code>recs_des</code>) and then filter to those that are not missing on <code>HeatingBehavior</code>.</p>
+<div class="sourceCode" id="cb365"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb365-1"><a href="c11-missing-data.html#cb365-1" tabindex="-1"></a>heat_cntl_1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb365-2"><a href="c11-missing-data.html#cb365-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(HeatingBehavior)) <span class="sc">%&gt;%</span></span>
+<span id="cb365-3"><a href="c11-missing-data.html#cb365-3" tabindex="-1"></a>  <span class="fu">group_by</span>(HeatingBehavior) <span class="sc">%&gt;%</span></span>
+<span id="cb365-4"><a href="c11-missing-data.html#cb365-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb365-5"><a href="c11-missing-data.html#cb365-5" tabindex="-1"></a>    <span class="at">p =</span> <span class="fu">survey_prop</span>()</span>
+<span id="cb365-6"><a href="c11-missing-data.html#cb365-6" tabindex="-1"></a>  )</span>
+<span id="cb365-7"><a href="c11-missing-data.html#cb365-7" tabindex="-1"></a></span>
+<span id="cb365-8"><a href="c11-missing-data.html#cb365-8" tabindex="-1"></a>heat_cntl_1</span></code></pre></div>
 <pre><code>## # A tibble: 6 × 3
 ##   HeatingBehavior                                              p    p_se
 ##   &lt;fct&gt;                                                    &lt;dbl&gt;   &lt;dbl&gt;
@@ -836,14 +840,14 @@ <h3><span class="header-section-number">11.4.2</span> Accounting for skip patter
 ## 4 Turn on or off as needed                               0.102   2.89e-3
 ## 5 No control                                             0.0333  1.70e-3
 ## 6 Other                                                  0.00208 3.59e-4</code></pre>
-<p>Here is example code where we include those that do not have heat (choice 2). To help understand what we are looking at we have included the output to show both variables <code>SpaceHeatingUsed</code> and <code>HeatingBehavior</code>.</p>
-<div class="sourceCode" id="cb366"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb366-1"><a href="c11-missing-data.html#cb366-1" tabindex="-1"></a>heat_cntl_2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb366-2"><a href="c11-missing-data.html#cb366-2" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(SpaceHeatingUsed, HeatingBehavior)) <span class="sc">%&gt;%</span></span>
-<span id="cb366-3"><a href="c11-missing-data.html#cb366-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb366-4"><a href="c11-missing-data.html#cb366-4" tabindex="-1"></a>    <span class="at">p=</span><span class="fu">survey_prop</span>()</span>
-<span id="cb366-5"><a href="c11-missing-data.html#cb366-5" tabindex="-1"></a>  )</span>
-<span id="cb366-6"><a href="c11-missing-data.html#cb366-6" tabindex="-1"></a></span>
-<span id="cb366-7"><a href="c11-missing-data.html#cb366-7" tabindex="-1"></a>heat_cntl_2</span></code></pre></div>
+<p>Here is an example where we include those that do not have heat (choice 2). To help understand what we are looking at, we have included the output to show both variables, <code>SpaceHeatingUsed</code> and <code>HeatingBehavior</code>.</p>
+<div class="sourceCode" id="cb367"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb367-1"><a href="c11-missing-data.html#cb367-1" tabindex="-1"></a>heat_cntl_2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb367-2"><a href="c11-missing-data.html#cb367-2" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(SpaceHeatingUsed, HeatingBehavior)) <span class="sc">%&gt;%</span></span>
+<span id="cb367-3"><a href="c11-missing-data.html#cb367-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb367-4"><a href="c11-missing-data.html#cb367-4" tabindex="-1"></a>    <span class="at">p =</span> <span class="fu">survey_prop</span>()</span>
+<span id="cb367-5"><a href="c11-missing-data.html#cb367-5" tabindex="-1"></a>  )</span>
+<span id="cb367-6"><a href="c11-missing-data.html#cb367-6" tabindex="-1"></a></span>
+<span id="cb367-7"><a href="c11-missing-data.html#cb367-7" tabindex="-1"></a>heat_cntl_2</span></code></pre></div>
 <pre><code>## # A tibble: 7 × 4
 ##   SpaceHeatingUsed HeatingBehavior                             p    p_se
 ##   &lt;lgl&gt;            &lt;fct&gt;                                   &lt;dbl&gt;   &lt;dbl&gt;
@@ -854,26 +858,26 @@ <h3><span class="header-section-number">11.4.2</span> Accounting for skip patter
 ## 5 TRUE             Turn on or off as needed              0.0976  2.79e-3
 ## 6 TRUE             No control                            0.0317  1.62e-3
 ## 7 TRUE             Other                                 0.00198 3.41e-4</code></pre>
-<p>If we ran the first analysis, we would say that 16.8% <strong>of households with heat</strong> use a programmable or smart thermostat for the heating of their home. While if we used the results from the second analysis, we could say that 16% of households use a programmable or smart thermostat for the heating of their home. The distinction of the two statements is bolded for emphasis. Skip patterns often change the universe that we are talking about and need to be carefully examined.</p>
-<p>Filtering to the correct universe is important when handling these types of missing data. The <code>nabular</code> we created above can also help with this. If we have <code>NA_skip</code> values in the shadow, we can make sure that we filter out all of these values and only include relevant missing. To do this with survey data we could first create the <code>nabular</code>, then create the design object on that data, and then use the shadow variables to assist with filtering the data. Let’s use the <code>nabular</code> we created above for ANES 2020 (<code>anes_2020_shadow</code>) to create the design object.</p>
-<div class="sourceCode" id="cb368"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb368-1"><a href="c11-missing-data.html#cb368-1" tabindex="-1"></a>anes_adjwgt_shadow <span class="ot">&lt;-</span> anes_2020_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb368-2"><a href="c11-missing-data.html#cb368-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">V200010b =</span> V200010b<span class="sc">/</span><span class="fu">sum</span>(V200010b)<span class="sc">*</span>targetpop)</span>
-<span id="cb368-3"><a href="c11-missing-data.html#cb368-3" tabindex="-1"></a></span>
-<span id="cb368-4"><a href="c11-missing-data.html#cb368-4" tabindex="-1"></a>anes_des_shadow <span class="ot">&lt;-</span> anes_adjwgt_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb368-5"><a href="c11-missing-data.html#cb368-5" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb368-6"><a href="c11-missing-data.html#cb368-6" tabindex="-1"></a>    <span class="at">weights =</span> V200010b,</span>
-<span id="cb368-7"><a href="c11-missing-data.html#cb368-7" tabindex="-1"></a>    <span class="at">strata =</span> V200010d,</span>
-<span id="cb368-8"><a href="c11-missing-data.html#cb368-8" tabindex="-1"></a>    <span class="at">ids =</span> V200010c,</span>
-<span id="cb368-9"><a href="c11-missing-data.html#cb368-9" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb368-10"><a href="c11-missing-data.html#cb368-10" tabindex="-1"></a>  )</span></code></pre></div>
-<p>Then we can use this design object to look at the percent of the population that voted for each candidate in 2016 (<code>V201103</code>). First, let’s look at the percentages without removing any cases:</p>
-<div class="sourceCode" id="cb369"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb369-1"><a href="c11-missing-data.html#cb369-1" tabindex="-1"></a>pres16_select1<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb369-2"><a href="c11-missing-data.html#cb369-2" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
-<span id="cb369-3"><a href="c11-missing-data.html#cb369-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb369-4"><a href="c11-missing-data.html#cb369-4" tabindex="-1"></a>    <span class="at">All_Missing=</span><span class="fu">survey_prop</span>()</span>
-<span id="cb369-5"><a href="c11-missing-data.html#cb369-5" tabindex="-1"></a>  )</span>
-<span id="cb369-6"><a href="c11-missing-data.html#cb369-6" tabindex="-1"></a></span>
-<span id="cb369-7"><a href="c11-missing-data.html#cb369-7" tabindex="-1"></a>pres16_select1</span></code></pre></div>
+<p>If we ran the first analysis, we would say that 16.8% <strong>of households with heat</strong> use a programmable or smart thermostat to heat their home. If we used the results from the second analysis, we would say that 16% <strong>of households</strong> use a programmable or smart thermostat to heat their home. The distinction between the two statements is shown in bold for emphasis. Skip patterns often change the universe we are talking about and need to be carefully examined.</p>
+<p>Filtering to the correct universe is important when handling these types of missing data. The <code>nabular</code> we created above can also help with this. If we have <code>NA_skip</code> values in the shadow, we can make sure that we filter out all of these values and only include relevant missing values. To do this with survey data, we could first create the <code>nabular</code>, then create the design object on that data, and then use the shadow variables to assist with filtering the data. Let’s use the <code>nabular</code> we created above for ANES 2020 (<code>anes_2020_shadow</code>) to create the design object.</p>
+<div class="sourceCode" id="cb369"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb369-1"><a href="c11-missing-data.html#cb369-1" tabindex="-1"></a>anes_adjwgt_shadow <span class="ot">&lt;-</span> anes_2020_shadow <span class="sc">%&gt;%</span> </span>
+<span id="cb369-2"><a href="c11-missing-data.html#cb369-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">V200010b =</span> V200010b<span class="sc">/</span><span class="fu">sum</span>(V200010b)<span class="sc">*</span>targetpop)</span>
+<span id="cb369-3"><a href="c11-missing-data.html#cb369-3" tabindex="-1"></a></span>
+<span id="cb369-4"><a href="c11-missing-data.html#cb369-4" tabindex="-1"></a>anes_des_shadow <span class="ot">&lt;-</span> anes_adjwgt_shadow <span class="sc">%&gt;%</span> </span>
+<span id="cb369-5"><a href="c11-missing-data.html#cb369-5" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb369-6"><a href="c11-missing-data.html#cb369-6" tabindex="-1"></a>    <span class="at">weights =</span> V200010b,</span>
+<span id="cb369-7"><a href="c11-missing-data.html#cb369-7" tabindex="-1"></a>    <span class="at">strata =</span> V200010d,</span>
+<span id="cb369-8"><a href="c11-missing-data.html#cb369-8" tabindex="-1"></a>    <span class="at">ids =</span> V200010c,</span>
+<span id="cb369-9"><a href="c11-missing-data.html#cb369-9" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb369-10"><a href="c11-missing-data.html#cb369-10" tabindex="-1"></a>  )</span></code></pre></div>
+<p>Then, we can use this design object to look at the percentage of the population that voted for each candidate in 2016 (<code>V201103</code>). First, let’s look at the percentages without removing any cases:</p>
+<div class="sourceCode" id="cb370"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb370-1"><a href="c11-missing-data.html#cb370-1" tabindex="-1"></a>pres16_select1<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
+<span id="cb370-2"><a href="c11-missing-data.html#cb370-2" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
+<span id="cb370-3"><a href="c11-missing-data.html#cb370-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb370-4"><a href="c11-missing-data.html#cb370-4" tabindex="-1"></a>    <span class="at">All_Missing=</span><span class="fu">survey_prop</span>()</span>
+<span id="cb370-5"><a href="c11-missing-data.html#cb370-5" tabindex="-1"></a>  )</span>
+<span id="cb370-6"><a href="c11-missing-data.html#cb370-6" tabindex="-1"></a></span>
+<span id="cb370-7"><a href="c11-missing-data.html#cb370-7" tabindex="-1"></a>pres16_select1</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   V201103                 All_Missing All_Missing_se
 ##   &lt;dbl+lbl&gt;                     &lt;dbl&gt;          &lt;dbl&gt;
@@ -882,15 +886,15 @@ <h3><span class="header-section-number">11.4.2</span> Accounting for skip patter
 ## 3  2 [2. Donald Trump]        0.299          0.00728
 ## 4  5 [5. Other {SPECIFY}]     0.0409         0.00230
 ## 5 NA                          0.00627        0.00121</code></pre>
-<p>Next, we will look at the percentages removing only those that were missing due to skip patterns (i.e., they did not receive this question).</p>
-<div class="sourceCode" id="cb371"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb371-1"><a href="c11-missing-data.html#cb371-1" tabindex="-1"></a>pres16_select2<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb371-2"><a href="c11-missing-data.html#cb371-2" tabindex="-1"></a>  <span class="fu">filter</span>(V201103_NA<span class="sc">!=</span><span class="st">&quot;NA_skip&quot;</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb371-3"><a href="c11-missing-data.html#cb371-3" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
-<span id="cb371-4"><a href="c11-missing-data.html#cb371-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb371-5"><a href="c11-missing-data.html#cb371-5" tabindex="-1"></a>    <span class="at">No_Skip_Missing=</span><span class="fu">survey_prop</span>()</span>
-<span id="cb371-6"><a href="c11-missing-data.html#cb371-6" tabindex="-1"></a>  )</span>
-<span id="cb371-7"><a href="c11-missing-data.html#cb371-7" tabindex="-1"></a></span>
-<span id="cb371-8"><a href="c11-missing-data.html#cb371-8" tabindex="-1"></a>pres16_select2</span></code></pre></div>
+<p>Next, we look at the percentages, removing only those missing due to skip patterns (i.e., they did not receive this question).</p>
+<div class="sourceCode" id="cb372"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb372-1"><a href="c11-missing-data.html#cb372-1" tabindex="-1"></a>pres16_select2<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
+<span id="cb372-2"><a href="c11-missing-data.html#cb372-2" tabindex="-1"></a>  <span class="fu">filter</span>(V201103_NA<span class="sc">!=</span><span class="st">&quot;NA_skip&quot;</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb372-3"><a href="c11-missing-data.html#cb372-3" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
+<span id="cb372-4"><a href="c11-missing-data.html#cb372-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb372-5"><a href="c11-missing-data.html#cb372-5" tabindex="-1"></a>    <span class="at">No_Skip_Missing=</span><span class="fu">survey_prop</span>()</span>
+<span id="cb372-6"><a href="c11-missing-data.html#cb372-6" tabindex="-1"></a>  )</span>
+<span id="cb372-7"><a href="c11-missing-data.html#cb372-7" tabindex="-1"></a></span>
+<span id="cb372-8"><a href="c11-missing-data.html#cb372-8" tabindex="-1"></a>pres16_select2</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 3
 ##   V201103                 No_Skip_Missing No_Skip_Missing_se
 ##   &lt;dbl+lbl&gt;                         &lt;dbl&gt;              &lt;dbl&gt;
@@ -898,15 +902,15 @@ <h3><span class="header-section-number">11.4.2</span> Accounting for skip patter
 ## 2  2 [2. Donald Trump]            0.443              0.00856
 ## 3  5 [5. Other {SPECIFY}]         0.0606             0.00330
 ## 4 NA                              0.00928            0.00178</code></pre>
-<p>Finally, we will look at the percentages removing all missing values both due to skip patterns and due to those who refused to answer the question.</p>
-<div class="sourceCode" id="cb373"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb373-1"><a href="c11-missing-data.html#cb373-1" tabindex="-1"></a>pres16_select3<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
-<span id="cb373-2"><a href="c11-missing-data.html#cb373-2" tabindex="-1"></a>  <span class="fu">filter</span>(V201103_NA<span class="sc">==</span><span class="st">&quot;!NA&quot;</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb373-3"><a href="c11-missing-data.html#cb373-3" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
-<span id="cb373-4"><a href="c11-missing-data.html#cb373-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb373-5"><a href="c11-missing-data.html#cb373-5" tabindex="-1"></a>    <span class="at">No_Missing=</span><span class="fu">survey_prop</span>()</span>
-<span id="cb373-6"><a href="c11-missing-data.html#cb373-6" tabindex="-1"></a>  )</span>
-<span id="cb373-7"><a href="c11-missing-data.html#cb373-7" tabindex="-1"></a></span>
-<span id="cb373-8"><a href="c11-missing-data.html#cb373-8" tabindex="-1"></a>pres16_select3</span></code></pre></div>
+<p>Finally, we look at the percentages, removing all missing values, both those due to skip patterns and those from respondents who refused to answer the question.</p>
+<div class="sourceCode" id="cb374"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb374-1"><a href="c11-missing-data.html#cb374-1" tabindex="-1"></a>pres16_select3<span class="ot">&lt;-</span>anes_des_shadow <span class="sc">%&gt;%</span> </span>
+<span id="cb374-2"><a href="c11-missing-data.html#cb374-2" tabindex="-1"></a>  <span class="fu">filter</span>(V201103_NA<span class="sc">==</span><span class="st">&quot;!NA&quot;</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb374-3"><a href="c11-missing-data.html#cb374-3" tabindex="-1"></a>  <span class="fu">group_by</span>(V201103) <span class="sc">%&gt;%</span></span>
+<span id="cb374-4"><a href="c11-missing-data.html#cb374-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb374-5"><a href="c11-missing-data.html#cb374-5" tabindex="-1"></a>    <span class="at">No_Missing=</span><span class="fu">survey_prop</span>()</span>
+<span id="cb374-6"><a href="c11-missing-data.html#cb374-6" tabindex="-1"></a>  )</span>
+<span id="cb374-7"><a href="c11-missing-data.html#cb374-7" tabindex="-1"></a></span>
+<span id="cb374-8"><a href="c11-missing-data.html#cb374-8" tabindex="-1"></a>pres16_select3</span></code></pre></div>
 <pre><code>## # A tibble: 3 × 3
 ##   V201103                No_Missing No_Missing_se
 ##   &lt;dbl+lbl&gt;                   &lt;dbl&gt;         &lt;dbl&gt;
@@ -1420,7 +1424,7 @@ <h3><span class="header-section-number">11.4.2</span> Accounting for skip patter
   
 </table>
 </div>
-<p>As Table <a href="c11-missing-data.html#tab:missing-anes-shadow-tab">11.1</a> shows, the results can vary greatly depending on which type of missing data that are removed. If we remove only the skip patterns the margin between the Clinton and Trump is 4.5 percentage points, but if we include all data even including those that did not vote in 2016, the margin is 3.1 percentage points. How we handle the different types of missing values is important for interpretation of the data.</p>
+<p>As Table <a href="c11-missing-data.html#tab:missing-anes-shadow-tab">11.1</a> shows, the results can vary greatly depending on which types of missing data are removed. If we remove only the skip patterns, the margin between Clinton and Trump is 4.5 percentage points, but if we include all data, including those who did not vote in 2016, the margin is 3.1 percentage points. How we handle the different types of missing values is important for interpreting the data.</p>
 
 </div>
 </div>
@@ -1439,8 +1443,8 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-Schafer2002" class="csl-entry">
 Schafer, Joseph L, and John W Graham. 2002. <span>“Missing Data: Our View of the State of the Art.”</span> <em>Psychological Methods</em> 7: 147–77. <a href="https://doi.org/10.1037//1082-989X.7.2.147">https://doi.org/10.1037//1082-989X.7.2.147</a>.
 </div>
-<div id="ref-visdat2017" class="csl-entry">
-Tierney, Nicholas. 2017. <span>“Visdat: Visualising Whole Data Frames.”</span> <em>JOSS</em> 2 (16): 355. <a href="https://doi.org/10.21105/joss.00355">https://doi.org/10.21105/joss.00355</a>.
+<div id="ref-visdattierney" class="csl-entry">
+Tierney, Nicholas. 2017. <span>“<span class="nocase">visdat</span>: Visualising Whole Data Frames.”</span> <em>Journal of Open Source Software</em> 2 (16): 355. <a href="https://doi.org/10.21105/joss.00355">https://doi.org/10.21105/joss.00355</a>.
 </div>
 <div id="ref-naniar2023" class="csl-entry">
 Tierney, Nicholas, and Dianne Cook. 2023. <span>“Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.”</span> <em>Journal of Statistical Software</em> 105 (7): 1–31. <a href="https://doi.org/10.18637/jss.v105.i07">https://doi.org/10.18637/jss.v105.i07</a>.
diff --git a/c12-recommendations.html b/c12-recommendations.html
index e2a7221e..b8e37c32 100644
--- a/c12-recommendations.html
+++ b/c12-recommendations.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,43 +524,43 @@ <h3>Prerequisites<a href="c12-recommendations.html#prereq12" class="anchor-secti
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb375"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb375-1"><a href="c12-recommendations.html#cb375-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb375-2"><a href="c12-recommendations.html#cb375-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
-<span id="cb375-3"><a href="c12-recommendations.html#cb375-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
-<span id="cb375-4"><a href="c12-recommendations.html#cb375-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span></code></pre></div>
-<p>To illustrate the importance of data visualization, we will discuss Anscombe’s Quartet. The dataset can be replicated by running the code below:</p>
-<div class="sourceCode" id="cb376"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb376-1"><a href="c12-recommendations.html#cb376-1" tabindex="-1"></a>anscombe_tidy <span class="ot">&lt;-</span> anscombe <span class="sc">%&gt;%</span></span>
-<span id="cb376-2"><a href="c12-recommendations.html#cb376-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">observation =</span> <span class="fu">row_number</span>()) <span class="sc">%&gt;%</span></span>
-<span id="cb376-3"><a href="c12-recommendations.html#cb376-3" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(<span class="sc">-</span>observation, <span class="at">names_to =</span> <span class="st">&quot;key&quot;</span>, <span class="at">values_to =</span> <span class="st">&quot;value&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb376-4"><a href="c12-recommendations.html#cb376-4" tabindex="-1"></a>  <span class="fu">separate</span>(key, <span class="fu">c</span>(<span class="st">&quot;variable&quot;</span>, <span class="st">&quot;set&quot;</span>), <span class="dv">1</span>, <span class="at">convert =</span> <span class="cn">TRUE</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb376-5"><a href="c12-recommendations.html#cb376-5" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">set =</span> <span class="fu">c</span>(<span class="st">&quot;I&quot;</span>, <span class="st">&quot;II&quot;</span>, <span class="st">&quot;III&quot;</span>, <span class="st">&quot;IV&quot;</span>)[set]) <span class="sc">%&gt;%</span></span>
-<span id="cb376-6"><a href="c12-recommendations.html#cb376-6" tabindex="-1"></a>  <span class="fu">pivot_wider</span>(<span class="at">names_from =</span> variable, <span class="at">values_from =</span> value)</span></code></pre></div>
+<div class="sourceCode" id="cb376"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb376-1"><a href="c12-recommendations.html#cb376-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb376-2"><a href="c12-recommendations.html#cb376-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
+<span id="cb376-3"><a href="c12-recommendations.html#cb376-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
+<span id="cb376-4"><a href="c12-recommendations.html#cb376-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span></code></pre></div>
+<p>To illustrate the importance of data visualization, we discuss Anscombe’s Quartet. The dataset can be replicated by running the code below:</p>
+<div class="sourceCode" id="cb377"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb377-1"><a href="c12-recommendations.html#cb377-1" tabindex="-1"></a>anscombe_tidy <span class="ot">&lt;-</span> anscombe <span class="sc">%&gt;%</span></span>
+<span id="cb377-2"><a href="c12-recommendations.html#cb377-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">obs =</span> <span class="fu">row_number</span>()) <span class="sc">%&gt;%</span></span>
+<span id="cb377-3"><a href="c12-recommendations.html#cb377-3" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(<span class="sc">-</span>obs, <span class="at">names_to =</span> <span class="st">&quot;key&quot;</span>, <span class="at">values_to =</span> <span class="st">&quot;value&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb377-4"><a href="c12-recommendations.html#cb377-4" tabindex="-1"></a>  <span class="fu">separate</span>(key, <span class="fu">c</span>(<span class="st">&quot;variable&quot;</span>, <span class="st">&quot;set&quot;</span>), <span class="dv">1</span>, <span class="at">convert =</span> <span class="cn">TRUE</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb377-5"><a href="c12-recommendations.html#cb377-5" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">set =</span> <span class="fu">c</span>(<span class="st">&quot;I&quot;</span>, <span class="st">&quot;II&quot;</span>, <span class="st">&quot;III&quot;</span>, <span class="st">&quot;IV&quot;</span>)[set]) <span class="sc">%&gt;%</span></span>
+<span id="cb377-6"><a href="c12-recommendations.html#cb377-6" tabindex="-1"></a>  <span class="fu">pivot_wider</span>(<span class="at">names_from =</span> variable, <span class="at">values_from =</span> value)</span></code></pre></div>
 <p>We create an example survey dataset to explain potential pitfalls and how to overcome them in survey analysis. To recreate the dataset, run the code below:</p>
-<div class="sourceCode" id="cb377"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb377-1"><a href="c12-recommendations.html#cb377-1" tabindex="-1"></a>example_srvy <span class="ot">&lt;-</span> <span class="fu">tribble</span>(</span>
-<span id="cb377-2"><a href="c12-recommendations.html#cb377-2" tabindex="-1"></a>  <span class="sc">~</span>id, <span class="sc">~</span>region, <span class="sc">~</span>q_d1,                 <span class="sc">~</span>q_d2_1, <span class="sc">~</span>gender, <span class="sc">~</span>weight,</span>
-<span id="cb377-3"><a href="c12-recommendations.html#cb377-3" tabindex="-1"></a>   <span class="dv">1</span><span class="dt">L</span>,   <span class="dv">1</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1740</span>,</span>
-<span id="cb377-4"><a href="c12-recommendations.html#cb377-4" tabindex="-1"></a>   <span class="dv">2</span><span class="dt">L</span>,   <span class="dv">1</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1428</span>,</span>
-<span id="cb377-5"><a href="c12-recommendations.html#cb377-5" tabindex="-1"></a>   <span class="dv">3</span><span class="dt">L</span>,   <span class="dv">2</span><span class="dt">L</span>,    <span class="cn">NA</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">496</span>,</span>
-<span id="cb377-6"><a href="c12-recommendations.html#cb377-6" tabindex="-1"></a>   <span class="dv">4</span><span class="dt">L</span>,   <span class="dv">2</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">550</span>,</span>
-<span id="cb377-7"><a href="c12-recommendations.html#cb377-7" tabindex="-1"></a>   <span class="dv">5</span><span class="dt">L</span>,   <span class="dv">3</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1762</span>,</span>
-<span id="cb377-8"><a href="c12-recommendations.html#cb377-8" tabindex="-1"></a>   <span class="dv">6</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="cn">NA</span>,          <span class="st">&quot;Very interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1004</span>,</span>
-<span id="cb377-9"><a href="c12-recommendations.html#cb377-9" tabindex="-1"></a>   <span class="dv">7</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="cn">NA</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">522</span>,</span>
-<span id="cb377-10"><a href="c12-recommendations.html#cb377-10" tabindex="-1"></a>   <span class="dv">8</span><span class="dt">L</span>,   <span class="dv">3</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1099</span>,</span>
-<span id="cb377-11"><a href="c12-recommendations.html#cb377-11" tabindex="-1"></a>   <span class="dv">9</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1295</span>,</span>
-<span id="cb377-12"><a href="c12-recommendations.html#cb377-12" tabindex="-1"></a>   <span class="dv">10</span><span class="dt">L</span>,  <span class="dv">2</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>,   <span class="st">&quot;male&quot;</span>,   <span class="dv">983</span></span>
-<span id="cb377-13"><a href="c12-recommendations.html#cb377-13" tabindex="-1"></a>)</span>
-<span id="cb377-14"><a href="c12-recommendations.html#cb377-14" tabindex="-1"></a></span>
-<span id="cb377-15"><a href="c12-recommendations.html#cb377-15" tabindex="-1"></a>example_des <span class="ot">&lt;-</span></span>
-<span id="cb377-16"><a href="c12-recommendations.html#cb377-16" tabindex="-1"></a>  example_srvy <span class="sc">%&gt;%</span></span>
-<span id="cb377-17"><a href="c12-recommendations.html#cb377-17" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">weights =</span> weight)</span></code></pre></div>
+<div class="sourceCode" id="cb378"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb378-1"><a href="c12-recommendations.html#cb378-1" tabindex="-1"></a>example_srvy <span class="ot">&lt;-</span> <span class="fu">tribble</span>(</span>
+<span id="cb378-2"><a href="c12-recommendations.html#cb378-2" tabindex="-1"></a>  <span class="sc">~</span>id, <span class="sc">~</span>region, <span class="sc">~</span>q_d1,                 <span class="sc">~</span>q_d2_1, <span class="sc">~</span>gender, <span class="sc">~</span>weight,</span>
+<span id="cb378-3"><a href="c12-recommendations.html#cb378-3" tabindex="-1"></a>   <span class="dv">1</span><span class="dt">L</span>,   <span class="dv">1</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1740</span>,</span>
+<span id="cb378-4"><a href="c12-recommendations.html#cb378-4" tabindex="-1"></a>   <span class="dv">2</span><span class="dt">L</span>,   <span class="dv">1</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1428</span>,</span>
+<span id="cb378-5"><a href="c12-recommendations.html#cb378-5" tabindex="-1"></a>   <span class="dv">3</span><span class="dt">L</span>,   <span class="dv">2</span><span class="dt">L</span>,    <span class="cn">NA</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">496</span>,</span>
+<span id="cb378-6"><a href="c12-recommendations.html#cb378-6" tabindex="-1"></a>   <span class="dv">4</span><span class="dt">L</span>,   <span class="dv">2</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">550</span>,</span>
+<span id="cb378-7"><a href="c12-recommendations.html#cb378-7" tabindex="-1"></a>   <span class="dv">5</span><span class="dt">L</span>,   <span class="dv">3</span><span class="dt">L</span>,    <span class="dv">1</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1762</span>,</span>
+<span id="cb378-8"><a href="c12-recommendations.html#cb378-8" tabindex="-1"></a>   <span class="dv">6</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="cn">NA</span>,          <span class="st">&quot;Very interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1004</span>,</span>
+<span id="cb378-9"><a href="c12-recommendations.html#cb378-9" tabindex="-1"></a>   <span class="dv">7</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="cn">NA</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,   <span class="dv">522</span>,</span>
+<span id="cb378-10"><a href="c12-recommendations.html#cb378-10" tabindex="-1"></a>   <span class="dv">8</span><span class="dt">L</span>,   <span class="dv">3</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,    <span class="st">&quot;Not at all interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1099</span>,</span>
+<span id="cb378-11"><a href="c12-recommendations.html#cb378-11" tabindex="-1"></a>   <span class="dv">9</span><span class="dt">L</span>,   <span class="dv">4</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>, <span class="st">&quot;female&quot;</span>,  <span class="dv">1295</span>,</span>
+<span id="cb378-12"><a href="c12-recommendations.html#cb378-12" tabindex="-1"></a>   <span class="dv">10</span><span class="dt">L</span>,  <span class="dv">2</span><span class="dt">L</span>,    <span class="dv">2</span><span class="dt">L</span>,      <span class="st">&quot;Somewhat interested&quot;</span>,   <span class="st">&quot;male&quot;</span>,   <span class="dv">983</span></span>
+<span id="cb378-13"><a href="c12-recommendations.html#cb378-13" tabindex="-1"></a>)</span>
+<span id="cb378-14"><a href="c12-recommendations.html#cb378-14" tabindex="-1"></a></span>
+<span id="cb378-15"><a href="c12-recommendations.html#cb378-15" tabindex="-1"></a>example_des <span class="ot">&lt;-</span></span>
+<span id="cb378-16"><a href="c12-recommendations.html#cb378-16" tabindex="-1"></a>  example_srvy <span class="sc">%&gt;%</span></span>
+<span id="cb378-17"><a href="c12-recommendations.html#cb378-17" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">weights =</span> weight)</span></code></pre></div>
 </div>
 <div id="introduction-10" class="section level2 hasAnchor" number="12.1">
 <h2><span class="header-section-number">12.1</span> Introduction<a href="c12-recommendations.html#introduction-10" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The previous chapters in this book aimed to provide the technical skills and knowledge required for running survey analyses. This chapter builds upon the previously mentioned best practices to present a curated set of recommendations for running a <em>successful</em> survey analysis. We hope this list equips you with practical insights that assist in producing meaningful and reliable results.</p>
+<p>The previous chapters in this book aimed to provide the technical skills and knowledge required for running survey analyses. This chapter builds upon the previously mentioned best practices to present a curated set of recommendations for running a <em>successful</em> survey analysis. We hope this list provides practical insights that assist in producing meaningful and reliable results.</p>
 </div>
 <div id="recs-survey-process" class="section level2 hasAnchor" number="12.2">
-<h2><span class="header-section-number">12.2</span> Follow survey analysis process<a href="c12-recommendations.html#recs-survey-process" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>As we first introduced in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a> (Section <a href="c04-getting-started.html#survey-analysis-process">4.3</a>), there are four main steps to successfully analyze survey data:</p>
+<h2><span class="header-section-number">12.2</span> Follow the survey analysis process<a href="c12-recommendations.html#recs-survey-process" class="anchor-section" aria-label="Anchor link to header"></a></h2>
+<p>As we first introduced in Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>, there are four main steps to successfully analyze survey data:</p>
 <ol style="list-style-type: decimal">
 <li><p>Create a <code>tbl_svy</code> object (a survey object) using: <code>as_survey_design()</code> or <code>as_survey_rep()</code></p></li>
 <li><p>Subset data (if needed) using <code>filter()</code> (to create subpopulations)</p></li>
@@ -568,64 +568,64 @@ <h2><span class="header-section-number">12.2</span> Follow survey analysis proce
 <li><p>Within <code>summarize()</code>, specify variables to calculate, including means, totals, proportions, quantiles, and more</p></li>
 </ol>
 <p>The order of these steps matters in survey analysis. For example, if we need to subset the data, we must use <code>filter()</code> on our data <strong>after</strong> creating the survey design. If we do this before the survey design is created, we may not be correctly accounting for the study design, resulting in incorrect findings.</p>
-<p>Additionally, correctly identifying the survey design is one of the most important steps in survey analysis. Knowing the type of sample design (e.g., clustered, stratified) will help ensure the underlying error structure is correctly calculated and weights are correctly used. Reviewing the documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) will help us understand what variables to use from the data. Learning about complex design factors such as clustering, stratification, and weighting is foundational to complex survey analysis, and we recommend that all analysts review Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> before creating their first design object.</p>
-<p>Making sure to use the survey analysis functions from the {srvyr} and {survey} packages is also important in survey analysis. For example, using <code>mean()</code> and <code>survey_mean()</code> on the same data will result in different findings and outputs. Each of the survey functions from {srvyr} and {survey} impacts standard errors and variance, and we cannot treat complex surveys as unweighted simple random samples if we want to produce unbiased estimates <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>; <a href="#ref-lumley2010complex">Lumley 2010</a>)</span>.</p>
+<p>Additionally, correctly identifying the survey design is one of the most important steps in survey analysis. Knowing the type of sample design (e.g., clustered, stratified) helps ensure the underlying error structure is correctly calculated and weights are correctly used. Reviewing the documentation (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a>) helps us understand what variables to use from the data. Learning about complex design factors such as clustering, stratification, and weighting is foundational to complex survey analysis, and we recommend that all analysts review Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a> before creating their first design object.</p>
+<p>Making sure to use the survey analysis functions from the {srvyr} and {survey} packages is also important in survey analysis. For example, using <code>mean()</code> and <code>survey_mean()</code> on the same data results in different findings and outputs. Each of the survey functions from {srvyr} and {survey} impacts standard errors and variance, and we cannot treat complex surveys as unweighted simple random samples if we want to produce unbiased estimates <span class="citation">(<a href="#ref-R-srvyr">Freedman Ellis and Schneider 2023</a>; <a href="#ref-lumley2010complex">Lumley 2010</a>)</span>.</p>
 </div>
 <div id="begin-with-descriptive-analysis" class="section level2 hasAnchor" number="12.3">
 <h2><span class="header-section-number">12.3</span> Begin with descriptive analysis<a href="c12-recommendations.html#begin-with-descriptive-analysis" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>When receiving a fresh batch of data, it’s tempting to jump right into running models to find significant results. However, a successful data analyst begins by exploring the dataset. This involves running descriptive analysis on the dataset as a whole, as well as individual variables and combinations of variables. As described in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a>, descriptive analyses should always precede statistical analysis to prevent avoidable (and potentially embarrassing) mistakes.</p>
+<p>When receiving a fresh batch of data, it is tempting to jump right into running models to find significant results. However, a successful data analyst begins by exploring the dataset. Chapter <a href="c11-missing-data.html#c11-missing-data">11</a> talks about the importance of reviewing data when examining missing data patterns. In this chapter, we illustrate the value of reviewing all types of data. This involves running descriptive analysis on the dataset as a whole, as well as individual variables and combinations of variables. As described in Chapter <a href="c05-descriptive-analysis.html#c05-descriptive-analysis">5</a>, descriptive analyses should always precede statistical analysis to prevent avoidable (and potentially embarrassing) mistakes.</p>
 <div id="table-review" class="section level3 hasAnchor" number="12.3.1">
 <h3><span class="header-section-number">12.3.1</span> Table review<a href="c12-recommendations.html#table-review" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Even before applying weights, consider running cross-tabulations on the raw data. Crosstabs can help us see if any patterns stand out that may be alarming or something worth further investigating.</p>
+<p>Even before applying weights, consider running cross-tabulations on the raw data. Cross-tabs can help us see if any patterns stand out that may be alarming or something worth further investigating.</p>
 <p>For example, let’s explore the example survey dataset introduced in the Prerequisites box, <code>example_srvy</code>. We run the code below on the unweighted data to inspect the <code>gender</code> variable:</p>
-<div class="sourceCode" id="cb378"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb378-1"><a href="c12-recommendations.html#cb378-1" tabindex="-1"></a>example_srvy <span class="sc">%&gt;%</span> </span>
-<span id="cb378-2"><a href="c12-recommendations.html#cb378-2" tabindex="-1"></a>  <span class="fu">group_by</span>(gender) <span class="sc">%&gt;%</span> </span>
-<span id="cb378-3"><a href="c12-recommendations.html#cb378-3" tabindex="-1"></a>  <span class="fu">summarise</span>(<span class="at">n =</span> <span class="fu">n</span>())</span></code></pre></div>
+<div class="sourceCode" id="cb379"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb379-1"><a href="c12-recommendations.html#cb379-1" tabindex="-1"></a>example_srvy <span class="sc">%&gt;%</span> </span>
+<span id="cb379-2"><a href="c12-recommendations.html#cb379-2" tabindex="-1"></a>  <span class="fu">group_by</span>(gender) <span class="sc">%&gt;%</span> </span>
+<span id="cb379-3"><a href="c12-recommendations.html#cb379-3" tabindex="-1"></a>  <span class="fu">summarise</span>(<span class="at">n =</span> <span class="fu">n</span>())</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 2
 ##   gender     n
 ##   &lt;chr&gt;  &lt;int&gt;
 ## 1 female     9
 ## 2 male       1</code></pre>
-<p>The data shows that males comprise 1 out of 10, or 10%, of the sample. Generally, we assume something close to a 50/50 split between male and female respondents in a population. The sizeable female proportion could indicate either a unique sample or a potential error in the data. If we review the survey documentation and see this was a deliberate part of the design, we can continue our analysis using the appropriate methods. If this was not an intentional choice by the researchers, the results alert us that something may be incorrect in the data or our code, and we can verify if there’s an issue by comparing the results with the weighted means.</p>
+<p>The data show that males comprise 1 out of 10, or 10%, of the sample. Generally, we assume something close to a 50/50 split between male and female respondents in a population. The sizable female proportion could indicate either a unique sample or a potential error in the data. If we review the survey documentation and see this was a deliberate part of the design, we can continue our analysis using the appropriate methods. If this was not an intentional choice by the researchers, the results alert us that something may be incorrect in the data or our code, and we can verify if there’s an issue by comparing the results with the weighted means.</p>
 </div>
 <div id="graphical-review" class="section level3 hasAnchor" number="12.3.2">
 <h3><span class="header-section-number">12.3.2</span> Graphical review<a href="c12-recommendations.html#graphical-review" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Tables provide a quick check of our assumptions, but there is no substitute for graphs and plots to visualize the distribution of data. We might miss outliers or nuances if we scan only summary statistics.</p>
-<p>For example, Anscombe’s Quartet demonstrates the importance of visualization in analysis. Let’s say we have a dataset with x- and y- variables in an object called <code>anscombe_tidy</code>. Let’s take a look at how the da taset is structured:</p>
-<div class="sourceCode" id="cb380"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb380-1"><a href="c12-recommendations.html#cb380-1" tabindex="-1"></a><span class="fu">head</span>(anscombe_tidy)</span></code></pre></div>
+<p>For example, Anscombe’s Quartet demonstrates the importance of visualization in analysis. Let’s say we have a dataset with x- and y- variables in an object called <code>anscombe_tidy</code>. Let’s take a look at how the dataset is structured:</p>
+<div class="sourceCode" id="cb381"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb381-1"><a href="c12-recommendations.html#cb381-1" tabindex="-1"></a><span class="fu">head</span>(anscombe_tidy)</span></code></pre></div>
 <pre><code>## # A tibble: 6 × 4
-##   observation set       x     y
-##         &lt;int&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt;
-## 1           1 I        10  8.04
-## 2           1 II       10  9.14
-## 3           1 III      10  7.46
-## 4           1 IV        8  6.58
-## 5           2 I         8  6.95
-## 6           2 II        8  8.14</code></pre>
+##     obs set       x     y
+##   &lt;int&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt;
+## 1     1 I        10  8.04
+## 2     1 II       10  9.14
+## 3     1 III      10  7.46
+## 4     1 IV        8  6.58
+## 5     2 I         8  6.95
+## 6     2 II        8  8.14</code></pre>
 <p>We can begin by checking one set of variables. For Set I, the x-variables have an average of 9 with a standard deviation of 3.3; for y, we have an average of 7.5 with a standard deviation of 2.03. The two variables have a correlation of 0.81.</p>
-<div class="sourceCode" id="cb382"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb382-1"><a href="c12-recommendations.html#cb382-1" tabindex="-1"></a>anscombe_tidy <span class="sc">%&gt;%</span> </span>
-<span id="cb382-2"><a href="c12-recommendations.html#cb382-2" tabindex="-1"></a>  <span class="fu">filter</span>(set <span class="sc">==</span> <span class="st">&quot;I&quot;</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb382-3"><a href="c12-recommendations.html#cb382-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb382-4"><a href="c12-recommendations.html#cb382-4" tabindex="-1"></a>    <span class="at">x_mean =</span> <span class="fu">mean</span>(x),</span>
-<span id="cb382-5"><a href="c12-recommendations.html#cb382-5" tabindex="-1"></a>    <span class="at">x_sd =</span> <span class="fu">sd</span>(x),</span>
-<span id="cb382-6"><a href="c12-recommendations.html#cb382-6" tabindex="-1"></a>    <span class="at">y_mean =</span> <span class="fu">mean</span>(y),</span>
-<span id="cb382-7"><a href="c12-recommendations.html#cb382-7" tabindex="-1"></a>    <span class="at">y_sd =</span> <span class="fu">sd</span>(y),</span>
-<span id="cb382-8"><a href="c12-recommendations.html#cb382-8" tabindex="-1"></a>    <span class="at">correlation =</span> <span class="fu">cor</span>(x, y)</span>
-<span id="cb382-9"><a href="c12-recommendations.html#cb382-9" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb383"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb383-1"><a href="c12-recommendations.html#cb383-1" tabindex="-1"></a>anscombe_tidy <span class="sc">%&gt;%</span> </span>
+<span id="cb383-2"><a href="c12-recommendations.html#cb383-2" tabindex="-1"></a>  <span class="fu">filter</span>(set <span class="sc">==</span> <span class="st">&quot;I&quot;</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb383-3"><a href="c12-recommendations.html#cb383-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb383-4"><a href="c12-recommendations.html#cb383-4" tabindex="-1"></a>    <span class="at">x_mean =</span> <span class="fu">mean</span>(x),</span>
+<span id="cb383-5"><a href="c12-recommendations.html#cb383-5" tabindex="-1"></a>    <span class="at">x_sd =</span> <span class="fu">sd</span>(x),</span>
+<span id="cb383-6"><a href="c12-recommendations.html#cb383-6" tabindex="-1"></a>    <span class="at">y_mean =</span> <span class="fu">mean</span>(y),</span>
+<span id="cb383-7"><a href="c12-recommendations.html#cb383-7" tabindex="-1"></a>    <span class="at">y_sd =</span> <span class="fu">sd</span>(y),</span>
+<span id="cb383-8"><a href="c12-recommendations.html#cb383-8" tabindex="-1"></a>    <span class="at">correlation =</span> <span class="fu">cor</span>(x, y)</span>
+<span id="cb383-9"><a href="c12-recommendations.html#cb383-9" tabindex="-1"></a>  )</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 5
 ##   x_mean  x_sd y_mean  y_sd correlation
 ##    &lt;dbl&gt; &lt;dbl&gt;  &lt;dbl&gt; &lt;dbl&gt;       &lt;dbl&gt;
 ## 1      9  3.32   7.50  2.03       0.816</code></pre>
-<p>These are useful statistics. We can note that the data doesn’t have high variability, and the two variables are strongly correlated. Now, let’s check all the sets (I-IV) in the Anscombe data. Notice anything interesting?</p>
-<div class="sourceCode" id="cb384"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb384-1"><a href="c12-recommendations.html#cb384-1" tabindex="-1"></a>anscombe_tidy <span class="sc">%&gt;%</span> </span>
-<span id="cb384-2"><a href="c12-recommendations.html#cb384-2" tabindex="-1"></a>  <span class="fu">group_by</span>(set) <span class="sc">%&gt;%</span></span>
-<span id="cb384-3"><a href="c12-recommendations.html#cb384-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb384-4"><a href="c12-recommendations.html#cb384-4" tabindex="-1"></a>    <span class="at">x_mean =</span> <span class="fu">mean</span>(x),</span>
-<span id="cb384-5"><a href="c12-recommendations.html#cb384-5" tabindex="-1"></a>    <span class="at">x_sd =</span> <span class="fu">sd</span>(x, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb384-6"><a href="c12-recommendations.html#cb384-6" tabindex="-1"></a>    <span class="at">y_mean =</span> <span class="fu">mean</span>(y),</span>
-<span id="cb384-7"><a href="c12-recommendations.html#cb384-7" tabindex="-1"></a>    <span class="at">y_sd =</span> <span class="fu">sd</span>(y, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb384-8"><a href="c12-recommendations.html#cb384-8" tabindex="-1"></a>    <span class="at">correlation =</span> <span class="fu">cor</span>(x, y)</span>
-<span id="cb384-9"><a href="c12-recommendations.html#cb384-9" tabindex="-1"></a>  )</span></code></pre></div>
+<p>These are useful statistics. We can note that the data do not have high variability, and the two variables are strongly correlated. Now, let’s check all the sets (I-IV) in the Anscombe data. Notice anything interesting?</p>
+<div class="sourceCode" id="cb385"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb385-1"><a href="c12-recommendations.html#cb385-1" tabindex="-1"></a>anscombe_tidy <span class="sc">%&gt;%</span> </span>
+<span id="cb385-2"><a href="c12-recommendations.html#cb385-2" tabindex="-1"></a>  <span class="fu">group_by</span>(set) <span class="sc">%&gt;%</span></span>
+<span id="cb385-3"><a href="c12-recommendations.html#cb385-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb385-4"><a href="c12-recommendations.html#cb385-4" tabindex="-1"></a>    <span class="at">x_mean =</span> <span class="fu">mean</span>(x),</span>
+<span id="cb385-5"><a href="c12-recommendations.html#cb385-5" tabindex="-1"></a>    <span class="at">x_sd =</span> <span class="fu">sd</span>(x, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb385-6"><a href="c12-recommendations.html#cb385-6" tabindex="-1"></a>    <span class="at">y_mean =</span> <span class="fu">mean</span>(y),</span>
+<span id="cb385-7"><a href="c12-recommendations.html#cb385-7" tabindex="-1"></a>    <span class="at">y_sd =</span> <span class="fu">sd</span>(y, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb385-8"><a href="c12-recommendations.html#cb385-8" tabindex="-1"></a>    <span class="at">correlation =</span> <span class="fu">cor</span>(x, y)</span>
+<span id="cb385-9"><a href="c12-recommendations.html#cb385-9" tabindex="-1"></a>  )</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 6
 ##   set   x_mean  x_sd y_mean  y_sd correlation
 ##   &lt;chr&gt;  &lt;dbl&gt; &lt;dbl&gt;  &lt;dbl&gt; &lt;dbl&gt;       &lt;dbl&gt;
@@ -633,16 +633,19 @@ <h3><span class="header-section-number">12.3.2</span> Graphical review<a href="c
 ## 2 II         9  3.32   7.50  2.03       0.816
 ## 3 III        9  3.32   7.5   2.03       0.816
 ## 4 IV         9  3.32   7.50  2.03       0.817</code></pre>
-<p>The summary results for these four sets are nearly identical! Based on this, we might assume that each distribution is similar. Let’s look at a data visualization to see if our assumption is correct.</p>
-<div class="sourceCode" id="cb386"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb386-1"><a href="c12-recommendations.html#cb386-1" tabindex="-1"></a><span class="fu">ggplot</span>(anscombe_tidy, <span class="fu">aes</span>(x, y)) <span class="sc">+</span></span>
-<span id="cb386-2"><a href="c12-recommendations.html#cb386-2" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
-<span id="cb386-3"><a href="c12-recommendations.html#cb386-3" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> set) <span class="sc">+</span></span>
-<span id="cb386-4"><a href="c12-recommendations.html#cb386-4" tabindex="-1"></a>  <span class="fu">geom_smooth</span>(<span class="at">method =</span> <span class="st">&quot;lm&quot;</span>, <span class="at">se =</span> <span class="cn">FALSE</span>, <span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
-<span id="cb386-5"><a href="c12-recommendations.html#cb386-5" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
-<pre><code>## `geom_smooth()` using formula = &#39;y ~ x&#39;</code></pre>
-<p><img src="bookdown_files/figure-html/recommendations-anscombe-plot-1.png" width="672" /></p>
-<p>Although each of the four sets has the same summary statistics and regression line, when reviewing the plots, it becomes apparent that the distributions of the data are not the same at all. Each set of points results in different shapes and distributions. Imagine sharing each set (I-IV) and the corresponding plot with a different colleague. The interpretations and descriptions of the data would be very different even though the statistics are similar. Plotting data can also ensure that we are using the correct analysis method on the data, so understanding the underlying distributions is an important first step.</p>
-<p>With survey data, we may not always have continuous data that we can plot like Anscombe’s Quartet. However, if the dataset does contain continuous data or other types of data that would benefit from a visual representation, we recommend taking the time to graph distributions and correlations.</p>
+<p>The summary results for these four sets are nearly identical! Based on this, we might assume that each distribution is similar. Let’s look at a graphical visualization to see if our assumption is correct (see Figure <a href="c12-recommendations.html#fig:recommendations-anscombe-plot">12.1</a>).</p>
+<div class="sourceCode" id="cb387"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb387-1"><a href="c12-recommendations.html#cb387-1" tabindex="-1"></a><span class="fu">ggplot</span>(anscombe_tidy, <span class="fu">aes</span>(x, y)) <span class="sc">+</span></span>
+<span id="cb387-2"><a href="c12-recommendations.html#cb387-2" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
+<span id="cb387-3"><a href="c12-recommendations.html#cb387-3" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> set) <span class="sc">+</span></span>
+<span id="cb387-4"><a href="c12-recommendations.html#cb387-4" tabindex="-1"></a>  <span class="fu">geom_smooth</span>(<span class="at">method =</span> <span class="st">&quot;lm&quot;</span>, <span class="at">se =</span> <span class="cn">FALSE</span>, <span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
+<span id="cb387-5"><a href="c12-recommendations.html#cb387-5" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="figure"><span style="display:block;" id="fig:recommendations-anscombe-plot"></span>
+<img src="bookdown_files/figure-html/recommendations-anscombe-plot-1.png" alt="This figure shows four plots, one for each of Anscombe's sets. The upper left plot is a plot of set I and has a trend line with a slope of 0.5 and an intercept of 3. The data points are distributed evenly around the trend line. The upper right plot is a plot of set II and has the same trend line as set I. The data points are curved around the trend line. The lower left plot is a plot of set III and has the same trend line as set I. The data points closely follow the trend line, with one outlier where the y-value for the point is much larger than the others. The lower right plot is a plot of set IV and has the same trend line as set I. The data points all share the same x-value but different y-values, with the exception of one data point, which has a much larger value for both x and y." width="672" />
+<p class="caption">
+FIGURE 12.1: Plot of Anscombe’s Quartet data and the importance of reviewing data graphically
+</p>
+</div>
+<p>Although each of the four sets has the same summary statistics and regression line, when reviewing the plots (see Figure <a href="c12-recommendations.html#fig:recommendations-anscombe-plot">12.1</a>), it becomes apparent that the distributions of the data are not the same at all. Each set of points results in different shapes and distributions. Imagine sharing each set (I-IV) and the corresponding plot with a different colleague. The interpretations and descriptions of the data would be very different even though the statistics are similar. Plotting data can also ensure that we are using the correct analysis method on the data, so understanding the underlying distributions is an important first step.</p>
 </div>
 </div>
 <div id="check-variable-types" class="section level2 hasAnchor" number="12.4">
@@ -658,7 +661,7 @@ <h2><span class="header-section-number">12.4</span> Check variable types<a href=
 ## $ q_d2_1 &lt;chr&gt; &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;, &quot;Somewh…
 ## $ gender &lt;chr&gt; &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;fema…
 ## $ weight &lt;dbl&gt; 1740, 1428, 496, 550, 1762, 1004, 522, 1099, 1295, 983</code></pre>
-<p>The output shows that <code>q_d2_1</code> is a character variable, but the values of that variable show three options (Very interested / Somewhat interested / Not at all interested). In this case, we will most likely want to change <code>q_d2_1</code> to be a factor variable and order the factor levels to indicate that this is an ordinal variable. Here is some code on how we might approach this task using the {forcats} package <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>:</p>
+<p>The output shows that <code>q_d2_1</code> is a character variable, but the values of that variable show three options (Very interested / Somewhat interested / Not at all interested). In this case, we most likely want to change <code>q_d2_1</code> to be a factor variable and order the factor levels to indicate that this is an ordinal variable. Here is some code on how we might approach this task using the {forcats} package <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>:</p>
 <div class="sourceCode" id="cb390"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb390-1"><a href="c12-recommendations.html#cb390-1" tabindex="-1"></a>example_srvy_fct <span class="ot">&lt;-</span> example_srvy <span class="sc">%&gt;%</span></span>
 <span id="cb390-2"><a href="c12-recommendations.html#cb390-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">q_d2_1_fct =</span> <span class="fu">factor</span>(</span>
 <span id="cb390-3"><a href="c12-recommendations.html#cb390-3" tabindex="-1"></a>    q_d2_1,</span>
@@ -686,7 +689,7 @@ <h2><span class="header-section-number">12.4</span> Check variable types<a href=
 ## 1 Very interested       Very interested           1
 ## 2 Somewhat interested   Somewhat interested       6
 ## 3 Not at all interested Not at all interested     3</code></pre>
-<p>This example data also includes a column called <code>region</code>, which is imported as a number (<code>&lt;int&gt;</code>). This is a good hint to use the questionnaire and codebook along with the data to find out if the values actually reflect a number or are perhaps a coded categorical variable (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more details). R will calculate the mean even if it is not appropriate, leading to the common mistake of applying an average to categorical values instead of a proportion function. For example, for ease of coding, we may use the <code>across()</code> function to calculate the mean across all numeric variables:</p>
+<p>This example dataset also includes a column called <code>region</code>, which is imported as a number (<code>&lt;int&gt;</code>). This is a good reminder to use the questionnaire and codebook along with the data to find out if the values actually reflect a number or are perhaps a coded categorical variable (see Chapter <a href="c03-survey-data-documentation.html#c03-survey-data-documentation">3</a> for more details). R calculates the mean even if it is not appropriate, leading to the common mistake of applying an average to categorical values instead of a proportion function. For example, for ease of coding, we may use the <code>across()</code> function to calculate the mean across all numeric variables:</p>
 <div class="sourceCode" id="cb394"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb394-1"><a href="c12-recommendations.html#cb394-1" tabindex="-1"></a>example_des <span class="sc">%&gt;%</span></span>
 <span id="cb394-2"><a href="c12-recommendations.html#cb394-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span>weight) <span class="sc">%&gt;%</span></span>
 <span id="cb394-3"><a href="c12-recommendations.html#cb394-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(<span class="fu">where</span>(is.numeric), <span class="sc">~</span> <span class="fu">survey_mean</span>(.x, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)))</span></code></pre></div>
@@ -694,7 +697,7 @@ <h2><span class="header-section-number">12.4</span> Check variable types<a href=
 ##      id id_se region region_se  q_d1 q_d1_se
 ##   &lt;dbl&gt; &lt;dbl&gt;  &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;   &lt;dbl&gt;
 ## 1  5.24  1.12   2.49     0.428  1.38   0.196</code></pre>
-<p>In this example, if we do not adjust <code>region</code> to be a factor variable type, we might accidentally report an average region of 2.49 in our findings which is meaningless. Checking that our variables are appropriate will avoid this pitfall and ensure the measures and models are suitable for the variable type.</p>
+<p>In this example, if we do not adjust <code>region</code> to be a factor variable type, we might accidentally report an average region of 2.49 in our findings, which is meaningless. Checking that our variables are appropriate avoids this pitfall and ensures the measures and models are suitable for the variable type.</p>
 </div>
 <div id="improve-debugging-skills" class="section level2 hasAnchor" number="12.5">
 <h2><span class="header-section-number">12.5</span> Improve debugging skills<a href="c12-recommendations.html#improve-debugging-skills" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -717,7 +720,7 @@ <h2><span class="header-section-number">12.5</span> Improve debugging skills<a h
 <div class="sourceCode" id="cb400"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb400-1"><a href="c12-recommendations.html#cb400-1" tabindex="-1"></a>example_des <span class="sc">%&gt;%</span> </span>
 <span id="cb400-2"><a href="c12-recommendations.html#cb400-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(q_d1<span class="sc">~</span>gender)</span></code></pre></div>
 <pre><code>## Error in UseMethod(&quot;svymean&quot;, design): no applicable method for &#39;svymean&#39; applied to an object of class &quot;formula&quot;</code></pre>
-<p>In this case, we need to remember that with functions from the {survey} packages like <code>svyttest()</code>, the design object is not the first argument, and we have to use the dot (<code>.</code>) notation (see Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>). Adding in the named argument of <code>design=.</code> will fix this error.</p>
+<p>In this case, we need to remember that with functions from the {survey} package like <code>svyttest()</code>, the design object is not the first argument, and we have to use the dot (<code>.</code>) notation (see Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>). Adding in the named argument of <code>design=.</code> fixes this error.</p>
 <div class="sourceCode" id="cb402"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb402-1"><a href="c12-recommendations.html#cb402-1" tabindex="-1"></a>example_des <span class="sc">%&gt;%</span></span>
 <span id="cb402-2"><a href="c12-recommendations.html#cb402-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(q_d1 <span class="sc">~</span> gender,</span>
 <span id="cb402-3"><a href="c12-recommendations.html#cb402-3" tabindex="-1"></a>           <span class="at">design =</span> .)</span></code></pre></div>
@@ -740,9 +743,9 @@ <h2><span class="header-section-number">12.5</span> Improve debugging skills<a h
 </div>
 <div id="think-critically-about-conclusions" class="section level2 hasAnchor" number="12.6">
 <h2><span class="header-section-number">12.6</span> Think critically about conclusions<a href="c12-recommendations.html#think-critically-about-conclusions" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Once we have our findings, we need to learn to think critically about our findings. As mentioned in Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a>, many aspects of the study design can impact our interpretation of the results, for example, the number and types of response options provided to the respondent or who was asked the question (both thinking about the full sample and any skip patterns). Knowing the overall study design can help us accurately think through what the findings may mean and identify any issues with our analyses. Additionally, we should make sure that our survey design object is correctly defined (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>), carefully consider how we are managing missing data (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>), and follow statistical analysis procedures such as avoiding model overfitting by using too many variables in our formulas.</p>
+<p>Once we have our findings, we need to think critically about them. As mentioned in Chapter <a href="c02-overview-surveys.html#c02-overview-surveys">2</a>, many aspects of the study design can impact our interpretation of the results, for example, the number and types of response options provided to the respondent or who was asked the question (both thinking about the full sample and any skip patterns). Knowing the overall study design can help us accurately think through what the findings may mean and identify any issues with our analyses. Additionally, we should make sure that our survey design object is correctly defined (see Chapter <a href="c10-sample-designs-replicate-weights.html#c10-sample-designs-replicate-weights">10</a>), carefully consider how we are managing missing data (see Chapter <a href="c11-missing-data.html#c11-missing-data">11</a>), and follow statistical analysis procedures such as avoiding model overfitting by using too many variables in our formulas.</p>
 <p>These considerations allow us to conduct our analyses and review findings for statistically significant results. It’s important to note that even significant results do not mean that they are meaningful or important. A large enough sample can produce statistically significant results. Therefore, we want to look at our results in context, such as comparing them with results from other studies or analyzing them in conjunction with confidence intervals and other measures.</p>
-<p>Communicating the results (see Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>) in an unbiased manner is also a critical step in any analysis project. If we present results without error measures or only present results that support our initial hypotheses, we are not thinking critically and may incorrectly represent the data. As survey data analysts, we often interpret the survey data for the public. We must ensure that we are the best stewards of the data and work to bring light to meaningful and interesting findings that the public will want and need to know about.</p>
+<p>Communicating the results (see Chapter <a href="c08-communicating-results.html#c08-communicating-results">8</a>) in an unbiased manner is also a critical step in any analysis project. If we present results without error measures or only present results that support our initial hypotheses, we are not thinking critically and may incorrectly represent the data. As survey data analysts, we often interpret the survey data for the public. We must ensure that we are the best stewards of the data and work to bring light to meaningful and interesting findings that the public wants and needs to know about.</p>
 
 </div>
 </div>
@@ -752,7 +755,7 @@ <h2><span class="header-section-number">12.6</span> Think critically about concl
 <h3>References<a href="references.html#references" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="refs" class="references csl-bib-body hanging-indent" entry-spacing="0">
 <div id="ref-R-srvyr" class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div id="ref-lumley2010complex" class="csl-entry">
 Lumley, Thomas. 2010. <em>Complex Surveys: A Guide to Analysis Using <span>R</span></em>. John Wiley; Sons.
diff --git a/c13-ncvs-vignette.html b/c13-ncvs-vignette.html
index 383389a6..9653fede 100644
--- a/c13-ncvs-vignette.html
+++ b/c13-ncvs-vignette.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -529,17 +529,17 @@ <h3>Prerequisites<a href="c13-ncvs-vignette.html#prereq9" class="anchor-section"
 <span id="cb405-3"><a href="c13-ncvs-vignette.html#cb405-3" tabindex="-1"></a><span class="fu">library</span>(srvyr) </span>
 <span id="cb405-4"><a href="c13-ncvs-vignette.html#cb405-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
 <span id="cb405-5"><a href="c13-ncvs-vignette.html#cb405-5" tabindex="-1"></a><span class="fu">library</span>(gt)</span></code></pre></div>
-<p>We will use data from the United States National Crime Victimization Survey (NCVS). These data are available in the {srvyrexploR} package as <code>ncvs_2021_incident</code>, <code>ncvs_2021_household</code>, and <code>ncvs_2021_person</code>.</p>
+<p>We use data from the United States National Crime Victimization Survey (NCVS). These data are available in the {srvyrexploR} package as <code>ncvs_2021_incident</code>, <code>ncvs_2021_household</code>, and <code>ncvs_2021_person</code>.</p>
 </div>
 <div id="introduction-11" class="section level2 hasAnchor" number="13.1">
 <h2><span class="header-section-number">13.1</span> Introduction<a href="c13-ncvs-vignette.html#introduction-11" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The NCVS is a household survey sponsored by the Bureau of Justice Statistics (BJS), which collects data on criminal victimization, including characteristics of the crimes, offenders, and victims. Crime types include both household and personal crimes, as well as violent and non-violent crimes. The target population of this survey is all people in the United States age 12 and older living in housing units and noninstitutional group quarters.</p>
-<p>The NCVS has been ongoing since 1992. An earlier survey, the National Crime Survey, was run from 1972 to 1991 <span class="citation">(<a href="#ref-ncvs_tech_2016">Bureau of Justice Statistics 2017</a>)</span>. The survey is administered using a rotating panel. When an address enters the sample, the residents of that address are interviewed every six months for a total of seven interviews. If the initial residents move away from the address during the period, the new residents are included in the survey, as people are not followed when they move.</p>
-<p>NCVS data is publicly available and distributed by Inter-university Consortium for Political and Social Research (ICPSR), with data going back to 1992. The vignette in this book will include data from 2021 <span class="citation">(<a href="#ref-ncvs_data_2021">United States. Bureau of Justice Statistics 2022</a>)</span>. The NCVS data structure is complicated, and the User’s Guide contains examples for analysis in SAS, SUDAAN, SPSS, and Stata, but not R <span class="citation">(<a href="#ref-ncvs_user_guide">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015</a>)</span>. This vignette will adapt those examples for R.</p>
+<p>The National Crime Victimization Survey (NCVS) is a household survey sponsored by the Bureau of Justice Statistics (BJS), which collects data on criminal victimization, including characteristics of the crimes, offenders, and victims. Crime types include both household and personal crimes, as well as violent and non-violent crimes. The population of interest of this survey is all people in the United States age 12 and older living in housing units and noninstitutional group quarters.</p>
+<p>The NCVS has been ongoing since 1992. An earlier survey, the National Crime Survey, was run from 1972 to 1991 <span class="citation">(<a href="#ref-ncvs_tech_2016">Bureau of Justice Statistics 2017</a>)</span>. The survey is administered using a rotating panel. When an address enters the sample, the residents of that address are interviewed every six months for a total of seven interviews. If the initial residents move away from the address during the period and new residents move in, the new residents are included in the survey, as people are not followed when they move.</p>
+<p>NCVS data are publicly available and distributed by Inter-university Consortium for Political and Social Research (ICPSR), with data going back to 1992. The vignette in this book includes data from 2021 <span class="citation">(<a href="#ref-ncvs_data_2021">United States. Bureau of Justice Statistics 2022</a>)</span>. The NCVS data structure is complicated, and the User’s Guide contains examples for analysis in SAS, SUDAAN, SPSS, and Stata, but not R <span class="citation">(<a href="#ref-ncvs_user_guide">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015</a>)</span>. This vignette adapts those examples for R.</p>
 </div>
 <div id="data-structure" class="section level2 hasAnchor" number="13.2">
 <h2><span class="header-section-number">13.2</span> Data structure<a href="c13-ncvs-vignette.html#data-structure" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The data from ICPSR is distributed with five files, each having its unique identifier indicated:</p>
+<p>The data from ICPSR are distributed with five files, each having its unique identifier indicated:</p>
 <ul>
 <li>Address Record - <code>YEARQ</code>, <code>IDHH</code></li>
 <li>Household Record - <code>YEARQ</code>, <code>IDHH</code></li>
@@ -547,36 +547,36 @@ <h2><span class="header-section-number">13.2</span> Data structure<a href="c13-n
 <li>Incident Record - <code>YEARQ</code>, <code>IDHH</code>, <code>IDPER</code></li>
 <li>2021 Collection Year Incident - <code>YEARQ</code>, <code>IDHH</code>, <code>IDPER</code></li>
 </ul>
-<p>We will focus on the household, person, and incident files. From these files, we selected a subset of columns for examples to use in this vignette. We have included data in the {srvyexploR} package with a subset of columns, but you can download the complete files at ICPSR <span class="citation">(<a href="#ref-ncvs_data_2021">United States. Bureau of Justice Statistics 2022</a>)</span>.</p>
+<p>In this vignette, we focus on the household, person, and incident files and have selected a subset of columns for use in the examples. We have included data in the {srvyrexploR} package with this subset of columns, but the complete data files can be downloaded from <a href="https://www.icpsr.umich.edu/web/NACJD/studies/38429">ICPSR</a>.</p>
 </div>
 <div id="survey-notation" class="section level2 hasAnchor" number="13.3">
 <h2><span class="header-section-number">13.3</span> Survey notation<a href="c13-ncvs-vignette.html#survey-notation" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>The NCVS User Guide <span class="citation">(<a href="#ref-ncvs_user_guide">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015</a>)</span> uses the following notation:</p>
 <ul>
 <li><span class="math inline">\(i\)</span> represents NCVS households, identified on the household-level file with the household identification number <code>IDHH</code>.</li>
-<li><span class="math inline">\(j\)</span> represents NCVS individual respondents within households <span class="math inline">\(i\)</span>, identified on the person-level file with the person identification number <code>IDPER</code>.</li>
-<li><span class="math inline">\(k\)</span> represents reporting periods (i.e., <code>YEARQ</code>) for households <span class="math inline">\(i\)</span> and individual respondent <span class="math inline">\(j\)</span>.</li>
+<li><span class="math inline">\(j\)</span> represents NCVS individual respondents within household <span class="math inline">\(i\)</span>, identified on the person-level file with the person identification number <code>IDPER</code>.</li>
+<li><span class="math inline">\(k\)</span> represents reporting periods (i.e., <code>YEARQ</code>) for household <span class="math inline">\(i\)</span> and individual respondent <span class="math inline">\(j\)</span>.</li>
 <li><span class="math inline">\(l\)</span> represents victimization records for respondent <span class="math inline">\(j\)</span> in household <span class="math inline">\(i\)</span> and reporting period <span class="math inline">\(k\)</span>. Each record on the NCVS incident-level file is associated with a victimization record <span class="math inline">\(l\)</span>.</li>
-<li><span class="math inline">\(D\)</span> represents one or more domain characteristics of interest in the calculation of NCVS estimates. For victimization totals and proportions, domains can be defined on the basis of crime types (e.g., violent crimes, property crimes), characteristics of victims (e.g., age, sex, household income), or characteristics of the victimizations (e.g., victimizations reported to police, victimizations committed with a weapon present). Domains could also be a combination of all of these types of characteristics. For example, in the calculation of victimization rates, domains are defined on the basis of the characteristics of the victims.</li>
-<li><span class="math inline">\(A_a\)</span> represents the level <span class="math inline">\(a\)</span> of covariate <span class="math inline">\(A\)</span>. Covariate <span class="math inline">\(A\)</span> is defined in the calculation of victimization proportions and represents the characteristic for which the analyst wants to obtain the distribution of victimizations in domain <span class="math inline">\(D\)</span>.</li>
+<li><span class="math inline">\(D\)</span> represents one or more domain characteristics of interest in the calculation of NCVS estimates. For victimization totals and proportions, domains can be defined on the basis of crime types (e.g., violent crimes, property crimes), characteristics of victims (e.g., age, sex, household income), or characteristics of the victimizations (e.g., victimizations reported to police, victimizations committed with a weapon present). Domains could also be a combination of all of these types of characteristics. For example, in the calculation of victimization rates, domains are defined on the basis of the characteristics of the victims.</li>
+<li><span class="math inline">\(A_a\)</span> represents the level <span class="math inline">\(a\)</span> of covariate <span class="math inline">\(A\)</span>. Covariate <span class="math inline">\(A\)</span> is defined in the calculation of victimization proportions and represents the characteristic for which we want to obtain the distribution of victimizations in domain <span class="math inline">\(D\)</span>.</li>
 <li><span class="math inline">\(C\)</span> represents the personal or property crime for which we want to obtain a victimization rate.</li>
 </ul>
-<p>In this vignette, we will discuss four estimates:</p>
+<p>In this vignette, we discuss four estimates:</p>
 <ol style="list-style-type: decimal">
 <li><em>Victimization totals</em> estimate the number of criminal victimizations with a given characteristic. As demonstrated below, these can be calculated from any of the data files. The estimated victimization total, <span class="math inline">\(\hat{t}_D\)</span> for domain <span class="math inline">\(D\)</span> is estimated as</li>
 </ol>
 <p><span class="math display">\[ \hat{t}_D = \sum_{ijkl \in D} v_{ijkl}\]</span></p>
-<p>where <span class="math inline">\(v_{ijkl}\)</span> is the series-adjusted victimization weight for household <span class="math inline">\(i\)</span>, respondent <span class="math inline">\(j\)</span>, reporting period <span class="math inline">\(k\)</span>, and victimization <span class="math inline">\(l\)</span>, that is <code>WGTVICCY</code>.</p>
+<p>where <span class="math inline">\(v_{ijkl}\)</span> is the series-adjusted victimization weight for household <span class="math inline">\(i\)</span>, respondent <span class="math inline">\(j\)</span>, reporting period <span class="math inline">\(k\)</span>, and victimization <span class="math inline">\(l\)</span>, represented in the data as <code>WGTVICCY</code>.</p>
 <ol start="2" style="list-style-type: decimal">
 <li><em>Victimization proportions</em> estimate characteristics among victimizations or victims. Victimization proportions are calculated using the incident data file. The estimated victimization proportion for domain <span class="math inline">\(D\)</span> across level <span class="math inline">\(a\)</span> of covariate <span class="math inline">\(A\)</span>, <span class="math inline">\(\hat{p}_{A_a,D}\)</span> is</li>
 </ol>
 <p><span class="math display">\[ \hat{p}_{A_a,D} =\frac{\sum_{ijkl \in A_a, D} v_{ijkl}}{\sum_{ijkl \in D} v_{ijkl}}.\]</span>
 The numerator is the number of incidents with a particular characteristic in a domain, and the denominator is the number of incidents in a domain.</p>
 <ol start="3" style="list-style-type: decimal">
-<li><em>Victimization rates</em> are estimates of the number of victimizations per 1,000 persons or households in the population<a href="#fn27" class="footnote-ref" id="fnref27"><sup>27</sup></a>. Victimization rates are calculated using the household or person-level data files. The estimated victimization rate for crime <span class="math inline">\(C\)</span> in domain <span class="math inline">\(D\)</span> is</li>
+<li><em>Victimization rates</em> are estimates of the number of victimizations per 1,000 persons or households in the population.<a href="#fn28" class="footnote-ref" id="fnref28"><sup>28</sup></a> Victimization rates are calculated using the household or person-level data files. The estimated victimization rate for crime <span class="math inline">\(C\)</span> in domain <span class="math inline">\(D\)</span> is</li>
 </ol>
 <p><span class="math display">\[\hat{VR}_{C,D}= \frac{\sum_{ijkl \in C,D} v_{ijkl}}{\sum_{ijk \in D} w_{ijk}}\times 1000\]</span>
-where <span class="math inline">\(w_{ijk}\)</span> is the person weight (<code>WGTPERCY</code>) or household weight (<code>WGTHHCY</code>) for personal and household crimes, respectively. The numerator is the number of incidents in a domain, and the denominator is the number of persons or households in a domain. Notice that the weights in the numerator and denominator are different - this is important, and in the syntax and examples below, we will discuss how to make an estimate that involves two weights.</p>
+where <span class="math inline">\(w_{ijk}\)</span> is the person weight (<code>WGTPERCY</code>) for personal crimes or household weight (<code>WGTHHCY</code>) for household crimes. The numerator is the number of incidents in a domain, and the denominator is the number of persons or households in a domain. Notice that the weights in the numerator and denominator are different. This is important, and in the syntax and examples below, we discuss how to make an estimate that involves two weights.</p>
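+<p>As a purely illustrative calculation with made-up totals (not NCVS estimates), suppose the weighted number of victimizations in a domain is 5,000 and the weighted number of persons in that domain is 1,000,000. The victimization rate would then be</p>
+<p><span class="math display">\[\hat{VR}_{C,D}= \frac{5{,}000}{1{,}000{,}000}\times 1000 = 5,\]</span>
+that is, 5 victimizations per 1,000 persons.</p>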
 <ol start="4" style="list-style-type: decimal">
 <li><em>Prevalence rates</em> are estimates of the percentage of the population (persons or households) who are victims of a crime. These are estimated using the household or person-level data files. The estimated prevalence rate for crime <span class="math inline">\(C\)</span> in domain <span class="math inline">\(D\)</span> is</li>
 </ol>
@@ -590,7 +590,7 @@ <h2><span class="header-section-number">13.4</span> Data file preparation<a href
 <div id="preparing-files-for-estimation-of-victimization-rates" class="section level3 hasAnchor" number="13.4.1">
 <h3><span class="header-section-number">13.4.1</span> Preparing files for estimation of victimization rates<a href="c13-ncvs-vignette.html#preparing-files-for-estimation-of-victimization-rates" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Each record on the incident file represents one victimization, which is not the same as one incident. Some victimizations have several instances that make it difficult for the victim to differentiate the details of these incidents, labeled as “series crimes”. Appendix A of the User’s Guide indicates how to calculate the series weight in other statistical languages.</p>
-<p>Here, we adapt that code for R. Essentially, if a victimization is a series crime, its series weight is top-coded at 10 based on the number of actual victimizations, that is that even if the crime repeatedly occurred more than 10 times, it is counted as 10 times to reduce the influence of extreme outliers. If an incident is a series crime, but the number of occurrences is unknown, the series weight is set to 6. A description of the variables used to create indicators of series and the associated weights is included in Table <a href="c13-ncvs-vignette.html#tab:cb-incident">13.1</a>.</p>
+<p>Here, we adapt that code for R. Essentially, if a victimization is a series crime, its series weight is top-coded at 10 based on the number of actual victimizations. That is, even if the crime occurred more than 10 times, it is counted as 10 times to reduce the influence of extreme outliers. If an incident is a series crime, but the number of occurrences is unknown, the series weight is set to 6. A description of the variables used to create indicators of series and the associated weights is included in Table <a href="c13-ncvs-vignette.html#tab:cb-incident">13.1</a>.</p>
 <table style="width:100%;">
 <caption><span id="tab:cb-incident">TABLE 13.1: </span> Codebook for incident variables - related to series weight</caption>
 <colgroup>
@@ -682,7 +682,7 @@ <h3><span class="header-section-number">13.4.1</span> Preparing files for estima
 </tr>
 </tbody>
 </table>
-<p>We want to create four variables to indicate if an incident is a series crime. First, we create a variable called series using <code>V4017</code>, <code>V4018</code>, and <code>V4019</code> where an incident is considered a series crime if there are 6 or more incidents (<code>V4107</code>), the incidents are similar in detail (<code>V4018</code>), or there is not enough detail to distinguish the incidents (<code>V4019</code>). Next, we top-code the number of incidents (<code>V4016</code>) by creating a variable <code>n10v4016</code> which is set to 10 if <code>V4016 &gt; 10</code>. Finally, we create the series weight using our new top-coded variable and the existing weight.</p>
+<p>We want to create four variables to account for series crimes. First, we create a variable called <code>series</code> using <code>V4017</code>, <code>V4018</code>, and <code>V4019</code> where an incident is considered a series crime if there are 6 or more incidents (<code>V4017</code>), the incidents are similar in detail (<code>V4018</code>), and there is not enough detail to distinguish the incidents (<code>V4019</code>). Second, we top-code the number of incidents (<code>V4016</code>) by creating a variable <code>n10v4016</code>, which is set to 10 if <code>V4016 &gt; 10</code>. Third, we create <code>serieswgt</code> using the two new variables <code>series</code> and <code>n10v4016</code>: it is the top-coded number of incidents for a series crime, 6 for a series crime where the number of incidents is missing, and 1 otherwise. Finally, we create the new weight using our new <code>serieswgt</code> variable and the existing weight (<code>WGTVICCY</code>).</p>
 <div class="sourceCode" id="cb406"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb406-1"><a href="c13-ncvs-vignette.html#cb406-1" tabindex="-1"></a>inc_series <span class="ot">&lt;-</span> ncvs_2021_incident <span class="sc">%&gt;%</span></span>
 <span id="cb406-2"><a href="c13-ncvs-vignette.html#cb406-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
 <span id="cb406-3"><a href="c13-ncvs-vignette.html#cb406-3" tabindex="-1"></a>    <span class="at">series =</span> <span class="fu">case_when</span>(V4017 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
@@ -698,7 +698,7 @@ <h3><span class="header-section-number">13.4.1</span> Preparing files for estima
 <span id="cb406-13"><a href="c13-ncvs-vignette.html#cb406-13" tabindex="-1"></a>                          <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">1</span>),</span>
 <span id="cb406-14"><a href="c13-ncvs-vignette.html#cb406-14" tabindex="-1"></a>    <span class="at">NEWWGT =</span> WGTVICCY <span class="sc">*</span> serieswgt</span>
 <span id="cb406-15"><a href="c13-ncvs-vignette.html#cb406-15" tabindex="-1"></a>  )</span></code></pre></div>
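+<p>As a minimal sketch of the same logic on toy data, the following base R code uses made-up records (not NCVS data). It assumes <code>series</code> has already been coded as 2 for a series crime and 1 otherwise, and that an unknown number of occurrences is stored as <code>NA</code>. The data frame <code>toy_inc</code> and all of its values are hypothetical.</p>
+<pre class="sourceCode r"><code class="sourceCode r"># Toy incident records (hypothetical values, not NCVS data)
+toy_inc &lt;- data.frame(
+  series   = c(1, 2, 2, 2),       # assumed coding: 2 = series crime, 1 = not a series
+  V4016    = c(1, 8, 25, NA),     # number of occurrences; NA = unknown
+  WGTVICCY = c(1200, 1500, 900, 1100)
+)
+
+# Top-code the number of occurrences at 10 (NA stays NA)
+toy_inc$n10v4016 &lt;- pmin(toy_inc$V4016, 10)
+
+# Series weight: 6 for a series crime with an unknown count,
+# the top-coded count for other series crimes, and 1 otherwise
+toy_inc$serieswgt &lt;- ifelse(
+  toy_inc$series == 2 &amp; is.na(toy_inc$n10v4016), 6,
+  ifelse(toy_inc$series == 2, toy_inc$n10v4016, 1)
+)
+
+# New weight: victimization weight times series weight
+toy_inc$NEWWGT &lt;- toy_inc$WGTVICCY * toy_inc$serieswgt
+toy_inc</code></pre>
+<p>In this sketch, the four records receive series weights of 1, 8, 10, and 6. The record with 25 occurrences is counted as 10, and the record with an unknown count is counted as 6.</p>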
-<p>The next step in preparing the files for estimation is to create indicators on the victimization file for characteristics of interest. Almost all BJS publications limit the analysis to records where the victimization occurred in the United States, where <code>V4022</code> is not equal to 1, and we will do this for all estimates as well. A brief codebook of variables for this task is located in Table <a href="c13-ncvs-vignette.html#tab:cb-crimetype">13.2</a></p>
+<p>The next step in preparing the files for estimation is to create indicators on the victimization file for characteristics of interest. Almost all BJS publications limit the analysis to records where the victimization occurred in the United States (where <code>V4022</code> is not equal to 1). We do this for all estimates as well. A brief codebook of variables for this task is located in Table <a href="c13-ncvs-vignette.html#tab:cb-crimetype">13.2</a>.</p>
 <table>
 <caption><span id="tab:cb-crimetype">TABLE 13.2: </span> Codebook for incident variables - crime type indicators and characteristics</caption>
 <thead>
@@ -1048,7 +1048,7 @@ <h3><span class="header-section-number">13.4.1</span> Preparing files for estima
 </tr>
 </tbody>
 </table>
-<p>Using these variables, we will create the following indicators:</p>
+<p>Using these variables, we create the following indicators:</p>
 <ol style="list-style-type: decimal">
 <li>Property crime
 <ul>
@@ -1220,7 +1220,7 @@ <h3><span class="header-section-number">13.4.1</span> Preparing files for estima
 ##  9 TRUE  NoWeap     TRUE        FALSE        FALSE      FALSE         25
 ## 10 TRUE  Other      FALSE       FALSE        FALSE      TRUE         146
 ## 11 TRUE  UnkWeapUse FALSE       FALSE        FALSE      FALSE          3</code></pre>
-<p>After creating indicators of victimization types and characteristics, the file is summarized, and crimes are summed across persons or households by <code>YEARQ.</code> Property crimes (i.e., crimes committed against households, such as household burglary or motor vehicle theft) are summed across households, and personal crimes (i.e., crimes committed against an individual, such as assault, robbery, and personal theft) are summed across persons. The indicators are summed using the <code>serieswgt</code>, and the variable <code>WGTVICCY</code> needs to be retained for later analysis.</p>
+<p>After creating indicators of victimization types and characteristics, the file is summarized, and crimes are summed across persons or households by <code>YEARQ</code>. Property crimes (i.e., crimes committed against households, such as household burglary or motor vehicle theft) are summed across households, and personal crimes (i.e., crimes committed against an individual, such as assault, robbery, and personal theft) are summed across persons. The indicators are summed using our created series weight variable (<code>serieswgt</code>). Additionally, the existing weight variable (<code>WGTVICCY</code>) needs to be retained for later analysis.</p>
 <div class="sourceCode" id="cb420"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb420-1"><a href="c13-ncvs-vignette.html#cb420-1" tabindex="-1"></a>inc_hh_sums <span class="ot">&lt;-</span></span>
 <span id="cb420-2"><a href="c13-ncvs-vignette.html#cb420-2" tabindex="-1"></a>  inc_ind <span class="sc">%&gt;%</span></span>
 <span id="cb420-3"><a href="c13-ncvs-vignette.html#cb420-3" tabindex="-1"></a>  <span class="fu">filter</span>(V4529_num <span class="sc">&gt;</span> <span class="dv">23</span>) <span class="sc">%&gt;%</span> <span class="co"># restrict to household crimes</span></span>
@@ -1240,31 +1240,30 @@ <h3><span class="header-section-number">13.4.1</span> Preparing files for estima
 <span id="cb420-17"><a href="c13-ncvs-vignette.html#cb420-17" tabindex="-1"></a>                   <span class="sc">~</span> <span class="fu">sum</span>(. <span class="sc">*</span> serieswgt), </span>
 <span id="cb420-18"><a href="c13-ncvs-vignette.html#cb420-18" tabindex="-1"></a>                   <span class="at">.names =</span> <span class="st">&quot;{.col}&quot;</span>),</span>
 <span id="cb420-19"><a href="c13-ncvs-vignette.html#cb420-19" tabindex="-1"></a>            <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span></code></pre></div>
-<p>Now, we merge the victimization summary files into the appropriate files. For any record on the household or person file that is not on the victimization file, the victimization counts are set to 0 after merging. In this step, we will also create the victimization adjustment factor. See 2.2.4 in the User’s Guide for details of why this adjustment is created (<span class="citation">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus (<a href="#ref-ncvs_user_guide">2015</a>)</span>). It is calculated as follows:</p>
+<p>Now, we merge the victimization summary files into the appropriate files. For any record on the household or person file that is not on the victimization file, the victimization counts are set to 0 after merging. In this step, we also create the victimization adjustment factor. See Section 2.2.4 in the User’s Guide for details on why this adjustment is created <span class="citation">(<a href="#ref-ncvs_user_guide">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015</a>)</span>. It is calculated as follows:</p>
 <p><span class="math display">\[ A_{ijk}=\frac{v_{ijk}}{w_{ijk}}\]</span></p>
 <p>where <span class="math inline">\(w_{ijk}\)</span> is the person weight (<code>WGTPERCY</code>) for personal crimes or the household weight (<code>WGTHHCY</code>) for household crimes, and <span class="math inline">\(v_{ijk}\)</span> is the victimization weight (<code>WGTVICCY</code>) for household <span class="math inline">\(i\)</span>, respondent <span class="math inline">\(j\)</span>, in reporting period <span class="math inline">\(k\)</span>. The adjustment factor is set to 0 if no incidents are reported.</p>
-<div class="sourceCode" id="cb421"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb421-1"><a href="c13-ncvs-vignette.html#cb421-1" tabindex="-1"></a><span class="co"># Set up a list of 0s for each crime type/characteristic to replace NA&#39;s</span></span>
-<span id="cb421-2"><a href="c13-ncvs-vignette.html#cb421-2" tabindex="-1"></a>hh_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_hh_sums) <span class="sc">-</span> <span class="dv">3</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb421-3"><a href="c13-ncvs-vignette.html#cb421-3" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_hh_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">3</span>)])</span>
-<span id="cb421-4"><a href="c13-ncvs-vignette.html#cb421-4" tabindex="-1"></a>pers_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_pers_sums) <span class="sc">-</span> <span class="dv">4</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb421-5"><a href="c13-ncvs-vignette.html#cb421-5" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_pers_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">4</span>)])</span>
-<span id="cb421-6"><a href="c13-ncvs-vignette.html#cb421-6" tabindex="-1"></a></span>
-<span id="cb421-7"><a href="c13-ncvs-vignette.html#cb421-7" tabindex="-1"></a>hh_vsum <span class="ot">&lt;-</span> ncvs_2021_household <span class="sc">%&gt;%</span></span>
-<span id="cb421-8"><a href="c13-ncvs-vignette.html#cb421-8" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_hh_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb421-9"><a href="c13-ncvs-vignette.html#cb421-9" tabindex="-1"></a>  <span class="fu">replace_na</span>(hh_z_list) <span class="sc">%&gt;%</span></span>
-<span id="cb421-10"><a href="c13-ncvs-vignette.html#cb421-10" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTHHCY))</span>
-<span id="cb421-11"><a href="c13-ncvs-vignette.html#cb421-11" tabindex="-1"></a></span>
-<span id="cb421-12"><a href="c13-ncvs-vignette.html#cb421-12" tabindex="-1"></a>pers_vsum <span class="ot">&lt;-</span> ncvs_2021_person <span class="sc">%&gt;%</span></span>
-<span id="cb421-13"><a href="c13-ncvs-vignette.html#cb421-13" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_pers_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb421-14"><a href="c13-ncvs-vignette.html#cb421-14" tabindex="-1"></a>  <span class="fu">replace_na</span>(pers_z_list) <span class="sc">%&gt;%</span></span>
-<span id="cb421-15"><a href="c13-ncvs-vignette.html#cb421-15" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTPERCY))</span></code></pre></div>
+<div class="sourceCode" id="cb421"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb421-1"><a href="c13-ncvs-vignette.html#cb421-1" tabindex="-1"></a>hh_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_hh_sums) <span class="sc">-</span> <span class="dv">3</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb421-2"><a href="c13-ncvs-vignette.html#cb421-2" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_hh_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">3</span>)])</span>
+<span id="cb421-3"><a href="c13-ncvs-vignette.html#cb421-3" tabindex="-1"></a>pers_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_pers_sums) <span class="sc">-</span> <span class="dv">4</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb421-4"><a href="c13-ncvs-vignette.html#cb421-4" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_pers_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">4</span>)])</span>
+<span id="cb421-5"><a href="c13-ncvs-vignette.html#cb421-5" tabindex="-1"></a></span>
+<span id="cb421-6"><a href="c13-ncvs-vignette.html#cb421-6" tabindex="-1"></a>hh_vsum <span class="ot">&lt;-</span> ncvs_2021_household <span class="sc">%&gt;%</span></span>
+<span id="cb421-7"><a href="c13-ncvs-vignette.html#cb421-7" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_hh_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb421-8"><a href="c13-ncvs-vignette.html#cb421-8" tabindex="-1"></a>  <span class="fu">replace_na</span>(hh_z_list) <span class="sc">%&gt;%</span></span>
+<span id="cb421-9"><a href="c13-ncvs-vignette.html#cb421-9" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTHHCY))</span>
+<span id="cb421-10"><a href="c13-ncvs-vignette.html#cb421-10" tabindex="-1"></a></span>
+<span id="cb421-11"><a href="c13-ncvs-vignette.html#cb421-11" tabindex="-1"></a>pers_vsum <span class="ot">&lt;-</span> ncvs_2021_person <span class="sc">%&gt;%</span></span>
+<span id="cb421-12"><a href="c13-ncvs-vignette.html#cb421-12" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_pers_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb421-13"><a href="c13-ncvs-vignette.html#cb421-13" tabindex="-1"></a>  <span class="fu">replace_na</span>(pers_z_list) <span class="sc">%&gt;%</span></span>
+<span id="cb421-14"><a href="c13-ncvs-vignette.html#cb421-14" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTPERCY))</span></code></pre></div>
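+<p>To see what the merge and the adjustment factor do, here is a small base R sketch on made-up data. The data frames <code>toy_hh</code> and <code>toy_inc_sums</code> and all of their values are hypothetical, and <code>merge()</code> with <code>all = TRUE</code> stands in for <code>full_join()</code>.</p>
+<pre class="sourceCode r"><code class="sourceCode r"># Toy household file and toy victimization summary (hypothetical values)
+toy_hh &lt;- data.frame(
+  IDHH    = c("A", "B", "C"),
+  WGTHHCY = c(1000, 1200, 800)
+)
+toy_inc_sums &lt;- data.frame(
+  IDHH     = c("A", "C"),          # household B reported no incidents
+  Property = c(1, 10),
+  WGTVICCY = c(1100, 800)
+)
+
+# Keep all records from both files, as a full join would
+toy_vsum &lt;- merge(toy_hh, toy_inc_sums, by = "IDHH", all = TRUE)
+
+# Households with no incidents get a victimization count of 0
+toy_vsum$Property[is.na(toy_vsum$Property)] &lt;- 0
+
+# Adjustment factor: victimization weight over household weight,
+# set to 0 when no incidents were reported
+toy_vsum$ADJINC_WT &lt;- ifelse(
+  is.na(toy_vsum$WGTVICCY), 0,
+  toy_vsum$WGTVICCY / toy_vsum$WGTHHCY
+)
+toy_vsum</code></pre>
+<p>Here, household B has no record on the toy summary file, so its count and its adjustment factor are both 0, while households A and C get adjustment factors of 1.1 and 1.</p>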
 </div>
 <div id="derived-demographic-variables" class="section level3 hasAnchor" number="13.4.2">
 <h3><span class="header-section-number">13.4.2</span> Derived demographic variables<a href="c13-ncvs-vignette.html#derived-demographic-variables" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>A final step in file preparation for the household and person files is creating any derived variables on the household and person files, such as income categories or age categories, for subgroup analysis. We can do this step before or after merging the victimization counts.</p>
 <div id="household-variables" class="section level4 hasAnchor" number="13.4.2.1">
 <h4><span class="header-section-number">13.4.2.1</span> Household variables<a href="c13-ncvs-vignette.html#household-variables" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>For the household file, we create categories for tenure (rental status), urbanicity, income, place size, and region. A codebook of the household variables are located in Table <a href="c13-ncvs-vignette.html#tab:cb-hh">13.3</a>.</p>
+<p>For the household file, we create categories for tenure (rental status), urbanicity, income, place size, and region. A codebook of the household variables is located in Table <a href="c13-ncvs-vignette.html#tab:cb-hh">13.3</a>.</p>
 <table>
 <caption><span id="tab:cb-hh">TABLE 13.3: </span> Codebook for household variables</caption>
 <thead>
@@ -1596,7 +1595,7 @@ <h4><span class="header-section-number">13.4.2.1</span> Household variables<a hr
 </div>
 <div id="person-variables" class="section level4 hasAnchor" number="13.4.2.2">
 <h4><span class="header-section-number">13.4.2.2</span> Person variables<a href="c13-ncvs-vignette.html#person-variables" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>For the person file, we create categories for sex, race/Hispanic origin, age categories, and marital status. A codebook of the household variables is located in Table <a href="c13-ncvs-vignette.html#tab:cb-pers">13.4</a>. We also merge the household demographics to the person file as well as the design variables (<code>V2117</code> and <code>V2118</code>).</p>
+<p>For the person file, we create categories for sex, race/Hispanic origin, age categories, and marital status. A codebook of the person variables is located in Table <a href="c13-ncvs-vignette.html#tab:cb-pers">13.4</a>. We also merge the household demographics to the person file as well as the design variables (<code>V2117</code> and <code>V2118</code>).</p>
 <table>
 <caption><span id="tab:cb-pers">TABLE 13.4: </span> Codebook for person variables</caption>
 <thead>
@@ -1790,41 +1789,40 @@ <h4><span class="header-section-number">13.4.2.2</span> Person variables<a href=
 </tr>
 </tbody>
 </table>
-<div class="sourceCode" id="cb433"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb433-1"><a href="c13-ncvs-vignette.html#cb433-1" tabindex="-1"></a><span class="co"># Set label for usage later</span></span>
-<span id="cb433-2"><a href="c13-ncvs-vignette.html#cb433-2" tabindex="-1"></a>NHOPI <span class="ot">&lt;-</span> <span class="st">&quot;Native Hawaiian or Other Pacific Islander&quot;</span></span>
-<span id="cb433-3"><a href="c13-ncvs-vignette.html#cb433-3" tabindex="-1"></a></span>
-<span id="cb433-4"><a href="c13-ncvs-vignette.html#cb433-4" tabindex="-1"></a>pers_vsum_der <span class="ot">&lt;-</span> pers_vsum <span class="sc">%&gt;%</span></span>
-<span id="cb433-5"><a href="c13-ncvs-vignette.html#cb433-5" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb433-6"><a href="c13-ncvs-vignette.html#cb433-6" tabindex="-1"></a>    <span class="at">Sex =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3018 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Male&quot;</span>,</span>
-<span id="cb433-7"><a href="c13-ncvs-vignette.html#cb433-7" tabindex="-1"></a>                           V3018 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Female&quot;</span>)),</span>
-<span id="cb433-8"><a href="c13-ncvs-vignette.html#cb433-8" tabindex="-1"></a>    <span class="at">RaceHispOrigin =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3024 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Hispanic&quot;</span>,</span>
-<span id="cb433-9"><a href="c13-ncvs-vignette.html#cb433-9" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;White&quot;</span>,</span>
-<span id="cb433-10"><a href="c13-ncvs-vignette.html#cb433-10" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Black&quot;</span>,</span>
-<span id="cb433-11"><a href="c13-ncvs-vignette.html#cb433-11" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Asian&quot;</span>,</span>
-<span id="cb433-12"><a href="c13-ncvs-vignette.html#cb433-12" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> NHOPI,</span>
-<span id="cb433-13"><a href="c13-ncvs-vignette.html#cb433-13" tabindex="-1"></a>                                      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>),</span>
-<span id="cb433-14"><a href="c13-ncvs-vignette.html#cb433-14" tabindex="-1"></a>                            <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Hispanic&quot;</span>, </span>
-<span id="cb433-15"><a href="c13-ncvs-vignette.html#cb433-15" tabindex="-1"></a>                                       <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>)),</span>
-<span id="cb433-16"><a href="c13-ncvs-vignette.html#cb433-16" tabindex="-1"></a>    <span class="at">V3014_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V3014)),</span>
-<span id="cb433-17"><a href="c13-ncvs-vignette.html#cb433-17" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">case_when</span>(V3014_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;12-17&quot;</span>,</span>
-<span id="cb433-18"><a href="c13-ncvs-vignette.html#cb433-18" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;18-24&quot;</span>,</span>
-<span id="cb433-19"><a href="c13-ncvs-vignette.html#cb433-19" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">34</span> <span class="sc">~</span> <span class="st">&quot;25-34&quot;</span>,</span>
-<span id="cb433-20"><a href="c13-ncvs-vignette.html#cb433-20" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">49</span> <span class="sc">~</span> <span class="st">&quot;35-49&quot;</span>,</span>
-<span id="cb433-21"><a href="c13-ncvs-vignette.html#cb433-21" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">64</span> <span class="sc">~</span> <span class="st">&quot;50-64&quot;</span>,</span>
-<span id="cb433-22"><a href="c13-ncvs-vignette.html#cb433-22" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">90</span> <span class="sc">~</span> <span class="st">&quot;65 or older&quot;</span>),</span>
-<span id="cb433-23"><a href="c13-ncvs-vignette.html#cb433-23" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">fct_reorder</span>(AgeGroup, V3014_num),</span>
-<span id="cb433-24"><a href="c13-ncvs-vignette.html#cb433-24" tabindex="-1"></a>    <span class="at">MaritalStatus =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Married&quot;</span>,</span>
-<span id="cb433-25"><a href="c13-ncvs-vignette.html#cb433-25" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Widowed&quot;</span>,</span>
-<span id="cb433-26"><a href="c13-ncvs-vignette.html#cb433-26" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Divorced&quot;</span>,</span>
-<span id="cb433-27"><a href="c13-ncvs-vignette.html#cb433-27" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Separated&quot;</span>,</span>
-<span id="cb433-28"><a href="c13-ncvs-vignette.html#cb433-28" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Never married&quot;</span>),</span>
-<span id="cb433-29"><a href="c13-ncvs-vignette.html#cb433-29" tabindex="-1"></a>                           <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Never married&quot;</span>, <span class="st">&quot;Married&quot;</span>, </span>
-<span id="cb433-30"><a href="c13-ncvs-vignette.html#cb433-30" tabindex="-1"></a>                                      <span class="st">&quot;Widowed&quot;</span>,<span class="st">&quot;Divorced&quot;</span>, </span>
-<span id="cb433-31"><a href="c13-ncvs-vignette.html#cb433-31" tabindex="-1"></a>                                      <span class="st">&quot;Separated&quot;</span>))</span>
-<span id="cb433-32"><a href="c13-ncvs-vignette.html#cb433-32" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span> </span>
-<span id="cb433-33"><a href="c13-ncvs-vignette.html#cb433-33" tabindex="-1"></a>  <span class="fu">left_join</span>(hh_vsum_der <span class="sc">%&gt;%</span> <span class="fu">select</span>(YEARQ, IDHH, </span>
-<span id="cb433-34"><a href="c13-ncvs-vignette.html#cb433-34" tabindex="-1"></a>                                   V2117, V2118, Tenure<span class="sc">:</span>Region),</span>
-<span id="cb433-35"><a href="c13-ncvs-vignette.html#cb433-35" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb433"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb433-1"><a href="c13-ncvs-vignette.html#cb433-1" tabindex="-1"></a>NHOPI <span class="ot">&lt;-</span> <span class="st">&quot;Native Hawaiian or Other Pacific Islander&quot;</span></span>
+<span id="cb433-2"><a href="c13-ncvs-vignette.html#cb433-2" tabindex="-1"></a></span>
+<span id="cb433-3"><a href="c13-ncvs-vignette.html#cb433-3" tabindex="-1"></a>pers_vsum_der <span class="ot">&lt;-</span> pers_vsum <span class="sc">%&gt;%</span></span>
+<span id="cb433-4"><a href="c13-ncvs-vignette.html#cb433-4" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb433-5"><a href="c13-ncvs-vignette.html#cb433-5" tabindex="-1"></a>    <span class="at">Sex =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3018 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Male&quot;</span>,</span>
+<span id="cb433-6"><a href="c13-ncvs-vignette.html#cb433-6" tabindex="-1"></a>                           V3018 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Female&quot;</span>)),</span>
+<span id="cb433-7"><a href="c13-ncvs-vignette.html#cb433-7" tabindex="-1"></a>    <span class="at">RaceHispOrigin =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3024 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Hispanic&quot;</span>,</span>
+<span id="cb433-8"><a href="c13-ncvs-vignette.html#cb433-8" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;White&quot;</span>,</span>
+<span id="cb433-9"><a href="c13-ncvs-vignette.html#cb433-9" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Black&quot;</span>,</span>
+<span id="cb433-10"><a href="c13-ncvs-vignette.html#cb433-10" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Asian&quot;</span>,</span>
+<span id="cb433-11"><a href="c13-ncvs-vignette.html#cb433-11" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> NHOPI,</span>
+<span id="cb433-12"><a href="c13-ncvs-vignette.html#cb433-12" tabindex="-1"></a>                                      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>),</span>
+<span id="cb433-13"><a href="c13-ncvs-vignette.html#cb433-13" tabindex="-1"></a>                            <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Hispanic&quot;</span>, </span>
+<span id="cb433-14"><a href="c13-ncvs-vignette.html#cb433-14" tabindex="-1"></a>                                       <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>)),</span>
+<span id="cb433-15"><a href="c13-ncvs-vignette.html#cb433-15" tabindex="-1"></a>    <span class="at">V3014_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V3014)),</span>
+<span id="cb433-16"><a href="c13-ncvs-vignette.html#cb433-16" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">case_when</span>(V3014_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;12-17&quot;</span>,</span>
+<span id="cb433-17"><a href="c13-ncvs-vignette.html#cb433-17" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;18-24&quot;</span>,</span>
+<span id="cb433-18"><a href="c13-ncvs-vignette.html#cb433-18" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">34</span> <span class="sc">~</span> <span class="st">&quot;25-34&quot;</span>,</span>
+<span id="cb433-19"><a href="c13-ncvs-vignette.html#cb433-19" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">49</span> <span class="sc">~</span> <span class="st">&quot;35-49&quot;</span>,</span>
+<span id="cb433-20"><a href="c13-ncvs-vignette.html#cb433-20" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">64</span> <span class="sc">~</span> <span class="st">&quot;50-64&quot;</span>,</span>
+<span id="cb433-21"><a href="c13-ncvs-vignette.html#cb433-21" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">90</span> <span class="sc">~</span> <span class="st">&quot;65 or older&quot;</span>),</span>
+<span id="cb433-22"><a href="c13-ncvs-vignette.html#cb433-22" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">fct_reorder</span>(AgeGroup, V3014_num),</span>
+<span id="cb433-23"><a href="c13-ncvs-vignette.html#cb433-23" tabindex="-1"></a>    <span class="at">MaritalStatus =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Married&quot;</span>,</span>
+<span id="cb433-24"><a href="c13-ncvs-vignette.html#cb433-24" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Widowed&quot;</span>,</span>
+<span id="cb433-25"><a href="c13-ncvs-vignette.html#cb433-25" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Divorced&quot;</span>,</span>
+<span id="cb433-26"><a href="c13-ncvs-vignette.html#cb433-26" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Separated&quot;</span>,</span>
+<span id="cb433-27"><a href="c13-ncvs-vignette.html#cb433-27" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Never married&quot;</span>),</span>
+<span id="cb433-28"><a href="c13-ncvs-vignette.html#cb433-28" tabindex="-1"></a>                           <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Never married&quot;</span>, <span class="st">&quot;Married&quot;</span>, </span>
+<span id="cb433-29"><a href="c13-ncvs-vignette.html#cb433-29" tabindex="-1"></a>                                      <span class="st">&quot;Widowed&quot;</span>, <span class="st">&quot;Divorced&quot;</span>, </span>
+<span id="cb433-30"><a href="c13-ncvs-vignette.html#cb433-30" tabindex="-1"></a>                                      <span class="st">&quot;Separated&quot;</span>))</span>
+<span id="cb433-31"><a href="c13-ncvs-vignette.html#cb433-31" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span> </span>
+<span id="cb433-32"><a href="c13-ncvs-vignette.html#cb433-32" tabindex="-1"></a>  <span class="fu">left_join</span>(hh_vsum_der <span class="sc">%&gt;%</span> <span class="fu">select</span>(YEARQ, IDHH, </span>
+<span id="cb433-33"><a href="c13-ncvs-vignette.html#cb433-33" tabindex="-1"></a>                                   V2117, V2118, Tenure<span class="sc">:</span>Region),</span>
+<span id="cb433-34"><a href="c13-ncvs-vignette.html#cb433-34" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>))</span></code></pre></div>
 <p>As before, we want to check to make sure the recoded variables we create match the existing data as expected.</p>
 <div class="sourceCode" id="cb434"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb434-1"><a href="c13-ncvs-vignette.html#cb434-1" tabindex="-1"></a>pers_vsum_der <span class="sc">%&gt;%</span> <span class="fu">count</span>(Sex, V3018)</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 3
@@ -1934,7 +1932,7 @@ <h4><span class="header-section-number">13.4.2.2</span> Person variables<a href=
 </div>
 <div id="survey-design-objects" class="section level2 hasAnchor" number="13.5">
 <h2><span class="header-section-number">13.5</span> Survey design objects<a href="c13-ncvs-vignette.html#survey-design-objects" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>All the data prep above is necessary to prepare the data for survey analysis. At this point, we can create the design objects and finally begin analysis. We will create three design objects for different types of analysis as they depend on which type of estimate we are creating. For the incident data, the weight of analysis is <code>NEWWGT</code>, which we constructed previously. The household and person-level data use <code>WGTHHCY</code> and <code>WGTPERCY</code>, respectively. For all analyses, <code>V2117</code> is the strata variable, and <code>V2118</code> is the cluster/PSU variable for analysis.</p>
+<p>The data preparation above readies the data for survey analysis. At this point, we can create the design objects and finally begin analysis. We create three design objects because the appropriate one depends on the type of estimate being produced. For the incident data, the analysis weight is <code>NEWWGT</code>, which we constructed previously. The household and person-level data use <code>WGTHHCY</code> and <code>WGTPERCY</code>, respectively. For all analyses, <code>V2117</code> is the strata variable, and <code>V2118</code> is the cluster/PSU variable. All this information can be found in the User’s Guide <span class="citation">(<a href="#ref-ncvs_user_guide">Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015</a>)</span>.</p>
 <div class="sourceCode" id="cb446"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb446-1"><a href="c13-ncvs-vignette.html#cb446-1" tabindex="-1"></a>inc_des <span class="ot">&lt;-</span> inc_analysis <span class="sc">%&gt;%</span></span>
 <span id="cb446-2"><a href="c13-ncvs-vignette.html#cb446-2" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
 <span id="cb446-3"><a href="c13-ncvs-vignette.html#cb446-3" tabindex="-1"></a>    <span class="at">weight =</span> NEWWGT,</span>
@@ -1966,97 +1964,1516 @@ <h2><span class="header-section-number">13.6</span> Calculating estimates<a href
 <li><p><em>Victimization totals</em> estimate the number of criminal victimizations with a given characteristic.</p></li>
 <li><p><em>Victimization proportions</em> estimate characteristics among victimizations or victims.</p></li>
 <li><p><em>Victimization rates</em> are estimates of the number of victimizations per 1,000 persons or households in the population.</p></li>
-<li><p>Prevalence rates are estimates of the percentage of the population (persons or households) who are victims of a crime.</p></li>
+<li><p><em>Prevalence rates</em> are estimates of the percentage of the population (persons or households) who are victims of a crime.</p></li>
 </ol>
 <div id="vic-tot" class="section level3 hasAnchor" number="13.6.1">
 <h3><span class="header-section-number">13.6.1</span> Estimation 1: Victimization totals<a href="c13-ncvs-vignette.html#vic-tot" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>There are two ways to calculate victimization totals. Using the incident design object (<code>inc_des</code>) is the most straightforward method, but the person (<code>pers_des</code>) and household (<code>hh_des</code>) design objects can be used as well if the adjustment factor (<code>ADJINC_WT</code>) is incorporated. In the example below, the total number of property and violent victimizations is first calculated using the incident file and then using the household and person design objects. The incident file is smaller, and thus, estimation is faster using that file, but the estimates will be the same as illustrated below:</p>
-<div class="sourceCode" id="cb447"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb447-1"><a href="c13-ncvs-vignette.html#cb447-1" tabindex="-1"></a>vt1 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
-<span id="cb447-2"><a href="c13-ncvs-vignette.html#cb447-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb447-3"><a href="c13-ncvs-vignette.html#cb447-3" tabindex="-1"></a>            <span class="at">Violent_Vzn =</span> <span class="fu">survey_total</span>(Violent, <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
-<span id="cb447-4"><a href="c13-ncvs-vignette.html#cb447-4" tabindex="-1"></a></span>
-<span id="cb447-5"><a href="c13-ncvs-vignette.html#cb447-5" tabindex="-1"></a>vt2a <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
-<span id="cb447-6"><a href="c13-ncvs-vignette.html#cb447-6" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property <span class="sc">*</span> ADJINC_WT, </span>
-<span id="cb447-7"><a href="c13-ncvs-vignette.html#cb447-7" tabindex="-1"></a>                                        <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
-<span id="cb447-8"><a href="c13-ncvs-vignette.html#cb447-8" tabindex="-1"></a></span>
-<span id="cb447-9"><a href="c13-ncvs-vignette.html#cb447-9" tabindex="-1"></a>vt2b <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
-<span id="cb447-10"><a href="c13-ncvs-vignette.html#cb447-10" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Violent_Vzn =</span> <span class="fu">survey_total</span>(Violent <span class="sc">*</span> ADJINC_WT, </span>
-<span id="cb447-11"><a href="c13-ncvs-vignette.html#cb447-11" tabindex="-1"></a>                                       <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
-<span id="cb447-12"><a href="c13-ncvs-vignette.html#cb447-12" tabindex="-1"></a></span>
-<span id="cb447-13"><a href="c13-ncvs-vignette.html#cb447-13" tabindex="-1"></a>vt1</span></code></pre></div>
-<pre><code>## # A tibble: 1 × 4
-##   Property_Vzn Property_Vzn_se Violent_Vzn Violent_Vzn_se
-##          &lt;dbl&gt;           &lt;dbl&gt;       &lt;dbl&gt;          &lt;dbl&gt;
-## 1    11682056.         263844.    4598306.        198115.</code></pre>
-<div class="sourceCode" id="cb449"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb449-1"><a href="c13-ncvs-vignette.html#cb449-1" tabindex="-1"></a>vt2a</span></code></pre></div>
-<pre><code>## # A tibble: 1 × 2
-##   Property_Vzn Property_Vzn_se
-##          &lt;dbl&gt;           &lt;dbl&gt;
-## 1    11682056.         263844.</code></pre>
-<div class="sourceCode" id="cb451"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb451-1"><a href="c13-ncvs-vignette.html#cb451-1" tabindex="-1"></a>vt2b</span></code></pre></div>
-<pre><code>## # A tibble: 1 × 2
-##   Violent_Vzn Violent_Vzn_se
-##         &lt;dbl&gt;          &lt;dbl&gt;
-## 1    4598306.        198115.</code></pre>
-<p>The number of victimizations estimated using the incident file is equivalent to the person and household file method. There are 11,682,056 property incidents and 4,598,306 violent incidents in a six-month period.</p>
+<p>There are two ways to calculate victimization totals. Using the incident design object (<code>inc_des</code>) is the most straightforward method, but the person (<code>pers_des</code>) and household (<code>hh_des</code>) design objects can be used as well if the adjustment factor (<code>ADJINC_WT</code>) is incorporated. In the example below, the total number of property and violent victimizations is first calculated using the incident file and then using the household and person design objects. Because the incident file is smaller, estimation is faster with it, but the estimates are the same, as illustrated in Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-vt1">13.5</a>, Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-vt2a">13.6</a>, and Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-vt2b">13.7</a>.</p>
+<div class="sourceCode" id="cb447"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb447-1"><a href="c13-ncvs-vignette.html#cb447-1" tabindex="-1"></a>vt1 <span class="ot">&lt;-</span></span>
+<span id="cb447-2"><a href="c13-ncvs-vignette.html#cb447-2" tabindex="-1"></a>  inc_des <span class="sc">%&gt;%</span></span>
+<span id="cb447-3"><a href="c13-ncvs-vignette.html#cb447-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb447-4"><a href="c13-ncvs-vignette.html#cb447-4" tabindex="-1"></a>            <span class="at">Violent_Vzn =</span> <span class="fu">survey_total</span>(Violent, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb447-5"><a href="c13-ncvs-vignette.html#cb447-5" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb447-6"><a href="c13-ncvs-vignette.html#cb447-6" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb447-7"><a href="c13-ncvs-vignette.html#cb447-7" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Property crime&quot;</span>,</span>
+<span id="cb447-8"><a href="c13-ncvs-vignette.html#cb447-8" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">starts_with</span>(<span class="st">&quot;Property&quot;</span>)</span>
+<span id="cb447-9"><a href="c13-ncvs-vignette.html#cb447-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-10"><a href="c13-ncvs-vignette.html#cb447-10" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb447-11"><a href="c13-ncvs-vignette.html#cb447-11" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Violent crime&quot;</span>,</span>
+<span id="cb447-12"><a href="c13-ncvs-vignette.html#cb447-12" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">starts_with</span>(<span class="st">&quot;Violent&quot;</span>)</span>
+<span id="cb447-13"><a href="c13-ncvs-vignette.html#cb447-13" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-14"><a href="c13-ncvs-vignette.html#cb447-14" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb447-15"><a href="c13-ncvs-vignette.html#cb447-15" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;Vzn&quot;</span>)<span class="sc">~</span><span class="st">&quot;Total&quot;</span>,</span>
+<span id="cb447-16"><a href="c13-ncvs-vignette.html#cb447-16" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;se&quot;</span>)<span class="sc">~</span><span class="st">&quot;S.E.&quot;</span></span>
+<span id="cb447-17"><a href="c13-ncvs-vignette.html#cb447-17" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-18"><a href="c13-ncvs-vignette.html#cb447-18" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">0</span>)</span>
+<span id="cb447-19"><a href="c13-ncvs-vignette.html#cb447-19" tabindex="-1"></a></span>
+<span id="cb447-20"><a href="c13-ncvs-vignette.html#cb447-20" tabindex="-1"></a>vt2a <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
+<span id="cb447-21"><a href="c13-ncvs-vignette.html#cb447-21" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property <span class="sc">*</span> ADJINC_WT, </span>
+<span id="cb447-22"><a href="c13-ncvs-vignette.html#cb447-22" tabindex="-1"></a>                                        <span class="at">na.rm =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb447-23"><a href="c13-ncvs-vignette.html#cb447-23" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb447-24"><a href="c13-ncvs-vignette.html#cb447-24" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb447-25"><a href="c13-ncvs-vignette.html#cb447-25" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Property crime&quot;</span>,</span>
+<span id="cb447-26"><a href="c13-ncvs-vignette.html#cb447-26" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">starts_with</span>(<span class="st">&quot;Property&quot;</span>)</span>
+<span id="cb447-27"><a href="c13-ncvs-vignette.html#cb447-27" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-28"><a href="c13-ncvs-vignette.html#cb447-28" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb447-29"><a href="c13-ncvs-vignette.html#cb447-29" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;Vzn&quot;</span>)<span class="sc">~</span><span class="st">&quot;Total&quot;</span>,</span>
+<span id="cb447-30"><a href="c13-ncvs-vignette.html#cb447-30" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;se&quot;</span>)<span class="sc">~</span><span class="st">&quot;S.E.&quot;</span></span>
+<span id="cb447-31"><a href="c13-ncvs-vignette.html#cb447-31" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-32"><a href="c13-ncvs-vignette.html#cb447-32" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">0</span>)</span>
+<span id="cb447-33"><a href="c13-ncvs-vignette.html#cb447-33" tabindex="-1"></a></span>
+<span id="cb447-34"><a href="c13-ncvs-vignette.html#cb447-34" tabindex="-1"></a>vt2b <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
+<span id="cb447-35"><a href="c13-ncvs-vignette.html#cb447-35" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Violent_Vzn =</span> <span class="fu">survey_total</span>(Violent <span class="sc">*</span> ADJINC_WT, </span>
+<span id="cb447-36"><a href="c13-ncvs-vignette.html#cb447-36" tabindex="-1"></a>                                       <span class="at">na.rm =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb447-37"><a href="c13-ncvs-vignette.html#cb447-37" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb447-38"><a href="c13-ncvs-vignette.html#cb447-38" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb447-39"><a href="c13-ncvs-vignette.html#cb447-39" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Violent crime&quot;</span>,</span>
+<span id="cb447-40"><a href="c13-ncvs-vignette.html#cb447-40" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">starts_with</span>(<span class="st">&quot;Violent&quot;</span>)</span>
+<span id="cb447-41"><a href="c13-ncvs-vignette.html#cb447-41" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-42"><a href="c13-ncvs-vignette.html#cb447-42" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb447-43"><a href="c13-ncvs-vignette.html#cb447-43" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;Vzn&quot;</span>)<span class="sc">~</span><span class="st">&quot;Total&quot;</span>,</span>
+<span id="cb447-44"><a href="c13-ncvs-vignette.html#cb447-44" tabindex="-1"></a>    <span class="fu">ends_with</span>(<span class="st">&quot;se&quot;</span>)<span class="sc">~</span><span class="st">&quot;S.E.&quot;</span></span>
+<span id="cb447-45"><a href="c13-ncvs-vignette.html#cb447-45" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb447-46"><a href="c13-ncvs-vignette.html#cb447-46" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">0</span>)</span></code></pre></div>
+
+<div id="jslvphoojc" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#jslvphoojc table {
+  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
+  -webkit-font-smoothing: antialiased;
+  -moz-osx-font-smoothing: grayscale;
+}
+
+#jslvphoojc thead, #jslvphoojc tbody, #jslvphoojc tfoot, #jslvphoojc tr, #jslvphoojc td, #jslvphoojc th {
+  border-style: none;
+}
+
+#jslvphoojc p {
+  margin: 0;
+  padding: 0;
+}
+
+#jslvphoojc .gt_table {
+  display: table;
+  border-collapse: collapse;
+  line-height: normal;
+  margin-left: auto;
+  margin-right: auto;
+  color: #333333;
+  font-size: 16px;
+  font-weight: normal;
+  font-style: normal;
+  background-color: #FFFFFF;
+  width: auto;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #A8A8A8;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #A8A8A8;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_caption {
+  padding-top: 4px;
+  padding-bottom: 4px;
+}
+
+#jslvphoojc .gt_title {
+  color: #333333;
+  font-size: 125%;
+  font-weight: initial;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-color: #FFFFFF;
+  border-bottom-width: 0;
+}
+
+#jslvphoojc .gt_subtitle {
+  color: #333333;
+  font-size: 85%;
+  font-weight: initial;
+  padding-top: 3px;
+  padding-bottom: 5px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-color: #FFFFFF;
+  border-top-width: 0;
+}
+
+#jslvphoojc .gt_heading {
+  background-color: #FFFFFF;
+  text-align: center;
+  border-bottom-color: #FFFFFF;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_bottom_border {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_col_headings {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_col_heading {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 6px;
+  padding-left: 5px;
+  padding-right: 5px;
+  overflow-x: hidden;
+}
+
+#jslvphoojc .gt_column_spanner_outer {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  padding-top: 0;
+  padding-bottom: 0;
+  padding-left: 4px;
+  padding-right: 4px;
+}
+
+#jslvphoojc .gt_column_spanner_outer:first-child {
+  padding-left: 0;
+}
+
+#jslvphoojc .gt_column_spanner_outer:last-child {
+  padding-right: 0;
+}
+
+#jslvphoojc .gt_column_spanner {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 5px;
+  overflow-x: hidden;
+  display: inline-block;
+  width: 100%;
+}
+
+#jslvphoojc .gt_spanner_row {
+  border-bottom-style: hidden;
+}
+
+#jslvphoojc .gt_group_heading {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  text-align: left;
+}
+
+#jslvphoojc .gt_empty_group_heading {
+  padding: 0.5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: middle;
+}
+
+#jslvphoojc .gt_from_md > :first-child {
+  margin-top: 0;
+}
+
+#jslvphoojc .gt_from_md > :last-child {
+  margin-bottom: 0;
+}
+
+#jslvphoojc .gt_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  margin: 10px;
+  border-top-style: solid;
+  border-top-width: 1px;
+  border-top-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  overflow-x: hidden;
+}
+
+#jslvphoojc .gt_stub {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#jslvphoojc .gt_stub_row_group {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+  vertical-align: top;
+}
+
+#jslvphoojc .gt_row_group_first td {
+  border-top-width: 2px;
+}
+
+#jslvphoojc .gt_row_group_first th {
+  border-top-width: 2px;
+}
+
+#jslvphoojc .gt_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#jslvphoojc .gt_first_summary_row {
+  border-top-style: solid;
+  border-top-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_first_summary_row.thick {
+  border-top-width: 2px;
+}
+
+#jslvphoojc .gt_last_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_grand_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#jslvphoojc .gt_first_grand_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-style: double;
+  border-top-width: 6px;
+  border-top-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_last_grand_summary_row_top {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: double;
+  border-bottom-width: 6px;
+  border-bottom-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_striped {
+  background-color: rgba(128, 128, 128, 0.05);
+}
+
+#jslvphoojc .gt_table_body {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_footnotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_footnote {
+  margin: 0px;
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#jslvphoojc .gt_sourcenotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#jslvphoojc .gt_sourcenote {
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#jslvphoojc .gt_left {
+  text-align: left;
+}
+
+#jslvphoojc .gt_center {
+  text-align: center;
+}
+
+#jslvphoojc .gt_right {
+  text-align: right;
+  font-variant-numeric: tabular-nums;
+}
+
+#jslvphoojc .gt_font_normal {
+  font-weight: normal;
+}
+
+#jslvphoojc .gt_font_bold {
+  font-weight: bold;
+}
+
+#jslvphoojc .gt_font_italic {
+  font-style: italic;
+}
+
+#jslvphoojc .gt_super {
+  font-size: 65%;
+}
+
+#jslvphoojc .gt_footnote_marks {
+  font-size: 75%;
+  vertical-align: 0.4em;
+  position: initial;
+}
+
+#jslvphoojc .gt_asterisk {
+  font-size: 100%;
+  vertical-align: 0;
+}
+
+#jslvphoojc .gt_indent_1 {
+  text-indent: 5px;
+}
+
+#jslvphoojc .gt_indent_2 {
+  text-indent: 10px;
+}
+
+#jslvphoojc .gt_indent_3 {
+  text-indent: 15px;
+}
+
+#jslvphoojc .gt_indent_4 {
+  text-indent: 20px;
+}
+
+#jslvphoojc .gt_indent_5 {
+  text-indent: 25px;
+}
+</style>
+<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
+  <caption><span id="tab:ncvs-vign-vt1">TABLE 13.5: </span>Estimates of total property and violent victimizations with standard errors calculated using the incident design object, 2021 (vt1)</caption>
+  <thead>
+    
+    <tr class="gt_col_headings gt_spanner_row">
+      <th class="gt_center gt_columns_top_border gt_column_spanner_outer" rowspan="1" colspan="2" scope="colgroup" id="Property crime">
+        <span class="gt_column_spanner">Property crime</span>
+      </th>
+      <th class="gt_center gt_columns_top_border gt_column_spanner_outer" rowspan="1" colspan="2" scope="colgroup" id="Violent crime">
+        <span class="gt_column_spanner">Violent crime</span>
+      </th>
+    </tr>
+    <tr class="gt_col_headings">
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="Total">Total</th>
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="S.E.">S.E.</th>
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="Total">Total</th>
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="S.E.">S.E.</th>
+    </tr>
+  </thead>
+  <tbody class="gt_table_body">
+    <tr><td headers="Property_Vzn" class="gt_row gt_right">11,682,056</td>
+<td headers="Property_Vzn_se" class="gt_row gt_right">263,844</td>
+<td headers="Violent_Vzn" class="gt_row gt_right">4,598,306</td>
+<td headers="Violent_Vzn_se" class="gt_row gt_right">198,115</td></tr>
+  </tbody>
+  
+  
+</table>
+</div>
+
+<div id="uphlolqabb" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#uphlolqabb table {
+  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
+  -webkit-font-smoothing: antialiased;
+  -moz-osx-font-smoothing: grayscale;
+}
+
+#uphlolqabb thead, #uphlolqabb tbody, #uphlolqabb tfoot, #uphlolqabb tr, #uphlolqabb td, #uphlolqabb th {
+  border-style: none;
+}
+
+#uphlolqabb p {
+  margin: 0;
+  padding: 0;
+}
+
+#uphlolqabb .gt_table {
+  display: table;
+  border-collapse: collapse;
+  line-height: normal;
+  margin-left: auto;
+  margin-right: auto;
+  color: #333333;
+  font-size: 16px;
+  font-weight: normal;
+  font-style: normal;
+  background-color: #FFFFFF;
+  width: auto;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #A8A8A8;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #A8A8A8;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_caption {
+  padding-top: 4px;
+  padding-bottom: 4px;
+}
+
+#uphlolqabb .gt_title {
+  color: #333333;
+  font-size: 125%;
+  font-weight: initial;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-color: #FFFFFF;
+  border-bottom-width: 0;
+}
+
+#uphlolqabb .gt_subtitle {
+  color: #333333;
+  font-size: 85%;
+  font-weight: initial;
+  padding-top: 3px;
+  padding-bottom: 5px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-color: #FFFFFF;
+  border-top-width: 0;
+}
+
+#uphlolqabb .gt_heading {
+  background-color: #FFFFFF;
+  text-align: center;
+  border-bottom-color: #FFFFFF;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_bottom_border {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_col_headings {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_col_heading {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 6px;
+  padding-left: 5px;
+  padding-right: 5px;
+  overflow-x: hidden;
+}
+
+#uphlolqabb .gt_column_spanner_outer {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  padding-top: 0;
+  padding-bottom: 0;
+  padding-left: 4px;
+  padding-right: 4px;
+}
+
+#uphlolqabb .gt_column_spanner_outer:first-child {
+  padding-left: 0;
+}
+
+#uphlolqabb .gt_column_spanner_outer:last-child {
+  padding-right: 0;
+}
+
+#uphlolqabb .gt_column_spanner {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 5px;
+  overflow-x: hidden;
+  display: inline-block;
+  width: 100%;
+}
+
+#uphlolqabb .gt_spanner_row {
+  border-bottom-style: hidden;
+}
+
+#uphlolqabb .gt_group_heading {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  text-align: left;
+}
+
+#uphlolqabb .gt_empty_group_heading {
+  padding: 0.5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: middle;
+}
+
+#uphlolqabb .gt_from_md > :first-child {
+  margin-top: 0;
+}
+
+#uphlolqabb .gt_from_md > :last-child {
+  margin-bottom: 0;
+}
+
+#uphlolqabb .gt_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  margin: 10px;
+  border-top-style: solid;
+  border-top-width: 1px;
+  border-top-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  overflow-x: hidden;
+}
+
+#uphlolqabb .gt_stub {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#uphlolqabb .gt_stub_row_group {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+  vertical-align: top;
+}
+
+#uphlolqabb .gt_row_group_first td {
+  border-top-width: 2px;
+}
+
+#uphlolqabb .gt_row_group_first th {
+  border-top-width: 2px;
+}
+
+#uphlolqabb .gt_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#uphlolqabb .gt_first_summary_row {
+  border-top-style: solid;
+  border-top-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_first_summary_row.thick {
+  border-top-width: 2px;
+}
+
+#uphlolqabb .gt_last_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_grand_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#uphlolqabb .gt_first_grand_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-style: double;
+  border-top-width: 6px;
+  border-top-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_last_grand_summary_row_top {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: double;
+  border-bottom-width: 6px;
+  border-bottom-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_striped {
+  background-color: rgba(128, 128, 128, 0.05);
+}
+
+#uphlolqabb .gt_table_body {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_footnotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_footnote {
+  margin: 0px;
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#uphlolqabb .gt_sourcenotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#uphlolqabb .gt_sourcenote {
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#uphlolqabb .gt_left {
+  text-align: left;
+}
+
+#uphlolqabb .gt_center {
+  text-align: center;
+}
+
+#uphlolqabb .gt_right {
+  text-align: right;
+  font-variant-numeric: tabular-nums;
+}
+
+#uphlolqabb .gt_font_normal {
+  font-weight: normal;
+}
+
+#uphlolqabb .gt_font_bold {
+  font-weight: bold;
+}
+
+#uphlolqabb .gt_font_italic {
+  font-style: italic;
+}
+
+#uphlolqabb .gt_super {
+  font-size: 65%;
+}
+
+#uphlolqabb .gt_footnote_marks {
+  font-size: 75%;
+  vertical-align: 0.4em;
+  position: initial;
+}
+
+#uphlolqabb .gt_asterisk {
+  font-size: 100%;
+  vertical-align: 0;
+}
+
+#uphlolqabb .gt_indent_1 {
+  text-indent: 5px;
+}
+
+#uphlolqabb .gt_indent_2 {
+  text-indent: 10px;
+}
+
+#uphlolqabb .gt_indent_3 {
+  text-indent: 15px;
+}
+
+#uphlolqabb .gt_indent_4 {
+  text-indent: 20px;
+}
+
+#uphlolqabb .gt_indent_5 {
+  text-indent: 25px;
+}
+</style>
+<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
+  <caption><span id="tab:ncvs-vign-vt2a">TABLE 13.6: </span>Estimates of total property victimizations with standard errors calculated using the household design object, 2021 (vt2a)</caption>
+  <thead>
+    
+    <tr class="gt_col_headings gt_spanner_row">
+      <th class="gt_center gt_columns_top_border gt_column_spanner_outer" rowspan="1" colspan="2" scope="colgroup" id="Property crime">
+        <span class="gt_column_spanner">Property crime</span>
+      </th>
+    </tr>
+    <tr class="gt_col_headings">
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="Total">Total</th>
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="S.E.">S.E.</th>
+    </tr>
+  </thead>
+  <tbody class="gt_table_body">
+    <tr><td headers="Property_Vzn" class="gt_row gt_right">11,682,056</td>
+<td headers="Property_Vzn_se" class="gt_row gt_right">263,844</td></tr>
+  </tbody>
+  
+  
+</table>
+</div>
+
+<div id="ismfkpkdnv" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#ismfkpkdnv table {
+  font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
+  -webkit-font-smoothing: antialiased;
+  -moz-osx-font-smoothing: grayscale;
+}
+
+#ismfkpkdnv thead, #ismfkpkdnv tbody, #ismfkpkdnv tfoot, #ismfkpkdnv tr, #ismfkpkdnv td, #ismfkpkdnv th {
+  border-style: none;
+}
+
+#ismfkpkdnv p {
+  margin: 0;
+  padding: 0;
+}
+
+#ismfkpkdnv .gt_table {
+  display: table;
+  border-collapse: collapse;
+  line-height: normal;
+  margin-left: auto;
+  margin-right: auto;
+  color: #333333;
+  font-size: 16px;
+  font-weight: normal;
+  font-style: normal;
+  background-color: #FFFFFF;
+  width: auto;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #A8A8A8;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #A8A8A8;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_caption {
+  padding-top: 4px;
+  padding-bottom: 4px;
+}
+
+#ismfkpkdnv .gt_title {
+  color: #333333;
+  font-size: 125%;
+  font-weight: initial;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-color: #FFFFFF;
+  border-bottom-width: 0;
+}
+
+#ismfkpkdnv .gt_subtitle {
+  color: #333333;
+  font-size: 85%;
+  font-weight: initial;
+  padding-top: 3px;
+  padding-bottom: 5px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-color: #FFFFFF;
+  border-top-width: 0;
+}
+
+#ismfkpkdnv .gt_heading {
+  background-color: #FFFFFF;
+  text-align: center;
+  border-bottom-color: #FFFFFF;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_bottom_border {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_col_headings {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_col_heading {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 6px;
+  padding-left: 5px;
+  padding-right: 5px;
+  overflow-x: hidden;
+}
+
+#ismfkpkdnv .gt_column_spanner_outer {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: normal;
+  text-transform: inherit;
+  padding-top: 0;
+  padding-bottom: 0;
+  padding-left: 4px;
+  padding-right: 4px;
+}
+
+#ismfkpkdnv .gt_column_spanner_outer:first-child {
+  padding-left: 0;
+}
+
+#ismfkpkdnv .gt_column_spanner_outer:last-child {
+  padding-right: 0;
+}
+
+#ismfkpkdnv .gt_column_spanner {
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: bottom;
+  padding-top: 5px;
+  padding-bottom: 5px;
+  overflow-x: hidden;
+  display: inline-block;
+  width: 100%;
+}
+
+#ismfkpkdnv .gt_spanner_row {
+  border-bottom-style: hidden;
+}
+
+#ismfkpkdnv .gt_group_heading {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  text-align: left;
+}
+
+#ismfkpkdnv .gt_empty_group_heading {
+  padding: 0.5px;
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  vertical-align: middle;
+}
+
+#ismfkpkdnv .gt_from_md > :first-child {
+  margin-top: 0;
+}
+
+#ismfkpkdnv .gt_from_md > :last-child {
+  margin-bottom: 0;
+}
+
+#ismfkpkdnv .gt_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  margin: 10px;
+  border-top-style: solid;
+  border-top-width: 1px;
+  border-top-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 1px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 1px;
+  border-right-color: #D3D3D3;
+  vertical-align: middle;
+  overflow-x: hidden;
+}
+
+#ismfkpkdnv .gt_stub {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#ismfkpkdnv .gt_stub_row_group {
+  color: #333333;
+  background-color: #FFFFFF;
+  font-size: 100%;
+  font-weight: initial;
+  text-transform: inherit;
+  border-right-style: solid;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+  padding-left: 5px;
+  padding-right: 5px;
+  vertical-align: top;
+}
+
+#ismfkpkdnv .gt_row_group_first td {
+  border-top-width: 2px;
+}
+
+#ismfkpkdnv .gt_row_group_first th {
+  border-top-width: 2px;
+}
+
+#ismfkpkdnv .gt_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#ismfkpkdnv .gt_first_summary_row {
+  border-top-style: solid;
+  border-top-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_first_summary_row.thick {
+  border-top-width: 2px;
+}
+
+#ismfkpkdnv .gt_last_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_grand_summary_row {
+  color: #333333;
+  background-color: #FFFFFF;
+  text-transform: inherit;
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#ismfkpkdnv .gt_first_grand_summary_row {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-top-style: double;
+  border-top-width: 6px;
+  border-top-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_last_grand_summary_row_top {
+  padding-top: 8px;
+  padding-bottom: 8px;
+  padding-left: 5px;
+  padding-right: 5px;
+  border-bottom-style: double;
+  border-bottom-width: 6px;
+  border-bottom-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_striped {
+  background-color: rgba(128, 128, 128, 0.05);
+}
+
+#ismfkpkdnv .gt_table_body {
+  border-top-style: solid;
+  border-top-width: 2px;
+  border-top-color: #D3D3D3;
+  border-bottom-style: solid;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_footnotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_footnote {
+  margin: 0px;
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#ismfkpkdnv .gt_sourcenotes {
+  color: #333333;
+  background-color: #FFFFFF;
+  border-bottom-style: none;
+  border-bottom-width: 2px;
+  border-bottom-color: #D3D3D3;
+  border-left-style: none;
+  border-left-width: 2px;
+  border-left-color: #D3D3D3;
+  border-right-style: none;
+  border-right-width: 2px;
+  border-right-color: #D3D3D3;
+}
+
+#ismfkpkdnv .gt_sourcenote {
+  font-size: 90%;
+  padding-top: 4px;
+  padding-bottom: 4px;
+  padding-left: 5px;
+  padding-right: 5px;
+}
+
+#ismfkpkdnv .gt_left {
+  text-align: left;
+}
+
+#ismfkpkdnv .gt_center {
+  text-align: center;
+}
+
+#ismfkpkdnv .gt_right {
+  text-align: right;
+  font-variant-numeric: tabular-nums;
+}
+
+#ismfkpkdnv .gt_font_normal {
+  font-weight: normal;
+}
+
+#ismfkpkdnv .gt_font_bold {
+  font-weight: bold;
+}
+
+#ismfkpkdnv .gt_font_italic {
+  font-style: italic;
+}
+
+#ismfkpkdnv .gt_super {
+  font-size: 65%;
+}
+
+#ismfkpkdnv .gt_footnote_marks {
+  font-size: 75%;
+  vertical-align: 0.4em;
+  position: initial;
+}
+
+#ismfkpkdnv .gt_asterisk {
+  font-size: 100%;
+  vertical-align: 0;
+}
+
+#ismfkpkdnv .gt_indent_1 {
+  text-indent: 5px;
+}
+
+#ismfkpkdnv .gt_indent_2 {
+  text-indent: 10px;
+}
+
+#ismfkpkdnv .gt_indent_3 {
+  text-indent: 15px;
+}
+
+#ismfkpkdnv .gt_indent_4 {
+  text-indent: 20px;
+}
+
+#ismfkpkdnv .gt_indent_5 {
+  text-indent: 25px;
+}
+</style>
+<table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
+  <caption><span id="tab:ncvs-vign-vt2b">TABLE 13.7: </span>Estimates of total violent victimizations with standard errors calculated using the person design object, 2021 (vt2b)</caption>
+  <thead>
+    
+    <tr class="gt_col_headings gt_spanner_row">
+      <th class="gt_center gt_columns_top_border gt_column_spanner_outer" rowspan="1" colspan="2" scope="colgroup" id="Violent crime">
+        <span class="gt_column_spanner">Violent crime</span>
+      </th>
+    </tr>
+    <tr class="gt_col_headings">
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="Total">Total</th>
+      <th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" scope="col" id="S.E.">S.E.</th>
+    </tr>
+  </thead>
+  <tbody class="gt_table_body">
+    <tr><td headers="Violent_Vzn" class="gt_row gt_right">4,598,306</td>
+<td headers="Violent_Vzn_se" class="gt_row gt_right">198,115</td></tr>
+  </tbody>
+  
+  
+</table>
+</div>
+<p>The victimization totals estimated using the incident file match those from the person and household file method. There are an estimated 11,682,056 property victimizations and 4,598,306 violent victimizations in 2021.</p>
 </div>
 <div id="vic-prop" class="section level3 hasAnchor" number="13.6.2">
 <h3><span class="header-section-number">13.6.2</span> Estimation 2: Victimization proportions<a href="c13-ncvs-vignette.html#vic-prop" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Victimization proportions are proportions describing features of a victimization. The key here is that these are questions among victimizations, not among the population. These types of estimates can only be calculated using the incident design object (<code>inc_des</code>).</p>
+<p>Victimization proportions are proportions describing features of a victimization. The key here is that these are estimates among victimizations, not among the population. These types of estimates can only be calculated using the incident design object (<code>inc_des</code>).</p>
 <p>For example, we could be interested in the percentage of property victimizations reported to the police as shown in the following code with an estimate, the standard error, and 95% confidence interval:</p>
-<div class="sourceCode" id="cb453"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb453-1"><a href="c13-ncvs-vignette.html#cb453-1" tabindex="-1"></a>prop1 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
-<span id="cb453-2"><a href="c13-ncvs-vignette.html#cb453-2" tabindex="-1"></a>  <span class="fu">filter</span>(Property) <span class="sc">%&gt;%</span></span>
-<span id="cb453-3"><a href="c13-ncvs-vignette.html#cb453-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(ReportPolice, <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">proportion=</span><span class="cn">TRUE</span>, <span class="at">vartype=</span><span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)) <span class="sc">*</span> <span class="dv">100</span>)</span>
-<span id="cb453-4"><a href="c13-ncvs-vignette.html#cb453-4" tabindex="-1"></a></span>
-<span id="cb453-5"><a href="c13-ncvs-vignette.html#cb453-5" tabindex="-1"></a>prop1</span></code></pre></div>
+<div class="sourceCode" id="cb448"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb448-1"><a href="c13-ncvs-vignette.html#cb448-1" tabindex="-1"></a>prop1 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
+<span id="cb448-2"><a href="c13-ncvs-vignette.html#cb448-2" tabindex="-1"></a>  <span class="fu">filter</span>(Property) <span class="sc">%&gt;%</span></span>
+<span id="cb448-3"><a href="c13-ncvs-vignette.html#cb448-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(ReportPolice, </span>
+<span id="cb448-4"><a href="c13-ncvs-vignette.html#cb448-4" tabindex="-1"></a>                              <span class="at">na.rm =</span> <span class="cn">TRUE</span>, </span>
+<span id="cb448-5"><a href="c13-ncvs-vignette.html#cb448-5" tabindex="-1"></a>                              <span class="at">proportion=</span><span class="cn">TRUE</span>, </span>
+<span id="cb448-6"><a href="c13-ncvs-vignette.html#cb448-6" tabindex="-1"></a>                              <span class="at">vartype=</span><span class="fu">c</span>(<span class="st">&quot;se&quot;</span>, <span class="st">&quot;ci&quot;</span>)) <span class="sc">*</span> <span class="dv">100</span>)</span>
+<span id="cb448-7"><a href="c13-ncvs-vignette.html#cb448-7" tabindex="-1"></a></span>
+<span id="cb448-8"><a href="c13-ncvs-vignette.html#cb448-8" tabindex="-1"></a>prop1</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##     Pct Pct_se Pct_low Pct_upp
 ##   &lt;dbl&gt;  &lt;dbl&gt;   &lt;dbl&gt;   &lt;dbl&gt;
 ## 1  30.8  0.798    29.2    32.4</code></pre>
 <p>Or, the percentage of violent victimizations that are in urban areas:</p>
-<div class="sourceCode" id="cb455"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb455-1"><a href="c13-ncvs-vignette.html#cb455-1" tabindex="-1"></a>prop2 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
-<span id="cb455-2"><a href="c13-ncvs-vignette.html#cb455-2" tabindex="-1"></a>  <span class="fu">filter</span>(Violent) <span class="sc">%&gt;%</span></span>
-<span id="cb455-3"><a href="c13-ncvs-vignette.html#cb455-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(Urbanicity<span class="sc">==</span><span class="st">&quot;Urban&quot;</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span>
-<span id="cb455-4"><a href="c13-ncvs-vignette.html#cb455-4" tabindex="-1"></a></span>
-<span id="cb455-5"><a href="c13-ncvs-vignette.html#cb455-5" tabindex="-1"></a>prop2</span></code></pre></div>
+<div class="sourceCode" id="cb450"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb450-1"><a href="c13-ncvs-vignette.html#cb450-1" tabindex="-1"></a>prop2 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
+<span id="cb450-2"><a href="c13-ncvs-vignette.html#cb450-2" tabindex="-1"></a>  <span class="fu">filter</span>(Violent) <span class="sc">%&gt;%</span></span>
+<span id="cb450-3"><a href="c13-ncvs-vignette.html#cb450-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(Urbanicity<span class="sc">==</span><span class="st">&quot;Urban&quot;</span>, </span>
+<span id="cb450-4"><a href="c13-ncvs-vignette.html#cb450-4" tabindex="-1"></a>                              <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span>
+<span id="cb450-5"><a href="c13-ncvs-vignette.html#cb450-5" tabindex="-1"></a></span>
+<span id="cb450-6"><a href="c13-ncvs-vignette.html#cb450-6" tabindex="-1"></a>prop2</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##     Pct Pct_se
 ##   &lt;dbl&gt;  &lt;dbl&gt;
 ## 1  18.1   1.49</code></pre>
-<p>In 2021, we estimate that 30.8% of property crimes were reported to the police and 18.1% of violent crimes occurred in urban areas.</p>
+<p>In 2021, we estimate that 30.8% of property crimes were reported to the police, and 18.1% of violent crimes occurred in urban areas.</p>
 </div>
 <div id="vic-rate" class="section level3 hasAnchor" number="13.6.3">
 <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimization rates<a href="c13-ncvs-vignette.html#vic-rate" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Victimization rates measure the number of victimizations per population. They are not an estimate of the proportion of households or persons who are victimized, which is a prevalence rate described in section <a href="c13-ncvs-vignette.html#prev-rate">13.6.4</a>. Victimization rates are estimated using the household (<code>hh_des</code>) or person (<code>pers_des</code>) design objects depending on the type of crime, and the adjustment factor (<code>ADJINC_WT</code>) must be incorporated. We return to the example of property and violent victimizations used in the example for victimization totals (section <a href="c13-ncvs-vignette.html#vic-tot">13.6.1</a>). In the following example, the property victimization totals are calculated as above, as well as the property victimization rate (using <code>survey_mean()</code>) and the population size using <code>survey_total()</code>.</p>
-<p>As mentioned in the introduction, victimization rates use the incident weight in the numerator and the person or household weight in the denominator. This is accomplished by calculating the rates with the weight adjustment (<code>ADJINC_WT</code>) multiplied by the estimate of interest. Let’s look at an example of property victimization.</p>
-<div class="sourceCode" id="cb457"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb457-1"><a href="c13-ncvs-vignette.html#cb457-1" tabindex="-1"></a>vr_prop <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
-<span id="cb457-2"><a href="c13-ncvs-vignette.html#cb457-2" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb457-3"><a href="c13-ncvs-vignette.html#cb457-3" tabindex="-1"></a>    <span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property <span class="sc">*</span> ADJINC_WT, </span>
-<span id="cb457-4"><a href="c13-ncvs-vignette.html#cb457-4" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb457-5"><a href="c13-ncvs-vignette.html#cb457-5" tabindex="-1"></a>    <span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>,</span>
-<span id="cb457-6"><a href="c13-ncvs-vignette.html#cb457-6" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb457-7"><a href="c13-ncvs-vignette.html#cb457-7" tabindex="-1"></a>    <span class="at">PopSize =</span> <span class="fu">survey_total</span>(<span class="dv">1</span>, <span class="at">vartype =</span> <span class="cn">NULL</span>)</span>
-<span id="cb457-8"><a href="c13-ncvs-vignette.html#cb457-8" tabindex="-1"></a>  )</span>
-<span id="cb457-9"><a href="c13-ncvs-vignette.html#cb457-9" tabindex="-1"></a></span>
-<span id="cb457-10"><a href="c13-ncvs-vignette.html#cb457-10" tabindex="-1"></a>vr_prop</span></code></pre></div>
+<p>Victimization rates measure the number of victimizations per population. They are not an estimate of the proportion of households or persons who are victimized, which is a prevalence rate described in Section <a href="c13-ncvs-vignette.html#prev-rate">13.6.4</a>. Victimization rates are estimated using the household (<code>hh_des</code>) or person (<code>pers_des</code>) design objects depending on the type of crime, and the adjustment factor (<code>ADJINC_WT</code>) must be incorporated. We return to the example of property and violent victimizations used in the example for victimization totals (Section <a href="c13-ncvs-vignette.html#vic-tot">13.6.1</a>). In the following example, we calculate the property victimization total as above, along with the property victimization rate (using <code>survey_mean()</code>) and the population size (using <code>survey_total()</code>).</p>
+<p>Victimization rates use the incident weight in the numerator and the person or household weight in the denominator. To accomplish this, we multiply the estimate of interest by the weight adjustment (<code>ADJINC_WT</code>) before calculating the rate. Let’s look at an example of property victimization.</p>
+<div class="sourceCode" id="cb452"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb452-1"><a href="c13-ncvs-vignette.html#cb452-1" tabindex="-1"></a>vr_prop <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
+<span id="cb452-2"><a href="c13-ncvs-vignette.html#cb452-2" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb452-3"><a href="c13-ncvs-vignette.html#cb452-3" tabindex="-1"></a>    <span class="at">Property_Vzn =</span> <span class="fu">survey_total</span>(Property <span class="sc">*</span> ADJINC_WT, </span>
+<span id="cb452-4"><a href="c13-ncvs-vignette.html#cb452-4" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb452-5"><a href="c13-ncvs-vignette.html#cb452-5" tabindex="-1"></a>    <span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>,</span>
+<span id="cb452-6"><a href="c13-ncvs-vignette.html#cb452-6" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb452-7"><a href="c13-ncvs-vignette.html#cb452-7" tabindex="-1"></a>    <span class="at">PopSize =</span> <span class="fu">survey_total</span>(<span class="dv">1</span>, <span class="at">vartype =</span> <span class="cn">NULL</span>)</span>
+<span id="cb452-8"><a href="c13-ncvs-vignette.html#cb452-8" tabindex="-1"></a>  )</span>
+<span id="cb452-9"><a href="c13-ncvs-vignette.html#cb452-9" tabindex="-1"></a></span>
+<span id="cb452-10"><a href="c13-ncvs-vignette.html#cb452-10" tabindex="-1"></a>vr_prop</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 5
 ##   Property_Vzn Property_Vzn_se Property_Rate Property_Rate_se    PopSize
 ##          &lt;dbl&gt;           &lt;dbl&gt;         &lt;dbl&gt;            &lt;dbl&gt;      &lt;dbl&gt;
 ## 1    11682056.         263844.          90.3             1.95 129319232.</code></pre>
-<p>In the output above, we see the estimate for property victimization rate in 2021 was 90.3 per 1,000 households, which is consistent with calculating as the number of victimizations per 1,000 population as demonstrated in the next chunk:</p>
-<div class="sourceCode" id="cb459"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb459-1"><a href="c13-ncvs-vignette.html#cb459-1" tabindex="-1"></a>vr_prop <span class="sc">%&gt;%</span></span>
-<span id="cb459-2"><a href="c13-ncvs-vignette.html#cb459-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span><span class="fu">ends_with</span>(<span class="st">&quot;se&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb459-3"><a href="c13-ncvs-vignette.html#cb459-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Property_Rate_manual=</span>Property_Vzn<span class="sc">/</span>PopSize<span class="sc">*</span><span class="dv">1000</span>)</span></code></pre></div>
+<p>In the output above, we see the estimate for property victimization rate in 2021 was 90.3 per 1,000 households. This is consistent with calculating the number of victimizations per 1,000 population, as demonstrated in the following code output.</p>
+<div class="sourceCode" id="cb454"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb454-1"><a href="c13-ncvs-vignette.html#cb454-1" tabindex="-1"></a>vr_prop <span class="sc">%&gt;%</span></span>
+<span id="cb454-2"><a href="c13-ncvs-vignette.html#cb454-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span><span class="fu">ends_with</span>(<span class="st">&quot;se&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb454-3"><a href="c13-ncvs-vignette.html#cb454-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Property_Rate_manual=</span>Property_Vzn<span class="sc">/</span>PopSize<span class="sc">*</span><span class="dv">1000</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   Property_Vzn Property_Rate    PopSize Property_Rate_manual
 ##          &lt;dbl&gt;         &lt;dbl&gt;      &lt;dbl&gt;                &lt;dbl&gt;
 ## 1    11682056.          90.3 129319232.                 90.3</code></pre>
-<p>Victimization rates can also be calculated for particular characteristics of the victimization. In the following example, the rate of aggravated assault with no weapon, with a firearm, with a knife, and with another weapon.</p>
-<div class="sourceCode" id="cb461"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb461-1"><a href="c13-ncvs-vignette.html#cb461-1" tabindex="-1"></a>pers_des <span class="sc">%&gt;%</span></span>
-<span id="cb461-2"><a href="c13-ncvs-vignette.html#cb461-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
-<span id="cb461-3"><a href="c13-ncvs-vignette.html#cb461-3" tabindex="-1"></a>    <span class="fu">starts_with</span>(<span class="st">&quot;AAST_&quot;</span>),</span>
-<span id="cb461-4"><a href="c13-ncvs-vignette.html#cb461-4" tabindex="-1"></a>    <span class="sc">~</span> <span class="fu">survey_mean</span>(. <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb461-5"><a href="c13-ncvs-vignette.html#cb461-5" tabindex="-1"></a>  ))</span></code></pre></div>
+<p>Victimization rates can also be calculated based on particular characteristics of the victimization. In the following example, we calculate the rate of aggravated assault with no weapon, a firearm, a knife, and another weapon.</p>
+<div class="sourceCode" id="cb456"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb456-1"><a href="c13-ncvs-vignette.html#cb456-1" tabindex="-1"></a>pers_des <span class="sc">%&gt;%</span></span>
+<span id="cb456-2"><a href="c13-ncvs-vignette.html#cb456-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="fu">across</span>(</span>
+<span id="cb456-3"><a href="c13-ncvs-vignette.html#cb456-3" tabindex="-1"></a>    <span class="fu">starts_with</span>(<span class="st">&quot;AAST_&quot;</span>),</span>
+<span id="cb456-4"><a href="c13-ncvs-vignette.html#cb456-4" tabindex="-1"></a>    <span class="sc">~</span> <span class="fu">survey_mean</span>(. <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb456-5"><a href="c13-ncvs-vignette.html#cb456-5" tabindex="-1"></a>  ))</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   AAST_NoWeap AAST_NoWeap_se AAST_Firearm AAST_Firearm_se AAST_Knife
 ##         &lt;dbl&gt;          &lt;dbl&gt;        &lt;dbl&gt;           &lt;dbl&gt;      &lt;dbl&gt;
@@ -2064,103 +3481,103 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
 ## # ℹ 3 more variables: AAST_Knife_se &lt;dbl&gt;, AAST_Other &lt;dbl&gt;,
 ## #   AAST_Other_se &lt;dbl&gt;</code></pre>
 <p>A common desire is to calculate victimization rates by several characteristics. For example, we may want to calculate the violent victimization rate and aggravated assault rate by sex, race/Hispanic origin, age group, marital status, and household income. This requires a separate <code>group_by()</code> statement for each categorization. Thus, we make a function to do this and then use <code>map()</code> from the {purrr} package (part of the tidyverse) to loop through the variables <span class="citation">(<a href="#ref-R-purrr">Wickham and Henry 2023</a>)</span>. This function takes a demographic variable as its input (<code>byvar</code>) and calculates the violent and aggravated assault victimization rates for each level. It then creates columns with the variable name, the level of each variable, and a numeric version of the level (<code>LevelNum</code>) for sorting later. We run the function across multiple variables using <code>map()</code> and then stack the results into a single output using <code>bind_rows()</code>.</p>
-<div class="sourceCode" id="cb463"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb463-1"><a href="c13-ncvs-vignette.html#cb463-1" tabindex="-1"></a>pers_est_by <span class="ot">&lt;-</span> <span class="cf">function</span>(byvar) {</span>
-<span id="cb463-2"><a href="c13-ncvs-vignette.html#cb463-2" tabindex="-1"></a>  pers_des <span class="sc">%&gt;%</span></span>
-<span id="cb463-3"><a href="c13-ncvs-vignette.html#cb463-3" tabindex="-1"></a>    <span class="fu">rename</span>(<span class="at">Level :=</span> {{byvar}}) <span class="sc">%&gt;%</span></span>
-<span id="cb463-4"><a href="c13-ncvs-vignette.html#cb463-4" tabindex="-1"></a>    <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Level)) <span class="sc">%&gt;%</span></span>
-<span id="cb463-5"><a href="c13-ncvs-vignette.html#cb463-5" tabindex="-1"></a>    <span class="fu">group_by</span>(Level) <span class="sc">%&gt;%</span></span>
-<span id="cb463-6"><a href="c13-ncvs-vignette.html#cb463-6" tabindex="-1"></a>    <span class="fu">summarize</span>(</span>
-<span id="cb463-7"><a href="c13-ncvs-vignette.html#cb463-7" tabindex="-1"></a>      <span class="at">Violent =</span> <span class="fu">survey_mean</span>(Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb463-8"><a href="c13-ncvs-vignette.html#cb463-8" tabindex="-1"></a>      <span class="at">AAST =</span> <span class="fu">survey_mean</span>(AAST <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
-<span id="cb463-9"><a href="c13-ncvs-vignette.html#cb463-9" tabindex="-1"></a>    ) <span class="sc">%&gt;%</span></span>
-<span id="cb463-10"><a href="c13-ncvs-vignette.html#cb463-10" tabindex="-1"></a>    <span class="fu">mutate</span>(</span>
-<span id="cb463-11"><a href="c13-ncvs-vignette.html#cb463-11" tabindex="-1"></a>      <span class="at">Variable =</span> byvar,</span>
-<span id="cb463-12"><a href="c13-ncvs-vignette.html#cb463-12" tabindex="-1"></a>      <span class="at">LevelNum =</span> <span class="fu">as.numeric</span>(Level),</span>
-<span id="cb463-13"><a href="c13-ncvs-vignette.html#cb463-13" tabindex="-1"></a>      <span class="at">Level =</span> <span class="fu">as.character</span>(Level)</span>
-<span id="cb463-14"><a href="c13-ncvs-vignette.html#cb463-14" tabindex="-1"></a>    ) <span class="sc">%&gt;%</span></span>
-<span id="cb463-15"><a href="c13-ncvs-vignette.html#cb463-15" tabindex="-1"></a>    <span class="fu">select</span>(Variable, Level, LevelNum, <span class="fu">everything</span>())</span>
-<span id="cb463-16"><a href="c13-ncvs-vignette.html#cb463-16" tabindex="-1"></a>}</span>
-<span id="cb463-17"><a href="c13-ncvs-vignette.html#cb463-17" tabindex="-1"></a></span>
-<span id="cb463-18"><a href="c13-ncvs-vignette.html#cb463-18" tabindex="-1"></a>pers_est_df <span class="ot">&lt;-</span></span>
-<span id="cb463-19"><a href="c13-ncvs-vignette.html#cb463-19" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;Sex&quot;</span>, <span class="st">&quot;RaceHispOrigin&quot;</span>, <span class="st">&quot;AgeGroup&quot;</span>, <span class="st">&quot;MaritalStatus&quot;</span>, <span class="st">&quot;Income&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb463-20"><a href="c13-ncvs-vignette.html#cb463-20" tabindex="-1"></a>  <span class="fu">map</span>(pers_est_by) <span class="sc">%&gt;%</span></span>
-<span id="cb463-21"><a href="c13-ncvs-vignette.html#cb463-21" tabindex="-1"></a>  <span class="fu">bind_rows</span>()</span></code></pre></div>
-<p>The output from all the estimates is cleanded to create better labels such as going from “RaceHispOrigin” to “Race/Hispanic Origin”. Finally, the {gt} package is used to make a publishable table (Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-rates-demo-tab">13.5</a>). Using the functions from the {gt} package, column labels and footnotes are added and estimates are presented to the first decimal place <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>)</span>.</p>
-<div class="sourceCode" id="cb464"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb464-1"><a href="c13-ncvs-vignette.html#cb464-1" tabindex="-1"></a>vr_gt<span class="ot">&lt;-</span>pers_est_df <span class="sc">%&gt;%</span></span>
-<span id="cb464-2"><a href="c13-ncvs-vignette.html#cb464-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb464-3"><a href="c13-ncvs-vignette.html#cb464-3" tabindex="-1"></a>    <span class="at">Variable =</span> <span class="fu">case_when</span>(</span>
-<span id="cb464-4"><a href="c13-ncvs-vignette.html#cb464-4" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;RaceHispOrigin&quot;</span> <span class="sc">~</span> <span class="st">&quot;Race/Hispanic origin&quot;</span>,</span>
-<span id="cb464-5"><a href="c13-ncvs-vignette.html#cb464-5" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;MaritalStatus&quot;</span> <span class="sc">~</span> <span class="st">&quot;Marital status&quot;</span>,</span>
-<span id="cb464-6"><a href="c13-ncvs-vignette.html#cb464-6" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;AgeGroup&quot;</span> <span class="sc">~</span> <span class="st">&quot;Age&quot;</span>,</span>
-<span id="cb464-7"><a href="c13-ncvs-vignette.html#cb464-7" tabindex="-1"></a>      <span class="cn">TRUE</span> <span class="sc">~</span> Variable</span>
-<span id="cb464-8"><a href="c13-ncvs-vignette.html#cb464-8" tabindex="-1"></a>    )</span>
-<span id="cb464-9"><a href="c13-ncvs-vignette.html#cb464-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-10"><a href="c13-ncvs-vignette.html#cb464-10" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span>LevelNum) <span class="sc">%&gt;%</span></span>
-<span id="cb464-11"><a href="c13-ncvs-vignette.html#cb464-11" tabindex="-1"></a>  <span class="fu">group_by</span>(Variable) <span class="sc">%&gt;%</span></span>
-<span id="cb464-12"><a href="c13-ncvs-vignette.html#cb464-12" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Level&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb464-13"><a href="c13-ncvs-vignette.html#cb464-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
-<span id="cb464-14"><a href="c13-ncvs-vignette.html#cb464-14" tabindex="-1"></a>    <span class="at">label =</span> <span class="st">&quot;Violent crime&quot;</span>,</span>
-<span id="cb464-15"><a href="c13-ncvs-vignette.html#cb464-15" tabindex="-1"></a>    <span class="at">id =</span> <span class="st">&quot;viol_span&quot;</span>,</span>
-<span id="cb464-16"><a href="c13-ncvs-vignette.html#cb464-16" tabindex="-1"></a>    <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;Violent&quot;</span>, <span class="st">&quot;Violent_se&quot;</span>)</span>
-<span id="cb464-17"><a href="c13-ncvs-vignette.html#cb464-17" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-18"><a href="c13-ncvs-vignette.html#cb464-18" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Aggravated assault&quot;</span>,</span>
-<span id="cb464-19"><a href="c13-ncvs-vignette.html#cb464-19" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;AAST&quot;</span>, <span class="st">&quot;AAST_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb464-20"><a href="c13-ncvs-vignette.html#cb464-20" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
-<span id="cb464-21"><a href="c13-ncvs-vignette.html#cb464-21" tabindex="-1"></a>    <span class="at">Violent =</span> <span class="st">&quot;Rate&quot;</span>,</span>
-<span id="cb464-22"><a href="c13-ncvs-vignette.html#cb464-22" tabindex="-1"></a>    <span class="at">Violent_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
-<span id="cb464-23"><a href="c13-ncvs-vignette.html#cb464-23" tabindex="-1"></a>    <span class="at">AAST =</span> <span class="st">&quot;Rate&quot;</span>,</span>
-<span id="cb464-24"><a href="c13-ncvs-vignette.html#cb464-24" tabindex="-1"></a>    <span class="at">AAST_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
-<span id="cb464-25"><a href="c13-ncvs-vignette.html#cb464-25" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-26"><a href="c13-ncvs-vignette.html#cb464-26" tabindex="-1"></a>  <span class="fu">fmt_number</span>(</span>
-<span id="cb464-27"><a href="c13-ncvs-vignette.html#cb464-27" tabindex="-1"></a>    <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;Violent&quot;</span>, <span class="st">&quot;Violent_se&quot;</span>, <span class="st">&quot;AAST&quot;</span>, <span class="st">&quot;AAST_se&quot;</span>),</span>
-<span id="cb464-28"><a href="c13-ncvs-vignette.html#cb464-28" tabindex="-1"></a>    <span class="at">decimals =</span> <span class="dv">1</span></span>
-<span id="cb464-29"><a href="c13-ncvs-vignette.html#cb464-29" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-30"><a href="c13-ncvs-vignette.html#cb464-30" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
-<span id="cb464-31"><a href="c13-ncvs-vignette.html#cb464-31" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes rape or sexual assault, robbery,</span></span>
-<span id="cb464-32"><a href="c13-ncvs-vignette.html#cb464-32" tabindex="-1"></a><span class="st">    aggravated assault, and simple assault.&quot;</span>,</span>
-<span id="cb464-33"><a href="c13-ncvs-vignette.html#cb464-33" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_column_spanners</span>(<span class="at">spanners =</span> <span class="st">&quot;viol_span&quot;</span>)</span>
-<span id="cb464-34"><a href="c13-ncvs-vignette.html#cb464-34" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-35"><a href="c13-ncvs-vignette.html#cb464-35" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
-<span id="cb464-36"><a href="c13-ncvs-vignette.html#cb464-36" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Excludes persons of Hispanic origin&quot;</span>,</span>
-<span id="cb464-37"><a href="c13-ncvs-vignette.html#cb464-37" tabindex="-1"></a>    <span class="at">locations =</span></span>
-<span id="cb464-38"><a href="c13-ncvs-vignette.html#cb464-38" tabindex="-1"></a>      <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">%in%</span></span>
-<span id="cb464-39"><a href="c13-ncvs-vignette.html#cb464-39" tabindex="-1"></a>                   <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>))) <span class="sc">%&gt;%</span></span>
-<span id="cb464-40"><a href="c13-ncvs-vignette.html#cb464-40" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
-<span id="cb464-41"><a href="c13-ncvs-vignette.html#cb464-41" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes persons who identified as</span></span>
-<span id="cb464-42"><a href="c13-ncvs-vignette.html#cb464-42" tabindex="-1"></a><span class="st">    Native Hawaiian or Other Pacific Islander only.&quot;</span>,</span>
-<span id="cb464-43"><a href="c13-ncvs-vignette.html#cb464-43" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">==</span> NHOPI)</span>
-<span id="cb464-44"><a href="c13-ncvs-vignette.html#cb464-44" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-45"><a href="c13-ncvs-vignette.html#cb464-45" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
-<span id="cb464-46"><a href="c13-ncvs-vignette.html#cb464-46" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes persons who identified as American Indian or</span></span>
-<span id="cb464-47"><a href="c13-ncvs-vignette.html#cb464-47" tabindex="-1"></a><span class="st">    Alaska Native only or as two or more races.&quot;</span>,</span>
-<span id="cb464-48"><a href="c13-ncvs-vignette.html#cb464-48" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">==</span> <span class="st">&quot;Other&quot;</span>)</span>
-<span id="cb464-49"><a href="c13-ncvs-vignette.html#cb464-49" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb464-50"><a href="c13-ncvs-vignette.html#cb464-50" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(</span>
-<span id="cb464-51"><a href="c13-ncvs-vignette.html#cb464-51" tabindex="-1"></a>    <span class="at">source_note =</span> <span class="st">&quot;Note: Rates per 1,000 persons age 12 or older.&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb464-52"><a href="c13-ncvs-vignette.html#cb464-52" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="at">source_note =</span> <span class="st">&quot;Source: Bureau of Justice Statistics,</span></span>
-<span id="cb464-53"><a href="c13-ncvs-vignette.html#cb464-53" tabindex="-1"></a><span class="st">                  National Crime Victimization Survey, 2021.&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb464-54"><a href="c13-ncvs-vignette.html#cb464-54" tabindex="-1"></a>  <span class="fu">tab_stubhead</span>(<span class="at">label =</span> <span class="st">&quot;Victim demographic&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb464-55"><a href="c13-ncvs-vignette.html#cb464-55" tabindex="-1"></a>  <span class="fu">tab_caption</span>(<span class="st">&quot;Rate and standard error of violent victimization,</span></span>
-<span id="cb464-56"><a href="c13-ncvs-vignette.html#cb464-56" tabindex="-1"></a><span class="st">             by type of crime and demographic characteristics, 2021&quot;</span>)</span></code></pre></div>
-<div class="sourceCode" id="cb465"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb465-1"><a href="c13-ncvs-vignette.html#cb465-1" tabindex="-1"></a>vr_gt</span></code></pre></div>
-
-<div id="jslvphoojc" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#jslvphoojc table {
+<div class="sourceCode" id="cb458"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb458-1"><a href="c13-ncvs-vignette.html#cb458-1" tabindex="-1"></a>pers_est_by <span class="ot">&lt;-</span> <span class="cf">function</span>(byvar) {</span>
+<span id="cb458-2"><a href="c13-ncvs-vignette.html#cb458-2" tabindex="-1"></a>  pers_des <span class="sc">%&gt;%</span></span>
+<span id="cb458-3"><a href="c13-ncvs-vignette.html#cb458-3" tabindex="-1"></a>    <span class="fu">rename</span>(<span class="at">Level :=</span> {{byvar}}) <span class="sc">%&gt;%</span></span>
+<span id="cb458-4"><a href="c13-ncvs-vignette.html#cb458-4" tabindex="-1"></a>    <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Level)) <span class="sc">%&gt;%</span></span>
+<span id="cb458-5"><a href="c13-ncvs-vignette.html#cb458-5" tabindex="-1"></a>    <span class="fu">group_by</span>(Level) <span class="sc">%&gt;%</span></span>
+<span id="cb458-6"><a href="c13-ncvs-vignette.html#cb458-6" tabindex="-1"></a>    <span class="fu">summarize</span>(</span>
+<span id="cb458-7"><a href="c13-ncvs-vignette.html#cb458-7" tabindex="-1"></a>      <span class="at">Violent =</span> <span class="fu">survey_mean</span>(Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb458-8"><a href="c13-ncvs-vignette.html#cb458-8" tabindex="-1"></a>      <span class="at">AAST =</span> <span class="fu">survey_mean</span>(AAST <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb458-9"><a href="c13-ncvs-vignette.html#cb458-9" tabindex="-1"></a>    ) <span class="sc">%&gt;%</span></span>
+<span id="cb458-10"><a href="c13-ncvs-vignette.html#cb458-10" tabindex="-1"></a>    <span class="fu">mutate</span>(</span>
+<span id="cb458-11"><a href="c13-ncvs-vignette.html#cb458-11" tabindex="-1"></a>      <span class="at">Variable =</span> byvar,</span>
+<span id="cb458-12"><a href="c13-ncvs-vignette.html#cb458-12" tabindex="-1"></a>      <span class="at">LevelNum =</span> <span class="fu">as.numeric</span>(Level),</span>
+<span id="cb458-13"><a href="c13-ncvs-vignette.html#cb458-13" tabindex="-1"></a>      <span class="at">Level =</span> <span class="fu">as.character</span>(Level)</span>
+<span id="cb458-14"><a href="c13-ncvs-vignette.html#cb458-14" tabindex="-1"></a>    ) <span class="sc">%&gt;%</span></span>
+<span id="cb458-15"><a href="c13-ncvs-vignette.html#cb458-15" tabindex="-1"></a>    <span class="fu">select</span>(Variable, Level, LevelNum, <span class="fu">everything</span>())</span>
+<span id="cb458-16"><a href="c13-ncvs-vignette.html#cb458-16" tabindex="-1"></a>}</span>
+<span id="cb458-17"><a href="c13-ncvs-vignette.html#cb458-17" tabindex="-1"></a></span>
+<span id="cb458-18"><a href="c13-ncvs-vignette.html#cb458-18" tabindex="-1"></a>pers_est_df <span class="ot">&lt;-</span></span>
+<span id="cb458-19"><a href="c13-ncvs-vignette.html#cb458-19" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;Sex&quot;</span>, <span class="st">&quot;RaceHispOrigin&quot;</span>, <span class="st">&quot;AgeGroup&quot;</span>, <span class="st">&quot;MaritalStatus&quot;</span>, <span class="st">&quot;Income&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb458-20"><a href="c13-ncvs-vignette.html#cb458-20" tabindex="-1"></a>  <span class="fu">map</span>(pers_est_by) <span class="sc">%&gt;%</span></span>
+<span id="cb458-21"><a href="c13-ncvs-vignette.html#cb458-21" tabindex="-1"></a>  <span class="fu">bind_rows</span>()</span></code></pre></div>
+<p>The output from all the estimates is cleaned to create better labels, such as going from “RaceHispOrigin” to “Race/Hispanic origin”. Finally, the {gt} package is used to make a publishable table (Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-rates-demo-tab">13.8</a>). Using the functions from the {gt} package, we add column labels and footnotes and present estimates rounded to the first decimal place <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>)</span>.</p>
+<div class="sourceCode" id="cb459"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb459-1"><a href="c13-ncvs-vignette.html#cb459-1" tabindex="-1"></a>vr_gt<span class="ot">&lt;-</span>pers_est_df <span class="sc">%&gt;%</span></span>
+<span id="cb459-2"><a href="c13-ncvs-vignette.html#cb459-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb459-3"><a href="c13-ncvs-vignette.html#cb459-3" tabindex="-1"></a>    <span class="at">Variable =</span> <span class="fu">case_when</span>(</span>
+<span id="cb459-4"><a href="c13-ncvs-vignette.html#cb459-4" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;RaceHispOrigin&quot;</span> <span class="sc">~</span> <span class="st">&quot;Race/Hispanic origin&quot;</span>,</span>
+<span id="cb459-5"><a href="c13-ncvs-vignette.html#cb459-5" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;MaritalStatus&quot;</span> <span class="sc">~</span> <span class="st">&quot;Marital status&quot;</span>,</span>
+<span id="cb459-6"><a href="c13-ncvs-vignette.html#cb459-6" tabindex="-1"></a>      Variable <span class="sc">==</span> <span class="st">&quot;AgeGroup&quot;</span> <span class="sc">~</span> <span class="st">&quot;Age&quot;</span>,</span>
+<span id="cb459-7"><a href="c13-ncvs-vignette.html#cb459-7" tabindex="-1"></a>      <span class="cn">TRUE</span> <span class="sc">~</span> Variable</span>
+<span id="cb459-8"><a href="c13-ncvs-vignette.html#cb459-8" tabindex="-1"></a>    )</span>
+<span id="cb459-9"><a href="c13-ncvs-vignette.html#cb459-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-10"><a href="c13-ncvs-vignette.html#cb459-10" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span>LevelNum) <span class="sc">%&gt;%</span></span>
+<span id="cb459-11"><a href="c13-ncvs-vignette.html#cb459-11" tabindex="-1"></a>  <span class="fu">group_by</span>(Variable) <span class="sc">%&gt;%</span></span>
+<span id="cb459-12"><a href="c13-ncvs-vignette.html#cb459-12" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Level&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb459-13"><a href="c13-ncvs-vignette.html#cb459-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb459-14"><a href="c13-ncvs-vignette.html#cb459-14" tabindex="-1"></a>    <span class="at">label =</span> <span class="st">&quot;Violent crime&quot;</span>,</span>
+<span id="cb459-15"><a href="c13-ncvs-vignette.html#cb459-15" tabindex="-1"></a>    <span class="at">id =</span> <span class="st">&quot;viol_span&quot;</span>,</span>
+<span id="cb459-16"><a href="c13-ncvs-vignette.html#cb459-16" tabindex="-1"></a>    <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;Violent&quot;</span>, <span class="st">&quot;Violent_se&quot;</span>)</span>
+<span id="cb459-17"><a href="c13-ncvs-vignette.html#cb459-17" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-18"><a href="c13-ncvs-vignette.html#cb459-18" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Aggravated assault&quot;</span>,</span>
+<span id="cb459-19"><a href="c13-ncvs-vignette.html#cb459-19" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;AAST&quot;</span>, <span class="st">&quot;AAST_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb459-20"><a href="c13-ncvs-vignette.html#cb459-20" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb459-21"><a href="c13-ncvs-vignette.html#cb459-21" tabindex="-1"></a>    <span class="at">Violent =</span> <span class="st">&quot;Rate&quot;</span>,</span>
+<span id="cb459-22"><a href="c13-ncvs-vignette.html#cb459-22" tabindex="-1"></a>    <span class="at">Violent_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
+<span id="cb459-23"><a href="c13-ncvs-vignette.html#cb459-23" tabindex="-1"></a>    <span class="at">AAST =</span> <span class="st">&quot;Rate&quot;</span>,</span>
+<span id="cb459-24"><a href="c13-ncvs-vignette.html#cb459-24" tabindex="-1"></a>    <span class="at">AAST_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
+<span id="cb459-25"><a href="c13-ncvs-vignette.html#cb459-25" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-26"><a href="c13-ncvs-vignette.html#cb459-26" tabindex="-1"></a>  <span class="fu">fmt_number</span>(</span>
+<span id="cb459-27"><a href="c13-ncvs-vignette.html#cb459-27" tabindex="-1"></a>    <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;Violent&quot;</span>, <span class="st">&quot;Violent_se&quot;</span>, <span class="st">&quot;AAST&quot;</span>, <span class="st">&quot;AAST_se&quot;</span>),</span>
+<span id="cb459-28"><a href="c13-ncvs-vignette.html#cb459-28" tabindex="-1"></a>    <span class="at">decimals =</span> <span class="dv">1</span></span>
+<span id="cb459-29"><a href="c13-ncvs-vignette.html#cb459-29" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-30"><a href="c13-ncvs-vignette.html#cb459-30" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
+<span id="cb459-31"><a href="c13-ncvs-vignette.html#cb459-31" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes rape or sexual assault, robbery,</span></span>
+<span id="cb459-32"><a href="c13-ncvs-vignette.html#cb459-32" tabindex="-1"></a><span class="st">    aggravated assault, and simple assault.&quot;</span>,</span>
+<span id="cb459-33"><a href="c13-ncvs-vignette.html#cb459-33" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_column_spanners</span>(<span class="at">spanners =</span> <span class="st">&quot;viol_span&quot;</span>)</span>
+<span id="cb459-34"><a href="c13-ncvs-vignette.html#cb459-34" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-35"><a href="c13-ncvs-vignette.html#cb459-35" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
+<span id="cb459-36"><a href="c13-ncvs-vignette.html#cb459-36" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Excludes persons of Hispanic origin&quot;</span>,</span>
+<span id="cb459-37"><a href="c13-ncvs-vignette.html#cb459-37" tabindex="-1"></a>    <span class="at">locations =</span></span>
+<span id="cb459-38"><a href="c13-ncvs-vignette.html#cb459-38" tabindex="-1"></a>      <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">%in%</span></span>
+<span id="cb459-39"><a href="c13-ncvs-vignette.html#cb459-39" tabindex="-1"></a>                   <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>))) <span class="sc">%&gt;%</span></span>
+<span id="cb459-40"><a href="c13-ncvs-vignette.html#cb459-40" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
+<span id="cb459-41"><a href="c13-ncvs-vignette.html#cb459-41" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes persons who identified as</span></span>
+<span id="cb459-42"><a href="c13-ncvs-vignette.html#cb459-42" tabindex="-1"></a><span class="st">    Native Hawaiian or Other Pacific Islander only.&quot;</span>,</span>
+<span id="cb459-43"><a href="c13-ncvs-vignette.html#cb459-43" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">==</span> NHOPI)</span>
+<span id="cb459-44"><a href="c13-ncvs-vignette.html#cb459-44" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-45"><a href="c13-ncvs-vignette.html#cb459-45" tabindex="-1"></a>  <span class="fu">tab_footnote</span>(</span>
+<span id="cb459-46"><a href="c13-ncvs-vignette.html#cb459-46" tabindex="-1"></a>    <span class="at">footnote =</span> <span class="st">&quot;Includes persons who identified as American Indian or</span></span>
+<span id="cb459-47"><a href="c13-ncvs-vignette.html#cb459-47" tabindex="-1"></a><span class="st">    Alaska Native only or as two or more races.&quot;</span>,</span>
+<span id="cb459-48"><a href="c13-ncvs-vignette.html#cb459-48" tabindex="-1"></a>    <span class="at">locations =</span> <span class="fu">cells_stub</span>(<span class="at">rows =</span> Level <span class="sc">==</span> <span class="st">&quot;Other&quot;</span>)</span>
+<span id="cb459-49"><a href="c13-ncvs-vignette.html#cb459-49" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb459-50"><a href="c13-ncvs-vignette.html#cb459-50" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(</span>
+<span id="cb459-51"><a href="c13-ncvs-vignette.html#cb459-51" tabindex="-1"></a>    <span class="at">source_note =</span> <span class="st">&quot;Note: Rates per 1,000 persons age 12 or older.&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb459-52"><a href="c13-ncvs-vignette.html#cb459-52" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="at">source_note =</span> <span class="st">&quot;Source: Bureau of Justice Statistics,</span></span>
+<span id="cb459-53"><a href="c13-ncvs-vignette.html#cb459-53" tabindex="-1"></a><span class="st">                  National Crime Victimization Survey, 2021.&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb459-54"><a href="c13-ncvs-vignette.html#cb459-54" tabindex="-1"></a>  <span class="fu">tab_stubhead</span>(<span class="at">label =</span> <span class="st">&quot;Victim demographic&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb459-55"><a href="c13-ncvs-vignette.html#cb459-55" tabindex="-1"></a>  <span class="fu">tab_caption</span>(<span class="st">&quot;Rate and standard error of violent victimization,</span></span>
+<span id="cb459-56"><a href="c13-ncvs-vignette.html#cb459-56" tabindex="-1"></a><span class="st">             by type of crime and demographic characteristics, 2021&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb460"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb460-1"><a href="c13-ncvs-vignette.html#cb460-1" tabindex="-1"></a>vr_gt</span></code></pre></div>
+
+<div id="zpnruhcqur" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#zpnruhcqur table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#jslvphoojc thead, #jslvphoojc tbody, #jslvphoojc tfoot, #jslvphoojc tr, #jslvphoojc td, #jslvphoojc th {
+#zpnruhcqur thead, #zpnruhcqur tbody, #zpnruhcqur tfoot, #zpnruhcqur tr, #zpnruhcqur td, #zpnruhcqur th {
   border-style: none;
 }
 
-#jslvphoojc p {
+#zpnruhcqur p {
   margin: 0;
   padding: 0;
 }
 
-#jslvphoojc .gt_table {
+#zpnruhcqur .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2186,12 +3603,12 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-left-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_caption {
+#zpnruhcqur .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#jslvphoojc .gt_title {
+#zpnruhcqur .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2203,7 +3620,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-bottom-width: 0;
 }
 
-#jslvphoojc .gt_subtitle {
+#zpnruhcqur .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2215,7 +3632,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-top-width: 0;
 }
 
-#jslvphoojc .gt_heading {
+#zpnruhcqur .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2227,13 +3644,13 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-right-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_bottom_border {
+#zpnruhcqur .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_col_headings {
+#zpnruhcqur .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2248,7 +3665,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-right-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_col_heading {
+#zpnruhcqur .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2268,7 +3685,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   overflow-x: hidden;
 }
 
-#jslvphoojc .gt_column_spanner_outer {
+#zpnruhcqur .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2280,15 +3697,15 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 4px;
 }
 
-#jslvphoojc .gt_column_spanner_outer:first-child {
+#zpnruhcqur .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#jslvphoojc .gt_column_spanner_outer:last-child {
+#zpnruhcqur .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#jslvphoojc .gt_column_spanner {
+#zpnruhcqur .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2300,11 +3717,11 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   width: 100%;
 }
 
-#jslvphoojc .gt_spanner_row {
+#zpnruhcqur .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#jslvphoojc .gt_group_heading {
+#zpnruhcqur .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2330,7 +3747,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   text-align: left;
 }
 
-#jslvphoojc .gt_empty_group_heading {
+#zpnruhcqur .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2345,15 +3762,15 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   vertical-align: middle;
 }
 
-#jslvphoojc .gt_from_md > :first-child {
+#zpnruhcqur .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#jslvphoojc .gt_from_md > :last-child {
+#zpnruhcqur .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#jslvphoojc .gt_row {
+#zpnruhcqur .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2372,7 +3789,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   overflow-x: hidden;
 }
 
-#jslvphoojc .gt_stub {
+#zpnruhcqur .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2385,7 +3802,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 5px;
 }
 
-#jslvphoojc .gt_stub_row_group {
+#zpnruhcqur .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2399,15 +3816,15 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   vertical-align: top;
 }
 
-#jslvphoojc .gt_row_group_first td {
+#zpnruhcqur .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#jslvphoojc .gt_row_group_first th {
+#zpnruhcqur .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#jslvphoojc .gt_summary_row {
+#zpnruhcqur .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2417,16 +3834,16 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 5px;
 }
 
-#jslvphoojc .gt_first_summary_row {
+#zpnruhcqur .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_first_summary_row.thick {
+#zpnruhcqur .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#jslvphoojc .gt_last_summary_row {
+#zpnruhcqur .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2436,7 +3853,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-bottom-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_grand_summary_row {
+#zpnruhcqur .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2446,7 +3863,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 5px;
 }
 
-#jslvphoojc .gt_first_grand_summary_row {
+#zpnruhcqur .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2456,7 +3873,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-top-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_last_grand_summary_row_top {
+#zpnruhcqur .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2466,11 +3883,11 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-bottom-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_striped {
+#zpnruhcqur .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#jslvphoojc .gt_table_body {
+#zpnruhcqur .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2479,7 +3896,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-bottom-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_footnotes {
+#zpnruhcqur .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2493,7 +3910,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-right-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_footnote {
+#zpnruhcqur .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2502,7 +3919,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 5px;
 }
 
-#jslvphoojc .gt_sourcenotes {
+#zpnruhcqur .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2516,7 +3933,7 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   border-right-color: #D3D3D3;
 }
 
-#jslvphoojc .gt_sourcenote {
+#zpnruhcqur .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2524,68 +3941,68 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
   padding-right: 5px;
 }
 
-#jslvphoojc .gt_left {
+#zpnruhcqur .gt_left {
   text-align: left;
 }
 
-#jslvphoojc .gt_center {
+#zpnruhcqur .gt_center {
   text-align: center;
 }
 
-#jslvphoojc .gt_right {
+#zpnruhcqur .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#jslvphoojc .gt_font_normal {
+#zpnruhcqur .gt_font_normal {
   font-weight: normal;
 }
 
-#jslvphoojc .gt_font_bold {
+#zpnruhcqur .gt_font_bold {
   font-weight: bold;
 }
 
-#jslvphoojc .gt_font_italic {
+#zpnruhcqur .gt_font_italic {
   font-style: italic;
 }
 
-#jslvphoojc .gt_super {
+#zpnruhcqur .gt_super {
   font-size: 65%;
 }
 
-#jslvphoojc .gt_footnote_marks {
+#zpnruhcqur .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#jslvphoojc .gt_asterisk {
+#zpnruhcqur .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#jslvphoojc .gt_indent_1 {
+#zpnruhcqur .gt_indent_1 {
   text-indent: 5px;
 }
 
-#jslvphoojc .gt_indent_2 {
+#zpnruhcqur .gt_indent_2 {
   text-indent: 10px;
 }
 
-#jslvphoojc .gt_indent_3 {
+#zpnruhcqur .gt_indent_3 {
   text-indent: 15px;
 }
 
-#jslvphoojc .gt_indent_4 {
+#zpnruhcqur .gt_indent_4 {
   text-indent: 20px;
 }
 
-#jslvphoojc .gt_indent_5 {
+#zpnruhcqur .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:ncvs-vign-rates-demo-tab">TABLE 13.5: </span>Rate and standard error of violent victimization, by type of crime and demographic characteristics, 2021</caption>
+  <caption><span id="tab:ncvs-vign-rates-demo-tab">TABLE 13.8: </span>Rate and standard error of violent victimization, by type of crime and demographic characteristics, 2021</caption>
   <thead>
     
     <tr class="gt_col_headings gt_spanner_row">
@@ -2772,25 +4189,25 @@ <h3><span class="header-section-number">13.6.3</span> Estimation 3: Victimizatio
 </div>
 <div id="prev-rate" class="section level3 hasAnchor" number="13.6.4">
 <h3><span class="header-section-number">13.6.4</span> Estimation 4: Prevalence rates<a href="c13-ncvs-vignette.html#prev-rate" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>Prevalence rates differ from victimization rates as the numerator is the number of people or households victimized rather than the number of victimizations. To calculate the prevalence rates, we must run another summary of the data by calculating an indicator for whether a person or household is a victim of a particular crime at any point in the year. Below is an example of calculating first the indicator and then the prevalence rate of violent crime and aggravated assault.</p>
-<div class="sourceCode" id="cb466"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb466-1"><a href="c13-ncvs-vignette.html#cb466-1" tabindex="-1"></a>pers_prev_des <span class="ot">&lt;-</span></span>
-<span id="cb466-2"><a href="c13-ncvs-vignette.html#cb466-2" tabindex="-1"></a>  pers_vsum_slim <span class="sc">%&gt;%</span></span>
-<span id="cb466-3"><a href="c13-ncvs-vignette.html#cb466-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Year =</span> <span class="fu">floor</span>(YEARQ)) <span class="sc">%&gt;%</span></span>
-<span id="cb466-4"><a href="c13-ncvs-vignette.html#cb466-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Violent_Ind =</span> <span class="fu">sum</span>(Violent) <span class="sc">&gt;</span> <span class="dv">0</span>,</span>
-<span id="cb466-5"><a href="c13-ncvs-vignette.html#cb466-5" tabindex="-1"></a>         <span class="at">AAST_Ind =</span> <span class="fu">sum</span>(AAST) <span class="sc">&gt;</span> <span class="dv">0</span>,</span>
-<span id="cb466-6"><a href="c13-ncvs-vignette.html#cb466-6" tabindex="-1"></a>         <span class="at">.by =</span> <span class="fu">c</span>(<span class="st">&quot;Year&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb466-7"><a href="c13-ncvs-vignette.html#cb466-7" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
-<span id="cb466-8"><a href="c13-ncvs-vignette.html#cb466-8" tabindex="-1"></a>    <span class="at">weight =</span> WGTPERCY,</span>
-<span id="cb466-9"><a href="c13-ncvs-vignette.html#cb466-9" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
-<span id="cb466-10"><a href="c13-ncvs-vignette.html#cb466-10" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
-<span id="cb466-11"><a href="c13-ncvs-vignette.html#cb466-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb466-12"><a href="c13-ncvs-vignette.html#cb466-12" tabindex="-1"></a>  )</span>
-<span id="cb466-13"><a href="c13-ncvs-vignette.html#cb466-13" tabindex="-1"></a></span>
-<span id="cb466-14"><a href="c13-ncvs-vignette.html#cb466-14" tabindex="-1"></a>pers_prev_ests <span class="ot">&lt;-</span> pers_prev_des <span class="sc">%&gt;%</span></span>
-<span id="cb466-15"><a href="c13-ncvs-vignette.html#cb466-15" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Violent_Prev =</span> <span class="fu">survey_mean</span>(Violent_Ind <span class="sc">*</span> <span class="dv">100</span>),</span>
-<span id="cb466-16"><a href="c13-ncvs-vignette.html#cb466-16" tabindex="-1"></a>            <span class="at">AAST_Prev =</span> <span class="fu">survey_mean</span>(AAST_Ind <span class="sc">*</span> <span class="dv">100</span>))</span>
-<span id="cb466-17"><a href="c13-ncvs-vignette.html#cb466-17" tabindex="-1"></a></span>
-<span id="cb466-18"><a href="c13-ncvs-vignette.html#cb466-18" tabindex="-1"></a>pers_prev_ests</span></code></pre></div>
+<p>Prevalence rates differ from victimization rates in that the numerator is the number of people or households victimized rather than the number of victimizations. To calculate prevalence rates, we must summarize the data again, creating an indicator for whether a person or household was a victim of a particular crime at any point in the year. Below is an example that first calculates the indicator and then the prevalence rate of violent crime and aggravated assault.</p>
+<div class="sourceCode" id="cb461"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb461-1"><a href="c13-ncvs-vignette.html#cb461-1" tabindex="-1"></a>pers_prev_des <span class="ot">&lt;-</span></span>
+<span id="cb461-2"><a href="c13-ncvs-vignette.html#cb461-2" tabindex="-1"></a>  pers_vsum_slim <span class="sc">%&gt;%</span></span>
+<span id="cb461-3"><a href="c13-ncvs-vignette.html#cb461-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Year =</span> <span class="fu">floor</span>(YEARQ)) <span class="sc">%&gt;%</span></span>
+<span id="cb461-4"><a href="c13-ncvs-vignette.html#cb461-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Violent_Ind =</span> <span class="fu">sum</span>(Violent) <span class="sc">&gt;</span> <span class="dv">0</span>,</span>
+<span id="cb461-5"><a href="c13-ncvs-vignette.html#cb461-5" tabindex="-1"></a>         <span class="at">AAST_Ind =</span> <span class="fu">sum</span>(AAST) <span class="sc">&gt;</span> <span class="dv">0</span>,</span>
+<span id="cb461-6"><a href="c13-ncvs-vignette.html#cb461-6" tabindex="-1"></a>         <span class="at">.by =</span> <span class="fu">c</span>(<span class="st">&quot;Year&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb461-7"><a href="c13-ncvs-vignette.html#cb461-7" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
+<span id="cb461-8"><a href="c13-ncvs-vignette.html#cb461-8" tabindex="-1"></a>    <span class="at">weight =</span> WGTPERCY,</span>
+<span id="cb461-9"><a href="c13-ncvs-vignette.html#cb461-9" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
+<span id="cb461-10"><a href="c13-ncvs-vignette.html#cb461-10" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
+<span id="cb461-11"><a href="c13-ncvs-vignette.html#cb461-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb461-12"><a href="c13-ncvs-vignette.html#cb461-12" tabindex="-1"></a>  )</span>
+<span id="cb461-13"><a href="c13-ncvs-vignette.html#cb461-13" tabindex="-1"></a></span>
+<span id="cb461-14"><a href="c13-ncvs-vignette.html#cb461-14" tabindex="-1"></a>pers_prev_ests <span class="ot">&lt;-</span> pers_prev_des <span class="sc">%&gt;%</span></span>
+<span id="cb461-15"><a href="c13-ncvs-vignette.html#cb461-15" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Violent_Prev =</span> <span class="fu">survey_mean</span>(Violent_Ind <span class="sc">*</span> <span class="dv">100</span>),</span>
+<span id="cb461-16"><a href="c13-ncvs-vignette.html#cb461-16" tabindex="-1"></a>            <span class="at">AAST_Prev =</span> <span class="fu">survey_mean</span>(AAST_Ind <span class="sc">*</span> <span class="dv">100</span>))</span>
+<span id="cb461-17"><a href="c13-ncvs-vignette.html#cb461-17" tabindex="-1"></a></span>
+<span id="cb461-18"><a href="c13-ncvs-vignette.html#cb461-18" tabindex="-1"></a>pers_prev_ests</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   Violent_Prev Violent_Prev_se AAST_Prev AAST_Prev_se
 ##          &lt;dbl&gt;           &lt;dbl&gt;     &lt;dbl&gt;        &lt;dbl&gt;
@@ -2801,60 +4218,53 @@ <h3><span class="header-section-number">13.6.4</span> Estimation 4: Prevalence r
 <div id="statistical-testing" class="section level2 hasAnchor" number="13.7">
 <h2><span class="header-section-number">13.7</span> Statistical testing<a href="c13-ncvs-vignette.html#statistical-testing" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>For any of the types of estimates discussed, we can also perform statistical testing. For example, we could test whether property victimization rates are different between properties that are owned versus rented. First, we calculate the point estimates.</p>
-<div class="sourceCode" id="cb468"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb468-1"><a href="c13-ncvs-vignette.html#cb468-1" tabindex="-1"></a>prop_tenure <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
-<span id="cb468-2"><a href="c13-ncvs-vignette.html#cb468-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Tenure) <span class="sc">%&gt;%</span></span>
-<span id="cb468-3"><a href="c13-ncvs-vignette.html#cb468-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb468-4"><a href="c13-ncvs-vignette.html#cb468-4" tabindex="-1"></a>    <span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>,</span>
-<span id="cb468-5"><a href="c13-ncvs-vignette.html#cb468-5" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">vartype=</span><span class="st">&quot;ci&quot;</span>),</span>
-<span id="cb468-6"><a href="c13-ncvs-vignette.html#cb468-6" tabindex="-1"></a>  )</span>
-<span id="cb468-7"><a href="c13-ncvs-vignette.html#cb468-7" tabindex="-1"></a></span>
-<span id="cb468-8"><a href="c13-ncvs-vignette.html#cb468-8" tabindex="-1"></a>prop_tenure  </span></code></pre></div>
+<div class="sourceCode" id="cb463"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb463-1"><a href="c13-ncvs-vignette.html#cb463-1" tabindex="-1"></a>prop_tenure <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
+<span id="cb463-2"><a href="c13-ncvs-vignette.html#cb463-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Tenure) <span class="sc">%&gt;%</span></span>
+<span id="cb463-3"><a href="c13-ncvs-vignette.html#cb463-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb463-4"><a href="c13-ncvs-vignette.html#cb463-4" tabindex="-1"></a>    <span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>,</span>
+<span id="cb463-5"><a href="c13-ncvs-vignette.html#cb463-5" tabindex="-1"></a>                                <span class="at">na.rm =</span> <span class="cn">TRUE</span>, <span class="at">vartype =</span> <span class="st">&quot;ci&quot;</span>),</span>
+<span id="cb463-6"><a href="c13-ncvs-vignette.html#cb463-6" tabindex="-1"></a>  )</span>
+<span id="cb463-7"><a href="c13-ncvs-vignette.html#cb463-7" tabindex="-1"></a></span>
+<span id="cb463-8"><a href="c13-ncvs-vignette.html#cb463-8" tabindex="-1"></a>prop_tenure</span></code></pre></div>
 <pre><code>## # A tibble: 3 × 4
 ##   Tenure Property_Rate Property_Rate_low Property_Rate_upp
 ##   &lt;fct&gt;          &lt;dbl&gt;             &lt;dbl&gt;             &lt;dbl&gt;
 ## 1 Owned           68.2              64.3              72.1
 ## 2 Rented         130.              123.              137. 
 ## 3 &lt;NA&gt;           NaN               NaN               NaN</code></pre>
-<p>The property victimization rate for rented households is 129.8 per 1,000 households while the property victimization rate for owned households is 68.2, which seem very different especially given the non-overlapping confidence intervals. However, survey data is inheriently non-independent so statistical testing cannot be done by comparing confidence intervals. To conduct the statistical test, we first need to create a variable that we will compare which incorporates the adjusted incident weight (<code>ADJINC_WT</code>) and then the test can be conducted as discussed in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>.</p>
-<div class="sourceCode" id="cb470"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb470-1"><a href="c13-ncvs-vignette.html#cb470-1" tabindex="-1"></a>prop_tenure_test <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
-<span id="cb470-2"><a href="c13-ncvs-vignette.html#cb470-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb470-3"><a href="c13-ncvs-vignette.html#cb470-3" tabindex="-1"></a>    <span class="at">Prop_Adj=</span>Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span></span>
-<span id="cb470-4"><a href="c13-ncvs-vignette.html#cb470-4" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb470-5"><a href="c13-ncvs-vignette.html#cb470-5" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
-<span id="cb470-6"><a href="c13-ncvs-vignette.html#cb470-6" tabindex="-1"></a>    <span class="at">formula =</span> Prop_Adj <span class="sc">~</span> Tenure,</span>
-<span id="cb470-7"><a href="c13-ncvs-vignette.html#cb470-7" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb470-8"><a href="c13-ncvs-vignette.html#cb470-8" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb470-9"><a href="c13-ncvs-vignette.html#cb470-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb470-10"><a href="c13-ncvs-vignette.html#cb470-10" tabindex="-1"></a>  broom<span class="sc">::</span><span class="fu">tidy</span>()</span>
-<span id="cb470-11"><a href="c13-ncvs-vignette.html#cb470-11" tabindex="-1"></a></span>
-<span id="cb470-12"><a href="c13-ncvs-vignette.html#cb470-12" tabindex="-1"></a>prop_tenure_test</span></code></pre></div>
-<pre><code>## # A tibble: 1 × 8
-##   estimate statistic  p.value parameter conf.low conf.high method       
-##      &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;        
-## 1     61.6      16.0 8.91e-36       169     54.0      69.2 Design-based…
-## # ℹ 1 more variable: alternative &lt;chr&gt;</code></pre>
-<div class="sourceCode" id="cb472"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb472-1"><a href="c13-ncvs-vignette.html#cb472-1" tabindex="-1"></a>prop_tenure_test <span class="sc">%&gt;%</span></span>
-<span id="cb472-2"><a href="c13-ncvs-vignette.html#cb472-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value =</span> <span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
-<span id="cb472-3"><a href="c13-ncvs-vignette.html#cb472-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb472-4"><a href="c13-ncvs-vignette.html#cb472-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
-
-<div id="uphlolqabb" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#uphlolqabb table {
+<p>The property victimization rate for rented households is 129.8 per 1,000 households, while the property victimization rate for owned households is 68.2, which seem very different, especially given the non-overlapping confidence intervals. However, survey data are inherently non-independent, so statistical testing cannot be done by comparing confidence intervals. To conduct the statistical test, we first need to create a variable that incorporates the adjusted incident weight (<code>ADJINC_WT</code>), and then the test can be conducted on this adjusted variable as discussed in Chapter <a href="c06-statistical-testing.html#c06-statistical-testing">6</a>.</p>
+<div class="sourceCode" id="cb465"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb465-1"><a href="c13-ncvs-vignette.html#cb465-1" tabindex="-1"></a>prop_tenure_test <span class="ot">&lt;-</span> hh_des <span class="sc">%&gt;%</span></span>
+<span id="cb465-2"><a href="c13-ncvs-vignette.html#cb465-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb465-3"><a href="c13-ncvs-vignette.html#cb465-3" tabindex="-1"></a>    <span class="at">Prop_Adj=</span>Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span></span>
+<span id="cb465-4"><a href="c13-ncvs-vignette.html#cb465-4" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb465-5"><a href="c13-ncvs-vignette.html#cb465-5" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
+<span id="cb465-6"><a href="c13-ncvs-vignette.html#cb465-6" tabindex="-1"></a>    <span class="at">formula =</span> Prop_Adj <span class="sc">~</span> Tenure,</span>
+<span id="cb465-7"><a href="c13-ncvs-vignette.html#cb465-7" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb465-8"><a href="c13-ncvs-vignette.html#cb465-8" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb465-9"><a href="c13-ncvs-vignette.html#cb465-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb465-10"><a href="c13-ncvs-vignette.html#cb465-10" tabindex="-1"></a>  broom<span class="sc">::</span><span class="fu">tidy</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb466"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb466-1"><a href="c13-ncvs-vignette.html#cb466-1" tabindex="-1"></a>prop_tenure_test <span class="sc">%&gt;%</span></span>
+<span id="cb466-2"><a href="c13-ncvs-vignette.html#cb466-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value =</span> <span class="fu">pretty_p_value</span>(p.value)) <span class="sc">%&gt;%</span></span>
+<span id="cb466-3"><a href="c13-ncvs-vignette.html#cb466-3" tabindex="-1"></a>  <span class="fu">gt</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb466-4"><a href="c13-ncvs-vignette.html#cb466-4" tabindex="-1"></a>  <span class="fu">fmt_number</span>()</span></code></pre></div>
+
+<div id="sgxskozkog" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#sgxskozkog table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#uphlolqabb thead, #uphlolqabb tbody, #uphlolqabb tfoot, #uphlolqabb tr, #uphlolqabb td, #uphlolqabb th {
+#sgxskozkog thead, #sgxskozkog tbody, #sgxskozkog tfoot, #sgxskozkog tr, #sgxskozkog td, #sgxskozkog th {
   border-style: none;
 }
 
-#uphlolqabb p {
+#sgxskozkog p {
   margin: 0;
   padding: 0;
 }
 
-#uphlolqabb .gt_table {
+#sgxskozkog .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2880,12 +4290,12 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-left-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_caption {
+#sgxskozkog .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#uphlolqabb .gt_title {
+#sgxskozkog .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2897,7 +4307,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-bottom-width: 0;
 }
 
-#uphlolqabb .gt_subtitle {
+#sgxskozkog .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2909,7 +4319,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-top-width: 0;
 }
 
-#uphlolqabb .gt_heading {
+#sgxskozkog .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2921,13 +4331,13 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-right-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_bottom_border {
+#sgxskozkog .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_col_headings {
+#sgxskozkog .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2942,7 +4352,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-right-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_col_heading {
+#sgxskozkog .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2962,7 +4372,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   overflow-x: hidden;
 }
 
-#uphlolqabb .gt_column_spanner_outer {
+#sgxskozkog .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2974,15 +4384,15 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 4px;
 }
 
-#uphlolqabb .gt_column_spanner_outer:first-child {
+#sgxskozkog .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#uphlolqabb .gt_column_spanner_outer:last-child {
+#sgxskozkog .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#uphlolqabb .gt_column_spanner {
+#sgxskozkog .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2994,11 +4404,11 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   width: 100%;
 }
 
-#uphlolqabb .gt_spanner_row {
+#sgxskozkog .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#uphlolqabb .gt_group_heading {
+#sgxskozkog .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3024,7 +4434,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   text-align: left;
 }
 
-#uphlolqabb .gt_empty_group_heading {
+#sgxskozkog .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -3039,15 +4449,15 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   vertical-align: middle;
 }
 
-#uphlolqabb .gt_from_md > :first-child {
+#sgxskozkog .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#uphlolqabb .gt_from_md > :last-child {
+#sgxskozkog .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#uphlolqabb .gt_row {
+#sgxskozkog .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3066,7 +4476,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   overflow-x: hidden;
 }
 
-#uphlolqabb .gt_stub {
+#sgxskozkog .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3079,7 +4489,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 5px;
 }
 
-#uphlolqabb .gt_stub_row_group {
+#sgxskozkog .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3093,15 +4503,15 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   vertical-align: top;
 }
 
-#uphlolqabb .gt_row_group_first td {
+#sgxskozkog .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#uphlolqabb .gt_row_group_first th {
+#sgxskozkog .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#uphlolqabb .gt_summary_row {
+#sgxskozkog .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3111,16 +4521,16 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 5px;
 }
 
-#uphlolqabb .gt_first_summary_row {
+#sgxskozkog .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_first_summary_row.thick {
+#sgxskozkog .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#uphlolqabb .gt_last_summary_row {
+#sgxskozkog .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3130,7 +4540,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_grand_summary_row {
+#sgxskozkog .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3140,7 +4550,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 5px;
 }
 
-#uphlolqabb .gt_first_grand_summary_row {
+#sgxskozkog .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3150,7 +4560,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-top-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_last_grand_summary_row_top {
+#sgxskozkog .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3160,11 +4570,11 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_striped {
+#sgxskozkog .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#uphlolqabb .gt_table_body {
+#sgxskozkog .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3173,7 +4583,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_footnotes {
+#sgxskozkog .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3187,7 +4597,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-right-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_footnote {
+#sgxskozkog .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3196,7 +4606,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 5px;
 }
 
-#uphlolqabb .gt_sourcenotes {
+#sgxskozkog .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3210,7 +4620,7 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   border-right-color: #D3D3D3;
 }
 
-#uphlolqabb .gt_sourcenote {
+#sgxskozkog .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3218,68 +4628,68 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   padding-right: 5px;
 }
 
-#uphlolqabb .gt_left {
+#sgxskozkog .gt_left {
   text-align: left;
 }
 
-#uphlolqabb .gt_center {
+#sgxskozkog .gt_center {
   text-align: center;
 }
 
-#uphlolqabb .gt_right {
+#sgxskozkog .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#uphlolqabb .gt_font_normal {
+#sgxskozkog .gt_font_normal {
   font-weight: normal;
 }
 
-#uphlolqabb .gt_font_bold {
+#sgxskozkog .gt_font_bold {
   font-weight: bold;
 }
 
-#uphlolqabb .gt_font_italic {
+#sgxskozkog .gt_font_italic {
   font-style: italic;
 }
 
-#uphlolqabb .gt_super {
+#sgxskozkog .gt_super {
   font-size: 65%;
 }
 
-#uphlolqabb .gt_footnote_marks {
+#sgxskozkog .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#uphlolqabb .gt_asterisk {
+#sgxskozkog .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#uphlolqabb .gt_indent_1 {
+#sgxskozkog .gt_indent_1 {
   text-indent: 5px;
 }
 
-#uphlolqabb .gt_indent_2 {
+#sgxskozkog .gt_indent_2 {
   text-indent: 10px;
 }
 
-#uphlolqabb .gt_indent_3 {
+#sgxskozkog .gt_indent_3 {
   text-indent: 15px;
 }
 
-#uphlolqabb .gt_indent_4 {
+#sgxskozkog .gt_indent_4 {
   text-indent: 20px;
 }
 
-#uphlolqabb .gt_indent_5 {
+#sgxskozkog .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
 <table class="gt_table" data-quarto-disable-processing="false" data-quarto-bootstrap="false">
-  <caption><span id="tab:ncvs-vgn-prop-stat-test-gt-tab">TABLE 13.6: </span>T-test output for estimates of property victimization rates between properties that are owned versus rented, NCVS 2021</caption>
+  <caption><span id="tab:ncvs-vign-prop-stat-test-gt-tab">TABLE 13.9: </span>T-test output for estimates of property victimization rates between properties that are owned versus rented, NCVS 2021</caption>
   <thead>
     
     <tr class="gt_col_headings">
@@ -3307,12 +4717,12 @@ <h2><span class="header-section-number">13.7</span> Statistical testing<a href="
   
 </table>
 </div>
-<p>The output of the statistical test shows the same difference of 61.6 between the property victimization rates of renters and owners and the test is highly significant with the p-value of &lt;0.0001.</p>
+<p>The output of the statistical test shown in Table <a href="c13-ncvs-vignette.html#tab:ncvs-vign-prop-stat-test-gt-tab">13.9</a> indicates a difference of 61.6 per 1,000 households between the property victimization rates of renters and owners, and the test is highly significant with a p-value of &lt;0.0001.</p>
 </div>
 <div id="exercises-3" class="section level2 hasAnchor" number="13.8">
 <h2><span class="header-section-number">13.8</span> Exercises<a href="c13-ncvs-vignette.html#exercises-3" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li><p>What proportion of completed motor vehicle thefts are <strong>not</strong> reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).</p></li>
+<li><p>What proportion of completed motor vehicle thefts are <strong>not</strong> reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).</p></li>
 <li><p>How many violent crimes occur in each region?</p></li>
 <li><p>What is the property victimization rate among each income level?</p></li>
 <li><p>What is the difference between the violent victimization rate between males and females? Is it statistically different?</p></li>
@@ -3340,8 +4750,8 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 </div>
 <div class="footnotes">
 <hr />
-<ol start="27">
-<li id="fn27"><p>BJS publishes victimization rates per 1,000, which are also presented in these examples<a href="c13-ncvs-vignette.html#fnref27" class="footnote-back">↩︎</a></p></li>
+<ol start="28">
+<li id="fn28"><p>BJS publishes victimization rates per 1,000, which are also presented in these examples.<a href="c13-ncvs-vignette.html#fnref28" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
             </section>
diff --git a/c14-ambarom-vignette.html b/c14-ambarom-vignette.html
index 3f061e5a..bea1509d 100644
--- a/c14-ambarom-vignette.html
+++ b/c14-ambarom-vignette.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -524,31 +524,31 @@ <h3>Prerequisites<a href="c14-ambarom-vignette.html#prereq10" class="anchor-sect
 </div>
 <div class="prereqbox">
 <p>For this chapter, load the following packages:</p>
-<div class="sourceCode" id="cb473"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb473-1"><a href="c14-ambarom-vignette.html#cb473-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb473-2"><a href="c14-ambarom-vignette.html#cb473-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
-<span id="cb473-3"><a href="c14-ambarom-vignette.html#cb473-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
-<span id="cb473-4"><a href="c14-ambarom-vignette.html#cb473-4" tabindex="-1"></a><span class="fu">library</span>(sf)</span>
-<span id="cb473-5"><a href="c14-ambarom-vignette.html#cb473-5" tabindex="-1"></a><span class="fu">library</span>(rnaturalearth)</span>
-<span id="cb473-6"><a href="c14-ambarom-vignette.html#cb473-6" tabindex="-1"></a><span class="fu">library</span>(rnaturalearthdata)</span>
-<span id="cb473-7"><a href="c14-ambarom-vignette.html#cb473-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
-<span id="cb473-8"><a href="c14-ambarom-vignette.html#cb473-8" tabindex="-1"></a><span class="fu">library</span>(ggpattern)</span></code></pre></div>
-<p>In this vignette, we use a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the <a href="http://datasets.americasbarometer.org/database/index.php">LAPOP website</a>. We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To read all files into R while ignoring the Stata labels, we recommend running code like this using <code>read_stata()</code> function from the {haven} package to import the data <span class="citation">(<a href="#ref-R-haven">Wickham, Miller, and Smith 2023</a>)</span>:</p>
-<div class="sourceCode" id="cb474"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb474-1"><a href="c14-ambarom-vignette.html#cb474-1" tabindex="-1"></a>stata_files <span class="ot">&lt;-</span> <span class="fu">list.files</span>(<span class="fu">here</span>(<span class="st">&quot;RawData&quot;</span>, <span class="st">&quot;LAPOP_2021&quot;</span>), <span class="st">&quot;*.dta&quot;</span>)</span>
-<span id="cb474-2"><a href="c14-ambarom-vignette.html#cb474-2" tabindex="-1"></a></span>
-<span id="cb474-3"><a href="c14-ambarom-vignette.html#cb474-3" tabindex="-1"></a>read_stata_unlabeled <span class="ot">&lt;-</span> <span class="cf">function</span>(file) {</span>
-<span id="cb474-4"><a href="c14-ambarom-vignette.html#cb474-4" tabindex="-1"></a>  <span class="fu">read_stata</span>(file) <span class="sc">%&gt;%</span></span>
-<span id="cb474-5"><a href="c14-ambarom-vignette.html#cb474-5" tabindex="-1"></a>    <span class="fu">zap_labels</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb474-6"><a href="c14-ambarom-vignette.html#cb474-6" tabindex="-1"></a>    <span class="fu">zap_label</span>()</span>
-<span id="cb474-7"><a href="c14-ambarom-vignette.html#cb474-7" tabindex="-1"></a>}</span>
-<span id="cb474-8"><a href="c14-ambarom-vignette.html#cb474-8" tabindex="-1"></a></span>
-<span id="cb474-9"><a href="c14-ambarom-vignette.html#cb474-9" tabindex="-1"></a>ambarom_in <span class="ot">&lt;-</span> <span class="fu">here</span>(<span class="st">&quot;RawData&quot;</span>, <span class="st">&quot;LAPOP_2021&quot;</span>, stata_files) <span class="sc">%&gt;%</span></span>
-<span id="cb474-10"><a href="c14-ambarom-vignette.html#cb474-10" tabindex="-1"></a>  <span class="fu">map_df</span>(read_stata_unlabeled) <span class="sc">%&gt;%</span></span>
-<span id="cb474-11"><a href="c14-ambarom-vignette.html#cb474-11" tabindex="-1"></a>  <span class="fu">select</span>(pais, strata, upm, weight1500, strata, core_a_core_b,</span>
-<span id="cb474-12"><a href="c14-ambarom-vignette.html#cb474-12" tabindex="-1"></a>         q2, q1tb, covid2at, a4, idio2, idio2cov, it1, jc13,</span>
-<span id="cb474-13"><a href="c14-ambarom-vignette.html#cb474-13" tabindex="-1"></a>         m1, mil10a, mil10e, ccch1, ccch3, ccus1, ccus3,</span>
-<span id="cb474-14"><a href="c14-ambarom-vignette.html#cb474-14" tabindex="-1"></a>         edr, ocup4a, q14, q11n, q12c, q12bn,</span>
-<span id="cb474-15"><a href="c14-ambarom-vignette.html#cb474-15" tabindex="-1"></a>         <span class="fu">starts_with</span>(<span class="st">&quot;covidedu1&quot;</span>), gi0n,</span>
-<span id="cb474-16"><a href="c14-ambarom-vignette.html#cb474-16" tabindex="-1"></a>         r15, r18n, r18) </span></code></pre></div>
+<div class="sourceCode" id="cb467"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb467-1"><a href="c14-ambarom-vignette.html#cb467-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb467-2"><a href="c14-ambarom-vignette.html#cb467-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
+<span id="cb467-3"><a href="c14-ambarom-vignette.html#cb467-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
+<span id="cb467-4"><a href="c14-ambarom-vignette.html#cb467-4" tabindex="-1"></a><span class="fu">library</span>(sf)</span>
+<span id="cb467-5"><a href="c14-ambarom-vignette.html#cb467-5" tabindex="-1"></a><span class="fu">library</span>(rnaturalearth)</span>
+<span id="cb467-6"><a href="c14-ambarom-vignette.html#cb467-6" tabindex="-1"></a><span class="fu">library</span>(rnaturalearthdata)</span>
+<span id="cb467-7"><a href="c14-ambarom-vignette.html#cb467-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span>
+<span id="cb467-8"><a href="c14-ambarom-vignette.html#cb467-8" tabindex="-1"></a><span class="fu">library</span>(ggpattern)</span></code></pre></div>
+<p>In this vignette, we use a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the <a href="http://datasets.americasbarometer.org/database/index.php">LAPOP website</a>. We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To read all files into R while ignoring the Stata labels, we recommend running the following code, which uses the <code>read_stata()</code> function from the {haven} package to import the data <span class="citation">(<a href="#ref-R-haven">Wickham, Miller, and Smith 2023</a>)</span>:</p>
+<div class="sourceCode" id="cb468"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb468-1"><a href="c14-ambarom-vignette.html#cb468-1" tabindex="-1"></a>stata_files <span class="ot">&lt;-</span> <span class="fu">list.files</span>(<span class="fu">here</span>(<span class="st">&quot;RawData&quot;</span>, <span class="st">&quot;LAPOP_2021&quot;</span>), <span class="st">&quot;*.dta&quot;</span>)</span>
+<span id="cb468-2"><a href="c14-ambarom-vignette.html#cb468-2" tabindex="-1"></a></span>
+<span id="cb468-3"><a href="c14-ambarom-vignette.html#cb468-3" tabindex="-1"></a>read_stata_unlabeled <span class="ot">&lt;-</span> <span class="cf">function</span>(file) {</span>
+<span id="cb468-4"><a href="c14-ambarom-vignette.html#cb468-4" tabindex="-1"></a>  <span class="fu">read_stata</span>(file) <span class="sc">%&gt;%</span></span>
+<span id="cb468-5"><a href="c14-ambarom-vignette.html#cb468-5" tabindex="-1"></a>    <span class="fu">zap_labels</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb468-6"><a href="c14-ambarom-vignette.html#cb468-6" tabindex="-1"></a>    <span class="fu">zap_label</span>()</span>
+<span id="cb468-7"><a href="c14-ambarom-vignette.html#cb468-7" tabindex="-1"></a>}</span>
+<span id="cb468-8"><a href="c14-ambarom-vignette.html#cb468-8" tabindex="-1"></a></span>
+<span id="cb468-9"><a href="c14-ambarom-vignette.html#cb468-9" tabindex="-1"></a>ambarom_in <span class="ot">&lt;-</span> <span class="fu">here</span>(<span class="st">&quot;RawData&quot;</span>, <span class="st">&quot;LAPOP_2021&quot;</span>, stata_files) <span class="sc">%&gt;%</span></span>
+<span id="cb468-10"><a href="c14-ambarom-vignette.html#cb468-10" tabindex="-1"></a>  <span class="fu">map_df</span>(read_stata_unlabeled) <span class="sc">%&gt;%</span></span>
+<span id="cb468-11"><a href="c14-ambarom-vignette.html#cb468-11" tabindex="-1"></a>  <span class="fu">select</span>(pais, strata, upm, weight1500, core_a_core_b,</span>
+<span id="cb468-12"><a href="c14-ambarom-vignette.html#cb468-12" tabindex="-1"></a>         q2, q1tb, covid2at, a4, idio2, idio2cov, it1, jc13,</span>
+<span id="cb468-13"><a href="c14-ambarom-vignette.html#cb468-13" tabindex="-1"></a>         m1, mil10a, mil10e, ccch1, ccch3, ccus1, ccus3,</span>
+<span id="cb468-14"><a href="c14-ambarom-vignette.html#cb468-14" tabindex="-1"></a>         edr, ocup4a, q14, q11n, q12c, q12bn,</span>
+<span id="cb468-15"><a href="c14-ambarom-vignette.html#cb468-15" tabindex="-1"></a>         <span class="fu">starts_with</span>(<span class="st">&quot;covidedu1&quot;</span>), gi0n,</span>
+<span id="cb468-16"><a href="c14-ambarom-vignette.html#cb468-16" tabindex="-1"></a>         r15, r18n, r18) </span></code></pre></div>
 <p>The code above reads all <code>.dta</code> files and combines them into one tibble.</p>
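<p>The file-listing step can be tried on its own before any data are downloaded. The following is a minimal sketch in base R, assuming a hypothetical temporary folder and made-up file names in place of <code>RawData/LAPOP_2021</code>. It shows that the <code>pattern</code> argument of <code>list.files()</code> is a regular expression, so escaping the dot and anchoring the extension at the end keeps only files that end in <code>.dta</code>.</p>

```r
# Hypothetical folder and file names, standing in for RawData/LAPOP_2021
raw_dir <- file.path(tempdir(), "LAPOP_2021_demo")
dir.create(raw_dir, showWarnings = FALSE)
demo_files <- c("MEX_2021.dta", "CAN_2021.dta", "readme.txt", "notes.dta.bak")
invisible(file.create(file.path(raw_dir, demo_files)))

# `pattern` is a regular expression: "\\." matches a literal dot and
# "$" anchors the match at the end of the file name
dta_paths <- list.files(raw_dir, pattern = "\\.dta$", full.names = TRUE)

# Only the two .dta files remain; the .txt and .bak files are excluded
basename(dta_paths)
```

<p>With <code>full.names = TRUE</code>, the returned paths can be passed directly to a reader function, which is an alternative to rebuilding each path from the folder and file name.</p>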
 </div>
 <div id="introduction-12" class="section level2 hasAnchor" number="14.1">
@@ -559,57 +559,57 @@ <h2><span class="header-section-number">14.1</span> Introduction<a href="c14-amb
 </div>
 <div id="data-structure-1" class="section level2 hasAnchor" number="14.2">
 <h2><span class="header-section-number">14.2</span> Data structure<a href="c14-ambarom-vignette.html#data-structure-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>Each country and year has its own file available in Stata format (<code>.dta</code>). In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the prerequisites box. Review the core questionnaire to understand the common variables across the countries <span class="citation">(<a href="#ref-lapop-svy">LAPOP 2021d</a>)</span>.</p>
+<p>Each country and year has its own file available in Stata format (<code>.dta</code>). In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the prerequisites box. We recommend reviewing the core questionnaire to understand the common variables across the countries <span class="citation">(<a href="#ref-lapop-svy">LAPOP 2021d</a>)</span>.</p>
 </div>
 <div id="preparing-files" class="section level2 hasAnchor" number="14.3">
 <h2><span class="header-section-number">14.3</span> Preparing files<a href="c14-ambarom-vignette.html#preparing-files" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Many of the variables are coded as numeric and do not have intuitive variable names, so the next step is to create derived variables and wrangle the data for analysis. Using the core questionnaire as a codebook, we reference the factor descriptions to create derived variables with informative names:</p>
-<div class="sourceCode" id="cb475"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb475-1"><a href="c14-ambarom-vignette.html#cb475-1" tabindex="-1"></a>ambarom <span class="ot">&lt;-</span> ambarom_in <span class="sc">%&gt;%</span></span>
-<span id="cb475-2"><a href="c14-ambarom-vignette.html#cb475-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb475-3"><a href="c14-ambarom-vignette.html#cb475-3" tabindex="-1"></a>    <span class="at">Country =</span> <span class="fu">factor</span>(</span>
-<span id="cb475-4"><a href="c14-ambarom-vignette.html#cb475-4" tabindex="-1"></a>      <span class="fu">case_match</span>(pais,</span>
-<span id="cb475-5"><a href="c14-ambarom-vignette.html#cb475-5" tabindex="-1"></a>                 <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Mexico&quot;</span>,</span>
-<span id="cb475-6"><a href="c14-ambarom-vignette.html#cb475-6" tabindex="-1"></a>                 <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Guatemala&quot;</span>,</span>
-<span id="cb475-7"><a href="c14-ambarom-vignette.html#cb475-7" tabindex="-1"></a>                 <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;El Salvador&quot;</span>,</span>
-<span id="cb475-8"><a href="c14-ambarom-vignette.html#cb475-8" tabindex="-1"></a>                 <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Honduras&quot;</span>,</span>
-<span id="cb475-9"><a href="c14-ambarom-vignette.html#cb475-9" tabindex="-1"></a>                 <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Nicaragua&quot;</span>,</span>
-<span id="cb475-10"><a href="c14-ambarom-vignette.html#cb475-10" tabindex="-1"></a>                 <span class="dv">6</span> <span class="sc">~</span> <span class="st">&quot;Costa Rica&quot;</span>,</span>
-<span id="cb475-11"><a href="c14-ambarom-vignette.html#cb475-11" tabindex="-1"></a>                 <span class="dv">7</span> <span class="sc">~</span> <span class="st">&quot;Panama&quot;</span>,</span>
-<span id="cb475-12"><a href="c14-ambarom-vignette.html#cb475-12" tabindex="-1"></a>                 <span class="dv">8</span> <span class="sc">~</span> <span class="st">&quot;Colombia&quot;</span>,</span>
-<span id="cb475-13"><a href="c14-ambarom-vignette.html#cb475-13" tabindex="-1"></a>                 <span class="dv">9</span> <span class="sc">~</span> <span class="st">&quot;Ecuador&quot;</span>,</span>
-<span id="cb475-14"><a href="c14-ambarom-vignette.html#cb475-14" tabindex="-1"></a>                 <span class="dv">10</span> <span class="sc">~</span> <span class="st">&quot;Bolivia&quot;</span>,</span>
-<span id="cb475-15"><a href="c14-ambarom-vignette.html#cb475-15" tabindex="-1"></a>                 <span class="dv">11</span> <span class="sc">~</span> <span class="st">&quot;Peru&quot;</span>,</span>
-<span id="cb475-16"><a href="c14-ambarom-vignette.html#cb475-16" tabindex="-1"></a>                 <span class="dv">12</span> <span class="sc">~</span> <span class="st">&quot;Paraguay&quot;</span>,</span>
-<span id="cb475-17"><a href="c14-ambarom-vignette.html#cb475-17" tabindex="-1"></a>                 <span class="dv">13</span> <span class="sc">~</span> <span class="st">&quot;Chile&quot;</span>,</span>
-<span id="cb475-18"><a href="c14-ambarom-vignette.html#cb475-18" tabindex="-1"></a>                 <span class="dv">14</span> <span class="sc">~</span> <span class="st">&quot;Uruguay&quot;</span>,</span>
-<span id="cb475-19"><a href="c14-ambarom-vignette.html#cb475-19" tabindex="-1"></a>                 <span class="dv">15</span> <span class="sc">~</span> <span class="st">&quot;Brazil&quot;</span>,</span>
-<span id="cb475-20"><a href="c14-ambarom-vignette.html#cb475-20" tabindex="-1"></a>                 <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;Argentina&quot;</span>,</span>
-<span id="cb475-21"><a href="c14-ambarom-vignette.html#cb475-21" tabindex="-1"></a>                 <span class="dv">21</span> <span class="sc">~</span> <span class="st">&quot;Dominican Republic&quot;</span>,</span>
-<span id="cb475-22"><a href="c14-ambarom-vignette.html#cb475-22" tabindex="-1"></a>                 <span class="dv">22</span> <span class="sc">~</span> <span class="st">&quot;Haiti&quot;</span>,</span>
-<span id="cb475-23"><a href="c14-ambarom-vignette.html#cb475-23" tabindex="-1"></a>                 <span class="dv">23</span> <span class="sc">~</span> <span class="st">&quot;Jamaica&quot;</span>,</span>
-<span id="cb475-24"><a href="c14-ambarom-vignette.html#cb475-24" tabindex="-1"></a>                 <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;Guyana&quot;</span>,</span>
-<span id="cb475-25"><a href="c14-ambarom-vignette.html#cb475-25" tabindex="-1"></a>                 <span class="dv">40</span> <span class="sc">~</span> <span class="st">&quot;United States&quot;</span>,</span>
-<span id="cb475-26"><a href="c14-ambarom-vignette.html#cb475-26" tabindex="-1"></a>                 <span class="dv">41</span> <span class="sc">~</span> <span class="st">&quot;Canada&quot;</span>)),</span>
-<span id="cb475-27"><a href="c14-ambarom-vignette.html#cb475-27" tabindex="-1"></a>    <span class="at">CovidWorry =</span> <span class="fu">fct_reorder</span>(</span>
-<span id="cb475-28"><a href="c14-ambarom-vignette.html#cb475-28" tabindex="-1"></a>      <span class="fu">case_match</span>(covid2at,</span>
-<span id="cb475-29"><a href="c14-ambarom-vignette.html#cb475-29" tabindex="-1"></a>                 <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Very worried&quot;</span>,</span>
-<span id="cb475-30"><a href="c14-ambarom-vignette.html#cb475-30" tabindex="-1"></a>                 <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Somewhat worried&quot;</span>,</span>
-<span id="cb475-31"><a href="c14-ambarom-vignette.html#cb475-31" tabindex="-1"></a>                 <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;A little worried&quot;</span>,</span>
-<span id="cb475-32"><a href="c14-ambarom-vignette.html#cb475-32" tabindex="-1"></a>                 <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Not worried at all&quot;</span>),</span>
-<span id="cb475-33"><a href="c14-ambarom-vignette.html#cb475-33" tabindex="-1"></a>      covid2at,</span>
-<span id="cb475-34"><a href="c14-ambarom-vignette.html#cb475-34" tabindex="-1"></a>      <span class="at">.na_rm =</span> <span class="cn">FALSE</span>)</span>
-<span id="cb475-35"><a href="c14-ambarom-vignette.html#cb475-35" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb475-36"><a href="c14-ambarom-vignette.html#cb475-36" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">Educ_NotInSchool =</span> covidedu1_1,</span>
-<span id="cb475-37"><a href="c14-ambarom-vignette.html#cb475-37" tabindex="-1"></a>         <span class="at">Educ_NormalSchool =</span> covidedu1_2,</span>
-<span id="cb475-38"><a href="c14-ambarom-vignette.html#cb475-38" tabindex="-1"></a>         <span class="at">Educ_VirtualSchool =</span> covidedu1_3,</span>
-<span id="cb475-39"><a href="c14-ambarom-vignette.html#cb475-39" tabindex="-1"></a>         <span class="at">Educ_Hybrid =</span> covidedu1_4,</span>
-<span id="cb475-40"><a href="c14-ambarom-vignette.html#cb475-40" tabindex="-1"></a>         <span class="at">Educ_NoSchool =</span> covidedu1_5,</span>
-<span id="cb475-41"><a href="c14-ambarom-vignette.html#cb475-41" tabindex="-1"></a>         <span class="at">BroadbandInternet =</span> r18n,</span>
-<span id="cb475-42"><a href="c14-ambarom-vignette.html#cb475-42" tabindex="-1"></a>         <span class="at">Internet =</span> r18)</span></code></pre></div>
+<div class="sourceCode" id="cb469"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb469-1"><a href="c14-ambarom-vignette.html#cb469-1" tabindex="-1"></a>ambarom <span class="ot">&lt;-</span> ambarom_in <span class="sc">%&gt;%</span></span>
+<span id="cb469-2"><a href="c14-ambarom-vignette.html#cb469-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb469-3"><a href="c14-ambarom-vignette.html#cb469-3" tabindex="-1"></a>    <span class="at">Country =</span> <span class="fu">factor</span>(</span>
+<span id="cb469-4"><a href="c14-ambarom-vignette.html#cb469-4" tabindex="-1"></a>      <span class="fu">case_match</span>(pais,</span>
+<span id="cb469-5"><a href="c14-ambarom-vignette.html#cb469-5" tabindex="-1"></a>                 <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Mexico&quot;</span>,</span>
+<span id="cb469-6"><a href="c14-ambarom-vignette.html#cb469-6" tabindex="-1"></a>                 <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Guatemala&quot;</span>,</span>
+<span id="cb469-7"><a href="c14-ambarom-vignette.html#cb469-7" tabindex="-1"></a>                 <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;El Salvador&quot;</span>,</span>
+<span id="cb469-8"><a href="c14-ambarom-vignette.html#cb469-8" tabindex="-1"></a>                 <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Honduras&quot;</span>,</span>
+<span id="cb469-9"><a href="c14-ambarom-vignette.html#cb469-9" tabindex="-1"></a>                 <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Nicaragua&quot;</span>,</span>
+<span id="cb469-10"><a href="c14-ambarom-vignette.html#cb469-10" tabindex="-1"></a>                 <span class="dv">6</span> <span class="sc">~</span> <span class="st">&quot;Costa Rica&quot;</span>,</span>
+<span id="cb469-11"><a href="c14-ambarom-vignette.html#cb469-11" tabindex="-1"></a>                 <span class="dv">7</span> <span class="sc">~</span> <span class="st">&quot;Panama&quot;</span>,</span>
+<span id="cb469-12"><a href="c14-ambarom-vignette.html#cb469-12" tabindex="-1"></a>                 <span class="dv">8</span> <span class="sc">~</span> <span class="st">&quot;Colombia&quot;</span>,</span>
+<span id="cb469-13"><a href="c14-ambarom-vignette.html#cb469-13" tabindex="-1"></a>                 <span class="dv">9</span> <span class="sc">~</span> <span class="st">&quot;Ecuador&quot;</span>,</span>
+<span id="cb469-14"><a href="c14-ambarom-vignette.html#cb469-14" tabindex="-1"></a>                 <span class="dv">10</span> <span class="sc">~</span> <span class="st">&quot;Bolivia&quot;</span>,</span>
+<span id="cb469-15"><a href="c14-ambarom-vignette.html#cb469-15" tabindex="-1"></a>                 <span class="dv">11</span> <span class="sc">~</span> <span class="st">&quot;Peru&quot;</span>,</span>
+<span id="cb469-16"><a href="c14-ambarom-vignette.html#cb469-16" tabindex="-1"></a>                 <span class="dv">12</span> <span class="sc">~</span> <span class="st">&quot;Paraguay&quot;</span>,</span>
+<span id="cb469-17"><a href="c14-ambarom-vignette.html#cb469-17" tabindex="-1"></a>                 <span class="dv">13</span> <span class="sc">~</span> <span class="st">&quot;Chile&quot;</span>,</span>
+<span id="cb469-18"><a href="c14-ambarom-vignette.html#cb469-18" tabindex="-1"></a>                 <span class="dv">14</span> <span class="sc">~</span> <span class="st">&quot;Uruguay&quot;</span>,</span>
+<span id="cb469-19"><a href="c14-ambarom-vignette.html#cb469-19" tabindex="-1"></a>                 <span class="dv">15</span> <span class="sc">~</span> <span class="st">&quot;Brazil&quot;</span>,</span>
+<span id="cb469-20"><a href="c14-ambarom-vignette.html#cb469-20" tabindex="-1"></a>                 <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;Argentina&quot;</span>,</span>
+<span id="cb469-21"><a href="c14-ambarom-vignette.html#cb469-21" tabindex="-1"></a>                 <span class="dv">21</span> <span class="sc">~</span> <span class="st">&quot;Dominican Republic&quot;</span>,</span>
+<span id="cb469-22"><a href="c14-ambarom-vignette.html#cb469-22" tabindex="-1"></a>                 <span class="dv">22</span> <span class="sc">~</span> <span class="st">&quot;Haiti&quot;</span>,</span>
+<span id="cb469-23"><a href="c14-ambarom-vignette.html#cb469-23" tabindex="-1"></a>                 <span class="dv">23</span> <span class="sc">~</span> <span class="st">&quot;Jamaica&quot;</span>,</span>
+<span id="cb469-24"><a href="c14-ambarom-vignette.html#cb469-24" tabindex="-1"></a>                 <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;Guyana&quot;</span>,</span>
+<span id="cb469-25"><a href="c14-ambarom-vignette.html#cb469-25" tabindex="-1"></a>                 <span class="dv">40</span> <span class="sc">~</span> <span class="st">&quot;United States&quot;</span>,</span>
+<span id="cb469-26"><a href="c14-ambarom-vignette.html#cb469-26" tabindex="-1"></a>                 <span class="dv">41</span> <span class="sc">~</span> <span class="st">&quot;Canada&quot;</span>)),</span>
+<span id="cb469-27"><a href="c14-ambarom-vignette.html#cb469-27" tabindex="-1"></a>    <span class="at">CovidWorry =</span> <span class="fu">fct_reorder</span>(</span>
+<span id="cb469-28"><a href="c14-ambarom-vignette.html#cb469-28" tabindex="-1"></a>      <span class="fu">case_match</span>(covid2at,</span>
+<span id="cb469-29"><a href="c14-ambarom-vignette.html#cb469-29" tabindex="-1"></a>                 <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Very worried&quot;</span>,</span>
+<span id="cb469-30"><a href="c14-ambarom-vignette.html#cb469-30" tabindex="-1"></a>                 <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Somewhat worried&quot;</span>,</span>
+<span id="cb469-31"><a href="c14-ambarom-vignette.html#cb469-31" tabindex="-1"></a>                 <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;A little worried&quot;</span>,</span>
+<span id="cb469-32"><a href="c14-ambarom-vignette.html#cb469-32" tabindex="-1"></a>                 <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Not worried at all&quot;</span>),</span>
+<span id="cb469-33"><a href="c14-ambarom-vignette.html#cb469-33" tabindex="-1"></a>      covid2at,</span>
+<span id="cb469-34"><a href="c14-ambarom-vignette.html#cb469-34" tabindex="-1"></a>      <span class="at">.na_rm =</span> <span class="cn">FALSE</span>)</span>
+<span id="cb469-35"><a href="c14-ambarom-vignette.html#cb469-35" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb469-36"><a href="c14-ambarom-vignette.html#cb469-36" tabindex="-1"></a>  <span class="fu">rename</span>(<span class="at">Educ_NotInSchool =</span> covidedu1_1,</span>
+<span id="cb469-37"><a href="c14-ambarom-vignette.html#cb469-37" tabindex="-1"></a>         <span class="at">Educ_NormalSchool =</span> covidedu1_2,</span>
+<span id="cb469-38"><a href="c14-ambarom-vignette.html#cb469-38" tabindex="-1"></a>         <span class="at">Educ_VirtualSchool =</span> covidedu1_3,</span>
+<span id="cb469-39"><a href="c14-ambarom-vignette.html#cb469-39" tabindex="-1"></a>         <span class="at">Educ_Hybrid =</span> covidedu1_4,</span>
+<span id="cb469-40"><a href="c14-ambarom-vignette.html#cb469-40" tabindex="-1"></a>         <span class="at">Educ_NoSchool =</span> covidedu1_5,</span>
+<span id="cb469-41"><a href="c14-ambarom-vignette.html#cb469-41" tabindex="-1"></a>         <span class="at">BroadbandInternet =</span> r18n,</span>
+<span id="cb469-42"><a href="c14-ambarom-vignette.html#cb469-42" tabindex="-1"></a>         <span class="at">Internet =</span> r18)</span></code></pre></div>
 <p>At this point, it is a good time to check the cross-tabs between the original and newly derived variables. These tables help us confirm that we have correctly matched the numeric data from the original dataset to the renamed factor data in the new dataset. For instance, let’s check the original variable <code>pais</code> and the derived variable <code>Country</code>. We can consult the questionnaire or codebook to confirm that Argentina is coded as <code>17</code>, Bolivia as <code>10</code>, etc. Similarly, for <code>CovidWorry</code> and <code>covid2at</code>, we can verify that <code>Very worried</code> is coded as <code>1</code>, and so on for the other variables.</p>
-<div class="sourceCode" id="cb476"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb476-1"><a href="c14-ambarom-vignette.html#cb476-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
-<span id="cb476-2"><a href="c14-ambarom-vignette.html#cb476-2" tabindex="-1"></a>  <span class="fu">count</span>(Country, pais) <span class="sc">%&gt;%</span></span>
-<span id="cb476-3"><a href="c14-ambarom-vignette.html#cb476-3" tabindex="-1"></a>  <span class="fu">print</span>(<span class="at">n =</span> <span class="dv">22</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb470"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb470-1"><a href="c14-ambarom-vignette.html#cb470-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
+<span id="cb470-2"><a href="c14-ambarom-vignette.html#cb470-2" tabindex="-1"></a>  <span class="fu">count</span>(Country, pais) <span class="sc">%&gt;%</span></span>
+<span id="cb470-3"><a href="c14-ambarom-vignette.html#cb470-3" tabindex="-1"></a>  <span class="fu">print</span>(<span class="at">n =</span> <span class="dv">22</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 22 × 3
 ##    Country             pais     n
 ##    &lt;fct&gt;              &lt;dbl&gt; &lt;int&gt;
@@ -635,8 +635,8 @@ <h2><span class="header-section-number">14.3</span> Preparing files<a href="c14-
 ## 20 Peru                  11  3038
 ## 21 United States         40  1500
 ## 22 Uruguay               14  3009</code></pre>
-<div class="sourceCode" id="cb478"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb478-1"><a href="c14-ambarom-vignette.html#cb478-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
-<span id="cb478-2"><a href="c14-ambarom-vignette.html#cb478-2" tabindex="-1"></a>  <span class="fu">count</span>(CovidWorry, covid2at)</span></code></pre></div>
+<div class="sourceCode" id="cb472"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb472-1"><a href="c14-ambarom-vignette.html#cb472-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
+<span id="cb472-2"><a href="c14-ambarom-vignette.html#cb472-2" tabindex="-1"></a>  <span class="fu">count</span>(CovidWorry, covid2at)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 3
 ##   CovidWorry         covid2at     n
 ##   &lt;fct&gt;                 &lt;dbl&gt; &lt;int&gt;
@@ -648,19 +648,19 @@ <h2><span class="header-section-number">14.3</span> Preparing files<a href="c14-
 </div>
 <div id="survey-design-objects-1" class="section level2 hasAnchor" number="14.4">
 <h2><span class="header-section-number">14.4</span> Survey design objects<a href="c14-ambarom-vignette.html#survey-design-objects-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>The technical report is the best reference for understanding how to specify the sampling design in R <span class="citation">(<a href="#ref-lapop-tech">LAPOP 2021c</a>)</span>. The data includes two weights: <code>wt</code> and <code>weight1500</code>. The first weight variable is specific to each country and sums to the sample size, but it is calibrated to reflect each country’s demographics. The second weight variable sums to 1500 for each country and is recommended for multi-country analyses. Although not explicitly stated in the documentation, the Stata syntax example (<code>svyset upm [pw=weight1500], strata(strata)</code>) indicates the variable <code>upm</code> is a clustering variable and <code>strata</code> is the strata variable. Therefore, the design object is created in R as follows:</p>
-<div class="sourceCode" id="cb480"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb480-1"><a href="c14-ambarom-vignette.html#cb480-1" tabindex="-1"></a>ambarom_des <span class="ot">&lt;-</span> ambarom <span class="sc">%&gt;%</span></span>
-<span id="cb480-2"><a href="c14-ambarom-vignette.html#cb480-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> upm,</span>
-<span id="cb480-3"><a href="c14-ambarom-vignette.html#cb480-3" tabindex="-1"></a>                   <span class="at">strata =</span> strata,</span>
-<span id="cb480-4"><a href="c14-ambarom-vignette.html#cb480-4" tabindex="-1"></a>                   <span class="at">weight =</span> weight1500)</span></code></pre></div>
-<p>One interesting thing to note is that these weight variables can provide estimates for comparing countries but not for multi-country estimates. The reason is that the weights do not account for the different sizes of countries. For example, Canada has about 10% of the population of the United States, but an estimate that uses records from both countries would weigh them equally.</p>
+<p>The technical report is the best reference for understanding how to specify the sampling design in R <span class="citation">(<a href="#ref-lapop-tech">LAPOP 2021c</a>)</span>. The data include two weights: <code>wt</code> and <code>weight1500</code>. The first weight variable is specific to each country and sums to the sample size, but it is calibrated to reflect each country’s demographics. The second weight variable sums to 1500 for each country and is recommended for multi-country analyses. Although not explicitly stated in the documentation, the Stata syntax example (<code>svyset upm [pw=weight1500], strata(strata)</code>) indicates the variable <code>upm</code> is a clustering variable, and <code>strata</code> is the strata variable. Therefore, the design object for multi-country analysis is created in R as follows:</p>
+<div class="sourceCode" id="cb474"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb474-1"><a href="c14-ambarom-vignette.html#cb474-1" tabindex="-1"></a>ambarom_des <span class="ot">&lt;-</span> ambarom <span class="sc">%&gt;%</span></span>
+<span id="cb474-2"><a href="c14-ambarom-vignette.html#cb474-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> upm,</span>
+<span id="cb474-3"><a href="c14-ambarom-vignette.html#cb474-3" tabindex="-1"></a>                   <span class="at">strata =</span> strata,</span>
+<span id="cb474-4"><a href="c14-ambarom-vignette.html#cb474-4" tabindex="-1"></a>                   <span class="at">weights =</span> weight1500)</span></code></pre></div>
+<p>One interesting thing to note is that these weight variables are suited to estimates that compare countries, not to estimates that pool multiple countries. The reason is that the weights do not account for the different sizes of countries. For example, Canada has about 10% of the population of the United States, but an estimate that uses records from both countries would weight them equally.</p>
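<p>To see why pooling is a problem, consider a toy illustration in base R. This is only a sketch under stated assumptions: the two countries, the responses, the population sizes, and the <code>weight_pop</code> column are all hypothetical, and the rescaling is not a procedure taken from the LAPOP documentation. It only shows how weights that sum to 1500 in every country give each country the same influence on a pooled mean.</p>

```r
# Toy data: two hypothetical countries with three respondents each.
# As with weight1500, the weights sum to 1500 within each country.
toy <- data.frame(
  country = rep(c("A", "B"), each = 3),
  worried = c(1, 1, 0, 0, 0, 1),
  weight1500 = rep(500, 6)
)

# Hypothetical population sizes (illustrative values only)
pop <- c(A = 30e6, B = 300e6)

# Pooled mean with weight1500: both countries count equally,
# so the result is the simple average of the two country means
pooled_equal <- weighted.mean(toy$worried, toy$weight1500)

# Rescale so that each country's weights sum to its population
toy$weight_pop <- toy$weight1500 / 1500 * unname(pop[as.character(toy$country)])
pooled_pop <- weighted.mean(toy$worried, toy$weight_pop)

c(equal = pooled_equal, population = pooled_pop)
```

<p>The two pooled values differ because country B is ten times larger than country A in this toy setup. The per-country means are the same under either set of weights, which is why comparisons between countries are unaffected.</p>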
 </div>
 <div id="ambarom-estimates" class="section level2 hasAnchor" number="14.5">
 <h2><span class="header-section-number">14.5</span> Calculating estimates<a href="c14-ambarom-vignette.html#ambarom-estimates" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>When calculating estimates from the data, we use the survey design object <code>ambarom_des</code> and then apply the <code>survey_mean()</code> function. The next sections walk through a few examples.</p>
 <div id="example-worried-about-covid" class="section level3 hasAnchor" number="14.5.1">
 <h3><span class="header-section-number">14.5.1</span> Example: Worried about COVID<a href="c14-ambarom-vignette.html#example-worried-about-covid" class="anchor-section" aria-label="Anchor link to header"></a></h3>
-<p>This survey was administered between March and August of 2021, with the specific timing varying by country<a href="#fn28" class="footnote-ref" id="fnref28"><sup>28</sup></a>. Given the state of the pandemic at that time, several questions about COVID were included. The first question about COVID asked:</p>
+<p>This survey was administered between March and August of 2021, with the specific timing varying by country.<a href="#fn29" class="footnote-ref" id="fnref29"><sup>29</sup></a> Given the state of the pandemic at that time, several questions about COVID were included. The first question about COVID asked:</p>
 <blockquote>
 <p>How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?</p>
 <ul>
@@ -670,16 +670,16 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
 <li>Not worried at all</li>
 </ul>
 </blockquote>
-<p>If we are interested in those who are very worried or somewhat worried, we can create a new variable (<code>CovidWorry_bin</code>) that groups levels of the original question using the <code>fct_collapse()</code> function from the {forcats} package <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>. We then use the <code>survey_count()</code> function to understand how responses are distributed across each category of the original variable (<code>CovidWorry</code>) and the new variable (<code>CovidWorry_bin</code>).</p>
-<div class="sourceCode" id="cb481"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb481-1"><a href="c14-ambarom-vignette.html#cb481-1" tabindex="-1"></a>covid_worry_collapse <span class="ot">&lt;-</span> ambarom_des <span class="sc">%&gt;%</span></span>
-<span id="cb481-2"><a href="c14-ambarom-vignette.html#cb481-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">CovidWorry_bin =</span> <span class="fu">fct_collapse</span>(</span>
-<span id="cb481-3"><a href="c14-ambarom-vignette.html#cb481-3" tabindex="-1"></a>    CovidWorry,</span>
-<span id="cb481-4"><a href="c14-ambarom-vignette.html#cb481-4" tabindex="-1"></a>    <span class="at">WorriedHi =</span> <span class="fu">c</span>(<span class="st">&quot;Very worried&quot;</span>, <span class="st">&quot;Somewhat worried&quot;</span>),</span>
-<span id="cb481-5"><a href="c14-ambarom-vignette.html#cb481-5" tabindex="-1"></a>    <span class="at">WorriedLo =</span> <span class="fu">c</span>(<span class="st">&quot;A little worried&quot;</span>, <span class="st">&quot;Not worried at all&quot;</span>)</span>
-<span id="cb481-6"><a href="c14-ambarom-vignette.html#cb481-6" tabindex="-1"></a>  ))</span>
-<span id="cb481-7"><a href="c14-ambarom-vignette.html#cb481-7" tabindex="-1"></a></span>
-<span id="cb481-8"><a href="c14-ambarom-vignette.html#cb481-8" tabindex="-1"></a>covid_worry_collapse <span class="sc">%&gt;%</span></span>
-<span id="cb481-9"><a href="c14-ambarom-vignette.html#cb481-9" tabindex="-1"></a>  <span class="fu">survey_count</span>(CovidWorry_bin, CovidWorry)</span></code></pre></div>
+<p>If we are interested in those who are very worried or somewhat worried, we can create a new variable (<code>CovidWorry_bin</code>) that groups levels of the original question using the <code>fct_collapse()</code> function from the {forcats} package <span class="citation">(<a href="#ref-R-forcats">Wickham 2023a</a>)</span>. We then use the <code>survey_count()</code> function to understand how responses are distributed across each category of the original variable (<code>CovidWorry</code>) and the new variable (<code>CovidWorry_bin</code>).</p>
+<div class="sourceCode" id="cb475"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb475-1"><a href="c14-ambarom-vignette.html#cb475-1" tabindex="-1"></a>covid_worry_collapse <span class="ot">&lt;-</span> ambarom_des <span class="sc">%&gt;%</span></span>
+<span id="cb475-2"><a href="c14-ambarom-vignette.html#cb475-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">CovidWorry_bin =</span> <span class="fu">fct_collapse</span>(</span>
+<span id="cb475-3"><a href="c14-ambarom-vignette.html#cb475-3" tabindex="-1"></a>    CovidWorry,</span>
+<span id="cb475-4"><a href="c14-ambarom-vignette.html#cb475-4" tabindex="-1"></a>    <span class="at">WorriedHi =</span> <span class="fu">c</span>(<span class="st">&quot;Very worried&quot;</span>, <span class="st">&quot;Somewhat worried&quot;</span>),</span>
+<span id="cb475-5"><a href="c14-ambarom-vignette.html#cb475-5" tabindex="-1"></a>    <span class="at">WorriedLo =</span> <span class="fu">c</span>(<span class="st">&quot;A little worried&quot;</span>, <span class="st">&quot;Not worried at all&quot;</span>)</span>
+<span id="cb475-6"><a href="c14-ambarom-vignette.html#cb475-6" tabindex="-1"></a>  ))</span>
+<span id="cb475-7"><a href="c14-ambarom-vignette.html#cb475-7" tabindex="-1"></a></span>
+<span id="cb475-8"><a href="c14-ambarom-vignette.html#cb475-8" tabindex="-1"></a>covid_worry_collapse <span class="sc">%&gt;%</span></span>
+<span id="cb475-9"><a href="c14-ambarom-vignette.html#cb475-9" tabindex="-1"></a>  <span class="fu">survey_count</span>(CovidWorry_bin, CovidWorry)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 4
 ##   CovidWorry_bin CovidWorry              n  n_se
 ##   &lt;fct&gt;          &lt;fct&gt;               &lt;dbl&gt; &lt;dbl&gt;
@@ -689,12 +689,12 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
 ## 4 WorriedLo      Not worried at all  4840.  59.7
 ## 5 &lt;NA&gt;           &lt;NA&gt;                3518.  42.2</code></pre>
 <p>With this new variable, we can now use <code>survey_mean()</code> to calculate the percentage of people in each country who are either very or somewhat worried about COVID. There are missing data, as indicated in the <code>survey_count()</code> output above, so we need to use <code>na.rm = TRUE</code> in the <code>survey_mean()</code> function to handle the missing values.</p>
-<div class="sourceCode" id="cb483"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb483-1"><a href="c14-ambarom-vignette.html#cb483-1" tabindex="-1"></a>covid_worry_country_ests <span class="ot">&lt;-</span> covid_worry_collapse <span class="sc">%&gt;%</span></span>
-<span id="cb483-2"><a href="c14-ambarom-vignette.html#cb483-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
-<span id="cb483-3"><a href="c14-ambarom-vignette.html#cb483-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>(CovidWorry_bin <span class="sc">==</span> <span class="st">&quot;WorriedHi&quot;</span>,</span>
-<span id="cb483-4"><a href="c14-ambarom-vignette.html#cb483-4" tabindex="-1"></a>                            <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span>
-<span id="cb483-5"><a href="c14-ambarom-vignette.html#cb483-5" tabindex="-1"></a></span>
-<span id="cb483-6"><a href="c14-ambarom-vignette.html#cb483-6" tabindex="-1"></a>covid_worry_country_ests</span></code></pre></div>
+<div class="sourceCode" id="cb477"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb477-1"><a href="c14-ambarom-vignette.html#cb477-1" tabindex="-1"></a>covid_worry_country_ests <span class="ot">&lt;-</span> covid_worry_collapse <span class="sc">%&gt;%</span></span>
+<span id="cb477-2"><a href="c14-ambarom-vignette.html#cb477-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
+<span id="cb477-3"><a href="c14-ambarom-vignette.html#cb477-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>(CovidWorry_bin <span class="sc">==</span> <span class="st">&quot;WorriedHi&quot;</span>,</span>
+<span id="cb477-4"><a href="c14-ambarom-vignette.html#cb477-4" tabindex="-1"></a>                            <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span>
+<span id="cb477-5"><a href="c14-ambarom-vignette.html#cb477-5" tabindex="-1"></a></span>
+<span id="cb477-6"><a href="c14-ambarom-vignette.html#cb477-6" tabindex="-1"></a>covid_worry_country_ests</span></code></pre></div>
 <pre><code>## # A tibble: 22 × 3
 ##    Country                p  p_se
 ##    &lt;fct&gt;              &lt;dbl&gt; &lt;dbl&gt;
@@ -710,31 +710,31 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
 ## 10 El Salvador         52.5 1.02 
 ## # ℹ 12 more rows</code></pre>
 <p>To view the results for all countries, we can use the {gt} package to create Table <a href="c14-ambarom-vignette.html#tab:ambarom-worry-tab">14.1</a> <span class="citation">(<a href="#ref-R-gt">Iannone et al. 2023</a>)</span>.</p>
-<div class="sourceCode" id="cb485"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb485-1"><a href="c14-ambarom-vignette.html#cb485-1" tabindex="-1"></a>covid_worry_country_ests_gt <span class="ot">&lt;-</span> covid_worry_country_ests <span class="sc">%&gt;%</span></span>
-<span id="cb485-2"><a href="c14-ambarom-vignette.html#cb485-2" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb485-3"><a href="c14-ambarom-vignette.html#cb485-3" tabindex="-1"></a>  <span class="fu">cols_label</span>(<span class="at">p =</span> <span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb485-4"><a href="c14-ambarom-vignette.html#cb485-4" tabindex="-1"></a>             <span class="at">p_se =</span> <span class="st">&quot;SE&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb485-5"><a href="c14-ambarom-vignette.html#cb485-5" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals =</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb485-6"><a href="c14-ambarom-vignette.html#cb485-6" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;AmericasBarometer Surveys, 2021&quot;</span>)</span></code></pre></div>
-<div class="sourceCode" id="cb486"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb486-1"><a href="c14-ambarom-vignette.html#cb486-1" tabindex="-1"></a>covid_worry_country_ests_gt</span></code></pre></div>
-
-<div id="ismfkpkdnv" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#ismfkpkdnv table {
+<div class="sourceCode" id="cb479"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb479-1"><a href="c14-ambarom-vignette.html#cb479-1" tabindex="-1"></a>covid_worry_country_ests_gt <span class="ot">&lt;-</span> covid_worry_country_ests <span class="sc">%&gt;%</span></span>
+<span id="cb479-2"><a href="c14-ambarom-vignette.html#cb479-2" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb479-3"><a href="c14-ambarom-vignette.html#cb479-3" tabindex="-1"></a>  <span class="fu">cols_label</span>(<span class="at">p =</span> <span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb479-4"><a href="c14-ambarom-vignette.html#cb479-4" tabindex="-1"></a>             <span class="at">p_se =</span> <span class="st">&quot;SE&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb479-5"><a href="c14-ambarom-vignette.html#cb479-5" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals =</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb479-6"><a href="c14-ambarom-vignette.html#cb479-6" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;AmericasBarometer Surveys, 2021&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb480"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb480-1"><a href="c14-ambarom-vignette.html#cb480-1" tabindex="-1"></a>covid_worry_country_ests_gt</span></code></pre></div>
+
+<div id="ibkckwmzsj" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#ibkckwmzsj table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#ismfkpkdnv thead, #ismfkpkdnv tbody, #ismfkpkdnv tfoot, #ismfkpkdnv tr, #ismfkpkdnv td, #ismfkpkdnv th {
+#ibkckwmzsj thead, #ibkckwmzsj tbody, #ibkckwmzsj tfoot, #ibkckwmzsj tr, #ibkckwmzsj td, #ibkckwmzsj th {
   border-style: none;
 }
 
-#ismfkpkdnv p {
+#ibkckwmzsj p {
   margin: 0;
   padding: 0;
 }
 
-#ismfkpkdnv .gt_table {
+#ibkckwmzsj .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -760,12 +760,12 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-left-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_caption {
+#ibkckwmzsj .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#ismfkpkdnv .gt_title {
+#ibkckwmzsj .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -777,7 +777,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-bottom-width: 0;
 }
 
-#ismfkpkdnv .gt_subtitle {
+#ibkckwmzsj .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -789,7 +789,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-top-width: 0;
 }
 
-#ismfkpkdnv .gt_heading {
+#ibkckwmzsj .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -801,13 +801,13 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-right-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_bottom_border {
+#ibkckwmzsj .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_col_headings {
+#ibkckwmzsj .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -822,7 +822,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-right-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_col_heading {
+#ibkckwmzsj .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -842,7 +842,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   overflow-x: hidden;
 }
 
-#ismfkpkdnv .gt_column_spanner_outer {
+#ibkckwmzsj .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -854,15 +854,15 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 4px;
 }
 
-#ismfkpkdnv .gt_column_spanner_outer:first-child {
+#ibkckwmzsj .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#ismfkpkdnv .gt_column_spanner_outer:last-child {
+#ibkckwmzsj .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#ismfkpkdnv .gt_column_spanner {
+#ibkckwmzsj .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -874,11 +874,11 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   width: 100%;
 }
 
-#ismfkpkdnv .gt_spanner_row {
+#ibkckwmzsj .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#ismfkpkdnv .gt_group_heading {
+#ibkckwmzsj .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -904,7 +904,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   text-align: left;
 }
 
-#ismfkpkdnv .gt_empty_group_heading {
+#ibkckwmzsj .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -919,15 +919,15 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   vertical-align: middle;
 }
 
-#ismfkpkdnv .gt_from_md > :first-child {
+#ibkckwmzsj .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#ismfkpkdnv .gt_from_md > :last-child {
+#ibkckwmzsj .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#ismfkpkdnv .gt_row {
+#ibkckwmzsj .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -946,7 +946,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   overflow-x: hidden;
 }
 
-#ismfkpkdnv .gt_stub {
+#ibkckwmzsj .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -959,7 +959,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 5px;
 }
 
-#ismfkpkdnv .gt_stub_row_group {
+#ibkckwmzsj .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -973,15 +973,15 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   vertical-align: top;
 }
 
-#ismfkpkdnv .gt_row_group_first td {
+#ibkckwmzsj .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#ismfkpkdnv .gt_row_group_first th {
+#ibkckwmzsj .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#ismfkpkdnv .gt_summary_row {
+#ibkckwmzsj .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -991,16 +991,16 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 5px;
 }
 
-#ismfkpkdnv .gt_first_summary_row {
+#ibkckwmzsj .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_first_summary_row.thick {
+#ibkckwmzsj .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#ismfkpkdnv .gt_last_summary_row {
+#ibkckwmzsj .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1010,7 +1010,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-bottom-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_grand_summary_row {
+#ibkckwmzsj .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1020,7 +1020,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 5px;
 }
 
-#ismfkpkdnv .gt_first_grand_summary_row {
+#ibkckwmzsj .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1030,7 +1030,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-top-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_last_grand_summary_row_top {
+#ibkckwmzsj .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1040,11 +1040,11 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-bottom-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_striped {
+#ibkckwmzsj .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#ismfkpkdnv .gt_table_body {
+#ibkckwmzsj .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1053,7 +1053,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-bottom-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_footnotes {
+#ibkckwmzsj .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1067,7 +1067,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-right-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_footnote {
+#ibkckwmzsj .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1076,7 +1076,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 5px;
 }
 
-#ismfkpkdnv .gt_sourcenotes {
+#ibkckwmzsj .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1090,7 +1090,7 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   border-right-color: #D3D3D3;
 }
 
-#ismfkpkdnv .gt_sourcenote {
+#ibkckwmzsj .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1098,63 +1098,63 @@ <h3><span class="header-section-number">14.5.1</span> Example: Worried about COV
   padding-right: 5px;
 }
 
-#ismfkpkdnv .gt_left {
+#ibkckwmzsj .gt_left {
   text-align: left;
 }
 
-#ismfkpkdnv .gt_center {
+#ibkckwmzsj .gt_center {
   text-align: center;
 }
 
-#ismfkpkdnv .gt_right {
+#ibkckwmzsj .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#ismfkpkdnv .gt_font_normal {
+#ibkckwmzsj .gt_font_normal {
   font-weight: normal;
 }
 
-#ismfkpkdnv .gt_font_bold {
+#ibkckwmzsj .gt_font_bold {
   font-weight: bold;
 }
 
-#ismfkpkdnv .gt_font_italic {
+#ibkckwmzsj .gt_font_italic {
   font-style: italic;
 }
 
-#ismfkpkdnv .gt_super {
+#ibkckwmzsj .gt_super {
   font-size: 65%;
 }
 
-#ismfkpkdnv .gt_footnote_marks {
+#ibkckwmzsj .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#ismfkpkdnv .gt_asterisk {
+#ibkckwmzsj .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#ismfkpkdnv .gt_indent_1 {
+#ibkckwmzsj .gt_indent_1 {
   text-indent: 5px;
 }
 
-#ismfkpkdnv .gt_indent_2 {
+#ibkckwmzsj .gt_indent_2 {
   text-indent: 10px;
 }
 
-#ismfkpkdnv .gt_indent_3 {
+#ibkckwmzsj .gt_indent_3 {
   text-indent: 15px;
 }
 
-#ismfkpkdnv .gt_indent_4 {
+#ibkckwmzsj .gt_indent_4 {
   text-indent: 20px;
 }
 
-#ismfkpkdnv .gt_indent_5 {
+#ibkckwmzsj .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1264,11 +1264,11 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 <li>Yes, they switched to a combination of virtual and in-person classes: <code>Educ_Hybrid</code></li>
 </ul>
 <p>The unweighted cross-tab for these responses is included below. It reveals a wide range of impacts, where many combinations of effects on education are possible.</p>
-<div class="sourceCode" id="cb487"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb487-1"><a href="c14-ambarom-vignette.html#cb487-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
-<span id="cb487-2"><a href="c14-ambarom-vignette.html#cb487-2" tabindex="-1"></a>  <span class="fu">filter</span>(Educ_NotInSchool <span class="sc">==</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb487-3"><a href="c14-ambarom-vignette.html#cb487-3" tabindex="-1"></a>  <span class="fu">count</span>(Educ_NormalSchool,</span>
-<span id="cb487-4"><a href="c14-ambarom-vignette.html#cb487-4" tabindex="-1"></a>        Educ_VirtualSchool,</span>
-<span id="cb487-5"><a href="c14-ambarom-vignette.html#cb487-5" tabindex="-1"></a>        Educ_Hybrid)</span></code></pre></div>
+<div class="sourceCode" id="cb481"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb481-1"><a href="c14-ambarom-vignette.html#cb481-1" tabindex="-1"></a>ambarom <span class="sc">%&gt;%</span></span>
+<span id="cb481-2"><a href="c14-ambarom-vignette.html#cb481-2" tabindex="-1"></a>  <span class="fu">filter</span>(Educ_NotInSchool <span class="sc">==</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb481-3"><a href="c14-ambarom-vignette.html#cb481-3" tabindex="-1"></a>  <span class="fu">count</span>(Educ_NormalSchool,</span>
+<span id="cb481-4"><a href="c14-ambarom-vignette.html#cb481-4" tabindex="-1"></a>        Educ_VirtualSchool,</span>
+<span id="cb481-5"><a href="c14-ambarom-vignette.html#cb481-5" tabindex="-1"></a>        Educ_Hybrid)</span></code></pre></div>
 <pre><code>## # A tibble: 8 × 4
 ##   Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid     n
 ##               &lt;dbl&gt;              &lt;dbl&gt;       &lt;dbl&gt; &lt;int&gt;
@@ -1287,21 +1287,21 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 <li>What percentage of households indicated that they cut ties with their school?</li>
 </ul>
 <p>To find the answers, we create indicators for the first two questions, make national estimates for all three questions, and then construct a summary table for easy viewing. First, we create and inspect the indicators and their distributions using <code>survey_count()</code>.</p>
-<div class="sourceCode" id="cb489"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb489-1"><a href="c14-ambarom-vignette.html#cb489-1" tabindex="-1"></a>ambarom_des_educ <span class="ot">&lt;-</span> ambarom_des <span class="sc">%&gt;%</span></span>
-<span id="cb489-2"><a href="c14-ambarom-vignette.html#cb489-2" tabindex="-1"></a>  <span class="fu">filter</span>(Educ_NotInSchool <span class="sc">==</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb489-3"><a href="c14-ambarom-vignette.html#cb489-3" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb489-4"><a href="c14-ambarom-vignette.html#cb489-4" tabindex="-1"></a>    <span class="at">Educ_OnlyNormal =</span> (Educ_NormalSchool <span class="sc">==</span> <span class="dv">1</span> <span class="sc">&amp;</span></span>
-<span id="cb489-5"><a href="c14-ambarom-vignette.html#cb489-5" tabindex="-1"></a>                         Educ_VirtualSchool <span class="sc">==</span> <span class="dv">0</span> <span class="sc">&amp;</span></span>
-<span id="cb489-6"><a href="c14-ambarom-vignette.html#cb489-6" tabindex="-1"></a>                         Educ_Hybrid <span class="sc">==</span> <span class="dv">0</span>),</span>
-<span id="cb489-7"><a href="c14-ambarom-vignette.html#cb489-7" tabindex="-1"></a>    <span class="at">Educ_MediumChange =</span> (Educ_VirtualSchool <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span></span>
-<span id="cb489-8"><a href="c14-ambarom-vignette.html#cb489-8" tabindex="-1"></a>                           Educ_Hybrid <span class="sc">==</span> <span class="dv">1</span>)</span>
-<span id="cb489-9"><a href="c14-ambarom-vignette.html#cb489-9" tabindex="-1"></a>  )</span>
-<span id="cb489-10"><a href="c14-ambarom-vignette.html#cb489-10" tabindex="-1"></a></span>
-<span id="cb489-11"><a href="c14-ambarom-vignette.html#cb489-11" tabindex="-1"></a>ambarom_des_educ <span class="sc">%&gt;%</span></span>
-<span id="cb489-12"><a href="c14-ambarom-vignette.html#cb489-12" tabindex="-1"></a>  <span class="fu">survey_count</span>(Educ_OnlyNormal,</span>
-<span id="cb489-13"><a href="c14-ambarom-vignette.html#cb489-13" tabindex="-1"></a>               Educ_NormalSchool,</span>
-<span id="cb489-14"><a href="c14-ambarom-vignette.html#cb489-14" tabindex="-1"></a>               Educ_VirtualSchool,</span>
-<span id="cb489-15"><a href="c14-ambarom-vignette.html#cb489-15" tabindex="-1"></a>               Educ_Hybrid)</span></code></pre></div>
+<div class="sourceCode" id="cb483"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb483-1"><a href="c14-ambarom-vignette.html#cb483-1" tabindex="-1"></a>ambarom_des_educ <span class="ot">&lt;-</span> ambarom_des <span class="sc">%&gt;%</span></span>
+<span id="cb483-2"><a href="c14-ambarom-vignette.html#cb483-2" tabindex="-1"></a>  <span class="fu">filter</span>(Educ_NotInSchool <span class="sc">==</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb483-3"><a href="c14-ambarom-vignette.html#cb483-3" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb483-4"><a href="c14-ambarom-vignette.html#cb483-4" tabindex="-1"></a>    <span class="at">Educ_OnlyNormal =</span> (Educ_NormalSchool <span class="sc">==</span> <span class="dv">1</span> <span class="sc">&amp;</span></span>
+<span id="cb483-5"><a href="c14-ambarom-vignette.html#cb483-5" tabindex="-1"></a>                         Educ_VirtualSchool <span class="sc">==</span> <span class="dv">0</span> <span class="sc">&amp;</span></span>
+<span id="cb483-6"><a href="c14-ambarom-vignette.html#cb483-6" tabindex="-1"></a>                         Educ_Hybrid <span class="sc">==</span> <span class="dv">0</span>),</span>
+<span id="cb483-7"><a href="c14-ambarom-vignette.html#cb483-7" tabindex="-1"></a>    <span class="at">Educ_MediumChange =</span> (Educ_VirtualSchool <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span></span>
+<span id="cb483-8"><a href="c14-ambarom-vignette.html#cb483-8" tabindex="-1"></a>                           Educ_Hybrid <span class="sc">==</span> <span class="dv">1</span>)</span>
+<span id="cb483-9"><a href="c14-ambarom-vignette.html#cb483-9" tabindex="-1"></a>  )</span>
+<span id="cb483-10"><a href="c14-ambarom-vignette.html#cb483-10" tabindex="-1"></a></span>
+<span id="cb483-11"><a href="c14-ambarom-vignette.html#cb483-11" tabindex="-1"></a>ambarom_des_educ <span class="sc">%&gt;%</span></span>
+<span id="cb483-12"><a href="c14-ambarom-vignette.html#cb483-12" tabindex="-1"></a>  <span class="fu">survey_count</span>(Educ_OnlyNormal,</span>
+<span id="cb483-13"><a href="c14-ambarom-vignette.html#cb483-13" tabindex="-1"></a>               Educ_NormalSchool,</span>
+<span id="cb483-14"><a href="c14-ambarom-vignette.html#cb483-14" tabindex="-1"></a>               Educ_VirtualSchool,</span>
+<span id="cb483-15"><a href="c14-ambarom-vignette.html#cb483-15" tabindex="-1"></a>               Educ_Hybrid)</span></code></pre></div>
 <pre><code>## # A tibble: 8 × 6
 ##   Educ_OnlyNormal Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid
 ##   &lt;lgl&gt;                       &lt;dbl&gt;              &lt;dbl&gt;       &lt;dbl&gt;
@@ -1314,10 +1314,10 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 ## 7 FALSE                           1                  1           1
 ## 8 TRUE                            1                  0           0
 ## # ℹ 2 more variables: n &lt;dbl&gt;, n_se &lt;dbl&gt;</code></pre>
-<div class="sourceCode" id="cb491"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb491-1"><a href="c14-ambarom-vignette.html#cb491-1" tabindex="-1"></a>ambarom_des_educ <span class="sc">%&gt;%</span></span>
-<span id="cb491-2"><a href="c14-ambarom-vignette.html#cb491-2" tabindex="-1"></a>  <span class="fu">survey_count</span>(Educ_MediumChange,</span>
-<span id="cb491-3"><a href="c14-ambarom-vignette.html#cb491-3" tabindex="-1"></a>               Educ_VirtualSchool,</span>
-<span id="cb491-4"><a href="c14-ambarom-vignette.html#cb491-4" tabindex="-1"></a>               Educ_Hybrid)</span></code></pre></div>
+<div class="sourceCode" id="cb485"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb485-1"><a href="c14-ambarom-vignette.html#cb485-1" tabindex="-1"></a>ambarom_des_educ <span class="sc">%&gt;%</span></span>
+<span id="cb485-2"><a href="c14-ambarom-vignette.html#cb485-2" tabindex="-1"></a>  <span class="fu">survey_count</span>(Educ_MediumChange,</span>
+<span id="cb485-3"><a href="c14-ambarom-vignette.html#cb485-3" tabindex="-1"></a>               Educ_VirtualSchool,</span>
+<span id="cb485-4"><a href="c14-ambarom-vignette.html#cb485-4" tabindex="-1"></a>               Educ_Hybrid)</span></code></pre></div>
 <pre><code>## # A tibble: 4 × 5
 ##   Educ_MediumChange Educ_VirtualSchool Educ_Hybrid     n  n_se
 ##   &lt;lgl&gt;                          &lt;dbl&gt;       &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
@@ -1326,16 +1326,16 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 ## 3 TRUE                               1           0 3812. 49.4 
 ## 4 TRUE                               1           1  136.  9.86</code></pre>
 <p>Next, we group the data by country and calculate the population estimates for our three questions.</p>
-<div class="sourceCode" id="cb493"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb493-1"><a href="c14-ambarom-vignette.html#cb493-1" tabindex="-1"></a>covid_educ_ests <span class="ot">&lt;-</span></span>
-<span id="cb493-2"><a href="c14-ambarom-vignette.html#cb493-2" tabindex="-1"></a>  ambarom_des_educ <span class="sc">%&gt;%</span></span>
-<span id="cb493-3"><a href="c14-ambarom-vignette.html#cb493-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
-<span id="cb493-4"><a href="c14-ambarom-vignette.html#cb493-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb493-5"><a href="c14-ambarom-vignette.html#cb493-5" tabindex="-1"></a>    <span class="at">p_onlynormal =</span> <span class="fu">survey_mean</span>(Educ_OnlyNormal, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
-<span id="cb493-6"><a href="c14-ambarom-vignette.html#cb493-6" tabindex="-1"></a>    <span class="at">p_mediumchange =</span> <span class="fu">survey_mean</span>(Educ_MediumChange, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
-<span id="cb493-7"><a href="c14-ambarom-vignette.html#cb493-7" tabindex="-1"></a>    <span class="at">p_noschool =</span> <span class="fu">survey_mean</span>(Educ_NoSchool, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
-<span id="cb493-8"><a href="c14-ambarom-vignette.html#cb493-8" tabindex="-1"></a>  ) </span>
-<span id="cb493-9"><a href="c14-ambarom-vignette.html#cb493-9" tabindex="-1"></a></span>
-<span id="cb493-10"><a href="c14-ambarom-vignette.html#cb493-10" tabindex="-1"></a>covid_educ_ests</span></code></pre></div>
+<div class="sourceCode" id="cb487"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb487-1"><a href="c14-ambarom-vignette.html#cb487-1" tabindex="-1"></a>covid_educ_ests <span class="ot">&lt;-</span></span>
+<span id="cb487-2"><a href="c14-ambarom-vignette.html#cb487-2" tabindex="-1"></a>  ambarom_des_educ <span class="sc">%&gt;%</span></span>
+<span id="cb487-3"><a href="c14-ambarom-vignette.html#cb487-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
+<span id="cb487-4"><a href="c14-ambarom-vignette.html#cb487-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb487-5"><a href="c14-ambarom-vignette.html#cb487-5" tabindex="-1"></a>    <span class="at">p_onlynormal =</span> <span class="fu">survey_mean</span>(Educ_OnlyNormal, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
+<span id="cb487-6"><a href="c14-ambarom-vignette.html#cb487-6" tabindex="-1"></a>    <span class="at">p_mediumchange =</span> <span class="fu">survey_mean</span>(Educ_MediumChange, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
+<span id="cb487-7"><a href="c14-ambarom-vignette.html#cb487-7" tabindex="-1"></a>    <span class="at">p_noschool =</span> <span class="fu">survey_mean</span>(Educ_NoSchool, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
+<span id="cb487-8"><a href="c14-ambarom-vignette.html#cb487-8" tabindex="-1"></a>  ) </span>
+<span id="cb487-9"><a href="c14-ambarom-vignette.html#cb487-9" tabindex="-1"></a></span>
+<span id="cb487-10"><a href="c14-ambarom-vignette.html#cb487-10" tabindex="-1"></a>covid_educ_ests</span></code></pre></div>
 <pre><code>## # A tibble: 16 × 7
 ##    Country p_onlynormal p_onlynormal_se p_mediumchange p_mediumchange_se
 ##    &lt;fct&gt;          &lt;dbl&gt;           &lt;dbl&gt;          &lt;dbl&gt;             &lt;dbl&gt;
@@ -1357,43 +1357,43 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 ## 16 Uruguay        8.60            1.40           84.3              2.02 
 ## # ℹ 2 more variables: p_noschool &lt;dbl&gt;, p_noschool_se &lt;dbl&gt;</code></pre>
 <p>Finally, to view the results for all countries, we can use the {gt} package to construct Table <a href="c14-ambarom-vignette.html#tab:ambarom-covid-ed-der-tab">14.2</a>.</p>
-<div class="sourceCode" id="cb495"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb495-1"><a href="c14-ambarom-vignette.html#cb495-1" tabindex="-1"></a>covid_educ_ests_gt <span class="ot">&lt;-</span> covid_educ_ests <span class="sc">%&gt;%</span></span>
-<span id="cb495-2"><a href="c14-ambarom-vignette.html#cb495-2" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb495-3"><a href="c14-ambarom-vignette.html#cb495-3" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
-<span id="cb495-4"><a href="c14-ambarom-vignette.html#cb495-4" tabindex="-1"></a>    <span class="at">p_onlynormal =</span> <span class="st">&quot;%&quot;</span>,</span>
-<span id="cb495-5"><a href="c14-ambarom-vignette.html#cb495-5" tabindex="-1"></a>    <span class="at">p_onlynormal_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
-<span id="cb495-6"><a href="c14-ambarom-vignette.html#cb495-6" tabindex="-1"></a>    <span class="at">p_mediumchange =</span> <span class="st">&quot;%&quot;</span>,</span>
-<span id="cb495-7"><a href="c14-ambarom-vignette.html#cb495-7" tabindex="-1"></a>    <span class="at">p_mediumchange_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
-<span id="cb495-8"><a href="c14-ambarom-vignette.html#cb495-8" tabindex="-1"></a>    <span class="at">p_noschool =</span> <span class="st">&quot;%&quot;</span>,</span>
-<span id="cb495-9"><a href="c14-ambarom-vignette.html#cb495-9" tabindex="-1"></a>    <span class="at">p_noschool_se =</span> <span class="st">&quot;SE&quot;</span></span>
-<span id="cb495-10"><a href="c14-ambarom-vignette.html#cb495-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb495-11"><a href="c14-ambarom-vignette.html#cb495-11" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Normal school only&quot;</span>,</span>
-<span id="cb495-12"><a href="c14-ambarom-vignette.html#cb495-12" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_onlynormal&quot;</span>, <span class="st">&quot;p_onlynormal_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb495-13"><a href="c14-ambarom-vignette.html#cb495-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Medium change&quot;</span>,</span>
-<span id="cb495-14"><a href="c14-ambarom-vignette.html#cb495-14" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_mediumchange&quot;</span>, <span class="st">&quot;p_mediumchange_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb495-15"><a href="c14-ambarom-vignette.html#cb495-15" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Cut ties with school&quot;</span>,</span>
-<span id="cb495-16"><a href="c14-ambarom-vignette.html#cb495-16" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_noschool&quot;</span>, <span class="st">&quot;p_noschool_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb495-17"><a href="c14-ambarom-vignette.html#cb495-17" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals =</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb495-18"><a href="c14-ambarom-vignette.html#cb495-18" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;AmericasBarometer Surveys, 2021&quot;</span>)</span></code></pre></div>
-<div class="sourceCode" id="cb496"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb496-1"><a href="c14-ambarom-vignette.html#cb496-1" tabindex="-1"></a>covid_educ_ests_gt</span></code></pre></div>
-
-<div id="zpnruhcqur" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#zpnruhcqur table {
+<div class="sourceCode" id="cb489"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb489-1"><a href="c14-ambarom-vignette.html#cb489-1" tabindex="-1"></a>covid_educ_ests_gt <span class="ot">&lt;-</span> covid_educ_ests <span class="sc">%&gt;%</span></span>
+<span id="cb489-2"><a href="c14-ambarom-vignette.html#cb489-2" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb489-3"><a href="c14-ambarom-vignette.html#cb489-3" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb489-4"><a href="c14-ambarom-vignette.html#cb489-4" tabindex="-1"></a>    <span class="at">p_onlynormal =</span> <span class="st">&quot;%&quot;</span>,</span>
+<span id="cb489-5"><a href="c14-ambarom-vignette.html#cb489-5" tabindex="-1"></a>    <span class="at">p_onlynormal_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
+<span id="cb489-6"><a href="c14-ambarom-vignette.html#cb489-6" tabindex="-1"></a>    <span class="at">p_mediumchange =</span> <span class="st">&quot;%&quot;</span>,</span>
+<span id="cb489-7"><a href="c14-ambarom-vignette.html#cb489-7" tabindex="-1"></a>    <span class="at">p_mediumchange_se =</span> <span class="st">&quot;SE&quot;</span>,</span>
+<span id="cb489-8"><a href="c14-ambarom-vignette.html#cb489-8" tabindex="-1"></a>    <span class="at">p_noschool =</span> <span class="st">&quot;%&quot;</span>,</span>
+<span id="cb489-9"><a href="c14-ambarom-vignette.html#cb489-9" tabindex="-1"></a>    <span class="at">p_noschool_se =</span> <span class="st">&quot;SE&quot;</span></span>
+<span id="cb489-10"><a href="c14-ambarom-vignette.html#cb489-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb489-11"><a href="c14-ambarom-vignette.html#cb489-11" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Normal school only&quot;</span>,</span>
+<span id="cb489-12"><a href="c14-ambarom-vignette.html#cb489-12" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_onlynormal&quot;</span>, <span class="st">&quot;p_onlynormal_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb489-13"><a href="c14-ambarom-vignette.html#cb489-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Medium change&quot;</span>,</span>
+<span id="cb489-14"><a href="c14-ambarom-vignette.html#cb489-14" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_mediumchange&quot;</span>, <span class="st">&quot;p_mediumchange_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb489-15"><a href="c14-ambarom-vignette.html#cb489-15" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(<span class="at">label =</span> <span class="st">&quot;Cut ties with school&quot;</span>,</span>
+<span id="cb489-16"><a href="c14-ambarom-vignette.html#cb489-16" tabindex="-1"></a>              <span class="at">columns =</span> <span class="fu">c</span>(<span class="st">&quot;p_noschool&quot;</span>, <span class="st">&quot;p_noschool_se&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb489-17"><a href="c14-ambarom-vignette.html#cb489-17" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals =</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb489-18"><a href="c14-ambarom-vignette.html#cb489-18" tabindex="-1"></a>  <span class="fu">tab_source_note</span>(<span class="st">&quot;AmericasBarometer Surveys, 2021&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb490"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb490-1"><a href="c14-ambarom-vignette.html#cb490-1" tabindex="-1"></a>covid_educ_ests_gt</span></code></pre></div>
+
+<div id="hrwokkyhya" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#hrwokkyhya table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#zpnruhcqur thead, #zpnruhcqur tbody, #zpnruhcqur tfoot, #zpnruhcqur tr, #zpnruhcqur td, #zpnruhcqur th {
+#hrwokkyhya thead, #hrwokkyhya tbody, #hrwokkyhya tfoot, #hrwokkyhya tr, #hrwokkyhya td, #hrwokkyhya th {
   border-style: none;
 }
 
-#zpnruhcqur p {
+#hrwokkyhya p {
   margin: 0;
   padding: 0;
 }
 
-#zpnruhcqur .gt_table {
+#hrwokkyhya .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1419,12 +1419,12 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-left-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_caption {
+#hrwokkyhya .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#zpnruhcqur .gt_title {
+#hrwokkyhya .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1436,7 +1436,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-bottom-width: 0;
 }
 
-#zpnruhcqur .gt_subtitle {
+#hrwokkyhya .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1448,7 +1448,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-top-width: 0;
 }
 
-#zpnruhcqur .gt_heading {
+#hrwokkyhya .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1460,13 +1460,13 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-right-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_bottom_border {
+#hrwokkyhya .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_col_headings {
+#hrwokkyhya .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1481,7 +1481,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-right-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_col_heading {
+#hrwokkyhya .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1501,7 +1501,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   overflow-x: hidden;
 }
 
-#zpnruhcqur .gt_column_spanner_outer {
+#hrwokkyhya .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1513,15 +1513,15 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 4px;
 }
 
-#zpnruhcqur .gt_column_spanner_outer:first-child {
+#hrwokkyhya .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#zpnruhcqur .gt_column_spanner_outer:last-child {
+#hrwokkyhya .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#zpnruhcqur .gt_column_spanner {
+#hrwokkyhya .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1533,11 +1533,11 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   width: 100%;
 }
 
-#zpnruhcqur .gt_spanner_row {
+#hrwokkyhya .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#zpnruhcqur .gt_group_heading {
+#hrwokkyhya .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1563,7 +1563,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   text-align: left;
 }
 
-#zpnruhcqur .gt_empty_group_heading {
+#hrwokkyhya .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1578,15 +1578,15 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   vertical-align: middle;
 }
 
-#zpnruhcqur .gt_from_md > :first-child {
+#hrwokkyhya .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#zpnruhcqur .gt_from_md > :last-child {
+#hrwokkyhya .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#zpnruhcqur .gt_row {
+#hrwokkyhya .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1605,7 +1605,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   overflow-x: hidden;
 }
 
-#zpnruhcqur .gt_stub {
+#hrwokkyhya .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1618,7 +1618,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 5px;
 }
 
-#zpnruhcqur .gt_stub_row_group {
+#hrwokkyhya .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1632,15 +1632,15 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   vertical-align: top;
 }
 
-#zpnruhcqur .gt_row_group_first td {
+#hrwokkyhya .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#zpnruhcqur .gt_row_group_first th {
+#hrwokkyhya .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#zpnruhcqur .gt_summary_row {
+#hrwokkyhya .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1650,16 +1650,16 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 5px;
 }
 
-#zpnruhcqur .gt_first_summary_row {
+#hrwokkyhya .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_first_summary_row.thick {
+#hrwokkyhya .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#zpnruhcqur .gt_last_summary_row {
+#hrwokkyhya .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1669,7 +1669,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-bottom-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_grand_summary_row {
+#hrwokkyhya .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1679,7 +1679,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 5px;
 }
 
-#zpnruhcqur .gt_first_grand_summary_row {
+#hrwokkyhya .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1689,7 +1689,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-top-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_last_grand_summary_row_top {
+#hrwokkyhya .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1699,11 +1699,11 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-bottom-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_striped {
+#hrwokkyhya .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#zpnruhcqur .gt_table_body {
+#hrwokkyhya .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1712,7 +1712,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-bottom-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_footnotes {
+#hrwokkyhya .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1726,7 +1726,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-right-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_footnote {
+#hrwokkyhya .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1735,7 +1735,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 5px;
 }
 
-#zpnruhcqur .gt_sourcenotes {
+#hrwokkyhya .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1749,7 +1749,7 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   border-right-color: #D3D3D3;
 }
 
-#zpnruhcqur .gt_sourcenote {
+#hrwokkyhya .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1757,63 +1757,63 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
   padding-right: 5px;
 }
 
-#zpnruhcqur .gt_left {
+#hrwokkyhya .gt_left {
   text-align: left;
 }
 
-#zpnruhcqur .gt_center {
+#hrwokkyhya .gt_center {
   text-align: center;
 }
 
-#zpnruhcqur .gt_right {
+#hrwokkyhya .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#zpnruhcqur .gt_font_normal {
+#hrwokkyhya .gt_font_normal {
   font-weight: normal;
 }
 
-#zpnruhcqur .gt_font_bold {
+#hrwokkyhya .gt_font_bold {
   font-weight: bold;
 }
 
-#zpnruhcqur .gt_font_italic {
+#hrwokkyhya .gt_font_italic {
   font-style: italic;
 }
 
-#zpnruhcqur .gt_super {
+#hrwokkyhya .gt_super {
   font-size: 65%;
 }
 
-#zpnruhcqur .gt_footnote_marks {
+#hrwokkyhya .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#zpnruhcqur .gt_asterisk {
+#hrwokkyhya .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#zpnruhcqur .gt_indent_1 {
+#hrwokkyhya .gt_indent_1 {
   text-indent: 5px;
 }
 
-#zpnruhcqur .gt_indent_2 {
+#hrwokkyhya .gt_indent_2 {
   text-indent: 10px;
 }
 
-#zpnruhcqur .gt_indent_3 {
+#hrwokkyhya .gt_indent_3 {
   text-indent: 15px;
 }
 
-#zpnruhcqur .gt_indent_4 {
+#hrwokkyhya .gt_indent_4 {
   text-indent: 20px;
 }
 
-#zpnruhcqur .gt_indent_5 {
+#hrwokkyhya .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1969,54 +1969,54 @@ <h3><span class="header-section-number">14.5.2</span> Example: Education affecte
 </div>
 <div id="ambarom-maps" class="section level2 hasAnchor" number="14.6">
 <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="c14-ambarom-vignette.html#ambarom-maps" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>While the table effectively presents the data, a map could also be insightful. To generate maps of the countries, we can use the package {rnaturalearth} and subset North and South America with the <code>ne_countries()</code> function <span class="citation">(<a href="#ref-R-rnaturalearth">Massicotte and South 2023</a>)</span>. The function returns an sf (simple features) object with many columns <span class="citation">(<a href="#ref-sf2023">Pebesma and Bivand 2023</a>)</span>, but most importantly, <code>soverignt</code> (sovereignty), <code>geounit</code> (country or territory), and <code>geometry</code> (the shape). For an example of the difference between sovereignty and country/territory, the United States, Puerto Rico, and the US Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure <a href="c14-ambarom-vignette.html#fig:ambarom-americas-map">14.1</a> using <code>geom_sf()</code> from the {ggplot2} package which plots sf objects <span class="citation">(<a href="#ref-ggplot22016">Wickham 2016</a>)</span>.</p>
-<div class="sourceCode" id="cb497"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb497-1"><a href="c14-ambarom-vignette.html#cb497-1" tabindex="-1"></a>country_shape <span class="ot">&lt;-</span></span>
-<span id="cb497-2"><a href="c14-ambarom-vignette.html#cb497-2" tabindex="-1"></a>  <span class="fu">ne_countries</span>(</span>
-<span id="cb497-3"><a href="c14-ambarom-vignette.html#cb497-3" tabindex="-1"></a>    <span class="at">scale =</span> <span class="st">&quot;medium&quot;</span>,</span>
-<span id="cb497-4"><a href="c14-ambarom-vignette.html#cb497-4" tabindex="-1"></a>    <span class="at">returnclass =</span> <span class="st">&quot;sf&quot;</span>,</span>
-<span id="cb497-5"><a href="c14-ambarom-vignette.html#cb497-5" tabindex="-1"></a>    <span class="at">continent =</span> <span class="fu">c</span>(<span class="st">&quot;North America&quot;</span>, <span class="st">&quot;South America&quot;</span>)</span>
-<span id="cb497-6"><a href="c14-ambarom-vignette.html#cb497-6" tabindex="-1"></a>  )</span>
-<span id="cb497-7"><a href="c14-ambarom-vignette.html#cb497-7" tabindex="-1"></a></span>
-<span id="cb497-8"><a href="c14-ambarom-vignette.html#cb497-8" tabindex="-1"></a>country_shape <span class="sc">%&gt;%</span></span>
-<span id="cb497-9"><a href="c14-ambarom-vignette.html#cb497-9" tabindex="-1"></a>  <span class="fu">ggplot</span>() <span class="sc">+</span></span>
-<span id="cb497-10"><a href="c14-ambarom-vignette.html#cb497-10" tabindex="-1"></a>  <span class="fu">geom_sf</span>()</span></code></pre></div>
+<p>While the table effectively presents the data, a map could also be insightful. To generate maps of the countries, we can use the package {rnaturalearth} and subset North and South America with the <code>ne_countries()</code> function <span class="citation">(<a href="#ref-R-rnaturalearth">Massicotte and South 2023</a>)</span>. The function returns an sf (simple features) object with many columns <span class="citation">(<a href="#ref-sf2023">Pebesma and Bivand 2023</a>)</span>, the most important of which are <code>sovereignt</code> (sovereignty), <code>geounit</code> (country or territory), and <code>geometry</code> (the shape). As an example of the difference between sovereignty and country/territory, the United States, Puerto Rico, and the U.S. Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure <a href="c14-ambarom-vignette.html#fig:ambarom-americas-map">14.1</a> using <code>geom_sf()</code> from the {ggplot2} package, which plots sf objects <span class="citation">(<a href="#ref-ggplot2wickham">Wickham 2016</a>)</span>.</p>
+<div class="sourceCode" id="cb491"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb491-1"><a href="c14-ambarom-vignette.html#cb491-1" tabindex="-1"></a>country_shape <span class="ot">&lt;-</span></span>
+<span id="cb491-2"><a href="c14-ambarom-vignette.html#cb491-2" tabindex="-1"></a>  <span class="fu">ne_countries</span>(</span>
+<span id="cb491-3"><a href="c14-ambarom-vignette.html#cb491-3" tabindex="-1"></a>    <span class="at">scale =</span> <span class="st">&quot;medium&quot;</span>,</span>
+<span id="cb491-4"><a href="c14-ambarom-vignette.html#cb491-4" tabindex="-1"></a>    <span class="at">returnclass =</span> <span class="st">&quot;sf&quot;</span>,</span>
+<span id="cb491-5"><a href="c14-ambarom-vignette.html#cb491-5" tabindex="-1"></a>    <span class="at">continent =</span> <span class="fu">c</span>(<span class="st">&quot;North America&quot;</span>, <span class="st">&quot;South America&quot;</span>)</span>
+<span id="cb491-6"><a href="c14-ambarom-vignette.html#cb491-6" tabindex="-1"></a>  )</span>
+<span id="cb491-7"><a href="c14-ambarom-vignette.html#cb491-7" tabindex="-1"></a></span>
+<span id="cb491-8"><a href="c14-ambarom-vignette.html#cb491-8" tabindex="-1"></a>country_shape <span class="sc">%&gt;%</span></span>
+<span id="cb491-9"><a href="c14-ambarom-vignette.html#cb491-9" tabindex="-1"></a>  <span class="fu">ggplot</span>() <span class="sc">+</span></span>
+<span id="cb491-10"><a href="c14-ambarom-vignette.html#cb491-10" tabindex="-1"></a>  <span class="fu">geom_sf</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:ambarom-americas-map"></span>
-<img src="bookdown_files/figure-html/ambarom-americas-map-1.png" alt="Map of North and South America" width="672" />
+<img src="bookdown_files/figure-html/ambarom-americas-map-1.png" alt="A blank map of the world, showing only the outlines of the countries in the Western Hemisphere." width="672" />
 <p class="caption">
 FIGURE 14.1: Map of North and South America
 </p>
 </div>
 <p>The map in Figure <a href="c14-ambarom-vignette.html#fig:ambarom-americas-map">14.1</a> appears very wide because the Aleutian Islands in Alaska extend into the Eastern Hemisphere. Using <code>st_crop()</code> from the {sf} package, we can crop the shapefile to include only the Western Hemisphere, which removes some of the trailing islands of Alaska.</p>
-<div class="sourceCode" id="cb498"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb498-1"><a href="c14-ambarom-vignette.html#cb498-1" tabindex="-1"></a>country_shape_crop <span class="ot">&lt;-</span> country_shape <span class="sc">%&gt;%</span></span>
-<span id="cb498-2"><a href="c14-ambarom-vignette.html#cb498-2" tabindex="-1"></a>  <span class="fu">st_crop</span>(<span class="fu">c</span>(<span class="at">xmin =</span> <span class="sc">-</span><span class="dv">180</span>,</span>
-<span id="cb498-3"><a href="c14-ambarom-vignette.html#cb498-3" tabindex="-1"></a>            <span class="at">xmax =</span> <span class="dv">0</span>,</span>
-<span id="cb498-4"><a href="c14-ambarom-vignette.html#cb498-4" tabindex="-1"></a>            <span class="at">ymin =</span> <span class="sc">-</span><span class="dv">90</span>,</span>
-<span id="cb498-5"><a href="c14-ambarom-vignette.html#cb498-5" tabindex="-1"></a>            <span class="at">ymax =</span> <span class="dv">90</span>)) </span></code></pre></div>
-<p>Now that we have the necessary shape files, our next step is to match our survey data to the map. Countries can be named differently (e.g., “U.S”, “U.S.A”, “United States”). To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data. To do this, we can use the <code>anti_join()</code> function to identify the countries in the survey data that aren’t in the map data. For example, as shown below, the United States is referred to as “United States” in the survey data but “United States of America” in the map data. Table <a href="c14-ambarom-vignette.html#tab:ambarom-map-merge-check-1-tab">14.3</a> shows the countries in the survey data but not the map data and Table <a href="c14-ambarom-vignette.html#tab:ambarom-map-merge-check-2-tab">14.4</a> shows the countries in the map data but not the survey data.</p>
-<div class="sourceCode" id="cb499"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb499-1"><a href="c14-ambarom-vignette.html#cb499-1" tabindex="-1"></a>survey_country_list <span class="ot">&lt;-</span> ambarom <span class="sc">%&gt;%</span> <span class="fu">distinct</span>(Country)</span>
-<span id="cb499-2"><a href="c14-ambarom-vignette.html#cb499-2" tabindex="-1"></a></span>
-<span id="cb499-3"><a href="c14-ambarom-vignette.html#cb499-3" tabindex="-1"></a>survey_country_list_gt <span class="ot">&lt;-</span> survey_country_list <span class="sc">%&gt;%</span></span>
-<span id="cb499-4"><a href="c14-ambarom-vignette.html#cb499-4" tabindex="-1"></a>  <span class="fu">anti_join</span>(country_shape_crop, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;Country&quot;</span> <span class="ot">=</span> <span class="st">&quot;geounit&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb499-5"><a href="c14-ambarom-vignette.html#cb499-5" tabindex="-1"></a>  <span class="fu">gt</span>()</span></code></pre></div>
-<div class="sourceCode" id="cb500"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb500-1"><a href="c14-ambarom-vignette.html#cb500-1" tabindex="-1"></a>survey_country_list_gt</span></code></pre></div>
-
-<div id="sgxskozkog" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#sgxskozkog table {
+<div class="sourceCode" id="cb492"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb492-1"><a href="c14-ambarom-vignette.html#cb492-1" tabindex="-1"></a>country_shape_crop <span class="ot">&lt;-</span> country_shape <span class="sc">%&gt;%</span></span>
+<span id="cb492-2"><a href="c14-ambarom-vignette.html#cb492-2" tabindex="-1"></a>  <span class="fu">st_crop</span>(<span class="fu">c</span>(<span class="at">xmin =</span> <span class="sc">-</span><span class="dv">180</span>,</span>
+<span id="cb492-3"><a href="c14-ambarom-vignette.html#cb492-3" tabindex="-1"></a>            <span class="at">xmax =</span> <span class="dv">0</span>,</span>
+<span id="cb492-4"><a href="c14-ambarom-vignette.html#cb492-4" tabindex="-1"></a>            <span class="at">ymin =</span> <span class="sc">-</span><span class="dv">90</span>,</span>
+<span id="cb492-5"><a href="c14-ambarom-vignette.html#cb492-5" tabindex="-1"></a>            <span class="at">ymax =</span> <span class="dv">90</span>)) </span></code></pre></div>
+<p>Now that we have the necessary shapefile, our next step is to match our survey data to the map. Countries can be named differently (e.g., “U.S”, “U.S.A”, “United States”). To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data. To do this, we can use the <code>anti_join()</code> function to identify the countries in the survey data that aren’t in the map data. For example, as shown below, the United States is referred to as “United States” in the survey data but “United States of America” in the map data. Table <a href="c14-ambarom-vignette.html#tab:ambarom-map-merge-check-1-tab">14.3</a> shows the countries in the survey data but not the map data, and Table <a href="c14-ambarom-vignette.html#tab:ambarom-map-merge-check-2-tab">14.4</a> shows the countries in the map data but not the survey data.</p>
+<div class="sourceCode" id="cb493"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb493-1"><a href="c14-ambarom-vignette.html#cb493-1" tabindex="-1"></a>survey_country_list <span class="ot">&lt;-</span> ambarom <span class="sc">%&gt;%</span> <span class="fu">distinct</span>(Country)</span>
+<span id="cb493-2"><a href="c14-ambarom-vignette.html#cb493-2" tabindex="-1"></a></span>
+<span id="cb493-3"><a href="c14-ambarom-vignette.html#cb493-3" tabindex="-1"></a>survey_country_list_gt <span class="ot">&lt;-</span> survey_country_list <span class="sc">%&gt;%</span></span>
+<span id="cb493-4"><a href="c14-ambarom-vignette.html#cb493-4" tabindex="-1"></a>  <span class="fu">anti_join</span>(country_shape_crop, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;Country&quot;</span> <span class="ot">=</span> <span class="st">&quot;geounit&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb493-5"><a href="c14-ambarom-vignette.html#cb493-5" tabindex="-1"></a>  <span class="fu">gt</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb494"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb494-1"><a href="c14-ambarom-vignette.html#cb494-1" tabindex="-1"></a>survey_country_list_gt</span></code></pre></div>
+
+<div id="uqkclffrjq" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#uqkclffrjq table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#sgxskozkog thead, #sgxskozkog tbody, #sgxskozkog tfoot, #sgxskozkog tr, #sgxskozkog td, #sgxskozkog th {
+#uqkclffrjq thead, #uqkclffrjq tbody, #uqkclffrjq tfoot, #uqkclffrjq tr, #uqkclffrjq td, #uqkclffrjq th {
   border-style: none;
 }
 
-#sgxskozkog p {
+#uqkclffrjq p {
   margin: 0;
   padding: 0;
 }
 
-#sgxskozkog .gt_table {
+#uqkclffrjq .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2042,12 +2042,12 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-left-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_caption {
+#uqkclffrjq .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#sgxskozkog .gt_title {
+#uqkclffrjq .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2059,7 +2059,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-width: 0;
 }
 
-#sgxskozkog .gt_subtitle {
+#uqkclffrjq .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2071,7 +2071,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-top-width: 0;
 }
 
-#sgxskozkog .gt_heading {
+#uqkclffrjq .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2083,13 +2083,13 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_bottom_border {
+#uqkclffrjq .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_col_headings {
+#uqkclffrjq .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2104,7 +2104,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_col_heading {
+#uqkclffrjq .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2124,7 +2124,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   overflow-x: hidden;
 }
 
-#sgxskozkog .gt_column_spanner_outer {
+#uqkclffrjq .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2136,15 +2136,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 4px;
 }
 
-#sgxskozkog .gt_column_spanner_outer:first-child {
+#uqkclffrjq .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#sgxskozkog .gt_column_spanner_outer:last-child {
+#uqkclffrjq .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#sgxskozkog .gt_column_spanner {
+#uqkclffrjq .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2156,11 +2156,11 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   width: 100%;
 }
 
-#sgxskozkog .gt_spanner_row {
+#uqkclffrjq .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#sgxskozkog .gt_group_heading {
+#uqkclffrjq .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2186,7 +2186,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   text-align: left;
 }
 
-#sgxskozkog .gt_empty_group_heading {
+#uqkclffrjq .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2201,15 +2201,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   vertical-align: middle;
 }
 
-#sgxskozkog .gt_from_md > :first-child {
+#uqkclffrjq .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#sgxskozkog .gt_from_md > :last-child {
+#uqkclffrjq .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#sgxskozkog .gt_row {
+#uqkclffrjq .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2228,7 +2228,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   overflow-x: hidden;
 }
 
-#sgxskozkog .gt_stub {
+#uqkclffrjq .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2241,7 +2241,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#sgxskozkog .gt_stub_row_group {
+#uqkclffrjq .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2255,15 +2255,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   vertical-align: top;
 }
 
-#sgxskozkog .gt_row_group_first td {
+#uqkclffrjq .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#sgxskozkog .gt_row_group_first th {
+#uqkclffrjq .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#sgxskozkog .gt_summary_row {
+#uqkclffrjq .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2273,16 +2273,16 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#sgxskozkog .gt_first_summary_row {
+#uqkclffrjq .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_first_summary_row.thick {
+#uqkclffrjq .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#sgxskozkog .gt_last_summary_row {
+#uqkclffrjq .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2292,7 +2292,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_grand_summary_row {
+#uqkclffrjq .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2302,7 +2302,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#sgxskozkog .gt_first_grand_summary_row {
+#uqkclffrjq .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2312,7 +2312,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-top-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_last_grand_summary_row_top {
+#uqkclffrjq .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2322,11 +2322,11 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_striped {
+#uqkclffrjq .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#sgxskozkog .gt_table_body {
+#uqkclffrjq .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2335,7 +2335,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_footnotes {
+#uqkclffrjq .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2349,7 +2349,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_footnote {
+#uqkclffrjq .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2358,7 +2358,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#sgxskozkog .gt_sourcenotes {
+#uqkclffrjq .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2372,7 +2372,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#sgxskozkog .gt_sourcenote {
+#uqkclffrjq .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2380,63 +2380,63 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#sgxskozkog .gt_left {
+#uqkclffrjq .gt_left {
   text-align: left;
 }
 
-#sgxskozkog .gt_center {
+#uqkclffrjq .gt_center {
   text-align: center;
 }
 
-#sgxskozkog .gt_right {
+#uqkclffrjq .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#sgxskozkog .gt_font_normal {
+#uqkclffrjq .gt_font_normal {
   font-weight: normal;
 }
 
-#sgxskozkog .gt_font_bold {
+#uqkclffrjq .gt_font_bold {
   font-weight: bold;
 }
 
-#sgxskozkog .gt_font_italic {
+#uqkclffrjq .gt_font_italic {
   font-style: italic;
 }
 
-#sgxskozkog .gt_super {
+#uqkclffrjq .gt_super {
   font-size: 65%;
 }
 
-#sgxskozkog .gt_footnote_marks {
+#uqkclffrjq .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#sgxskozkog .gt_asterisk {
+#uqkclffrjq .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#sgxskozkog .gt_indent_1 {
+#uqkclffrjq .gt_indent_1 {
   text-indent: 5px;
 }
 
-#sgxskozkog .gt_indent_2 {
+#uqkclffrjq .gt_indent_2 {
   text-indent: 10px;
 }
 
-#sgxskozkog .gt_indent_3 {
+#uqkclffrjq .gt_indent_3 {
   text-indent: 15px;
 }
 
-#sgxskozkog .gt_indent_4 {
+#uqkclffrjq .gt_indent_4 {
   text-indent: 20px;
 }
 
-#sgxskozkog .gt_indent_5 {
+#uqkclffrjq .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2455,30 +2455,30 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   
 </table>
 </div>
-<div class="sourceCode" id="cb501"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb501-1"><a href="c14-ambarom-vignette.html#cb501-1" tabindex="-1"></a>map_country_list_gt<span class="ot">&lt;-</span>country_shape_crop <span class="sc">%&gt;%</span> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span> </span>
-<span id="cb501-2"><a href="c14-ambarom-vignette.html#cb501-2" tabindex="-1"></a>  <span class="fu">select</span>(geounit, sovereignt) <span class="sc">%&gt;%</span></span>
-<span id="cb501-3"><a href="c14-ambarom-vignette.html#cb501-3" tabindex="-1"></a>  <span class="fu">anti_join</span>(survey_country_list, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb501-4"><a href="c14-ambarom-vignette.html#cb501-4" tabindex="-1"></a>  <span class="fu">arrange</span>(geounit) <span class="sc">%&gt;%</span></span>
-<span id="cb501-5"><a href="c14-ambarom-vignette.html#cb501-5" tabindex="-1"></a>  <span class="fu">gt</span>()</span></code></pre></div>
-<div class="sourceCode" id="cb502"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb502-1"><a href="c14-ambarom-vignette.html#cb502-1" tabindex="-1"></a>map_country_list_gt</span></code></pre></div>
-
-<div id="ibkckwmzsj" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#ibkckwmzsj table {
+<div class="sourceCode" id="cb495"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb495-1"><a href="c14-ambarom-vignette.html#cb495-1" tabindex="-1"></a>map_country_list_gt <span class="ot">&lt;-</span> country_shape_crop <span class="sc">%&gt;%</span> <span class="fu">as_tibble</span>() <span class="sc">%&gt;%</span> </span>
+<span id="cb495-2"><a href="c14-ambarom-vignette.html#cb495-2" tabindex="-1"></a>  <span class="fu">select</span>(geounit, sovereignt) <span class="sc">%&gt;%</span></span>
+<span id="cb495-3"><a href="c14-ambarom-vignette.html#cb495-3" tabindex="-1"></a>  <span class="fu">anti_join</span>(survey_country_list, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb495-4"><a href="c14-ambarom-vignette.html#cb495-4" tabindex="-1"></a>  <span class="fu">arrange</span>(geounit) <span class="sc">%&gt;%</span></span>
+<span id="cb495-5"><a href="c14-ambarom-vignette.html#cb495-5" tabindex="-1"></a>  <span class="fu">gt</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb496"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb496-1"><a href="c14-ambarom-vignette.html#cb496-1" tabindex="-1"></a>map_country_list_gt</span></code></pre></div>
+
+<div id="xqossclppl" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#xqossclppl table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#ibkckwmzsj thead, #ibkckwmzsj tbody, #ibkckwmzsj tfoot, #ibkckwmzsj tr, #ibkckwmzsj td, #ibkckwmzsj th {
+#xqossclppl thead, #xqossclppl tbody, #xqossclppl tfoot, #xqossclppl tr, #xqossclppl td, #xqossclppl th {
   border-style: none;
 }
 
-#ibkckwmzsj p {
+#xqossclppl p {
   margin: 0;
   padding: 0;
 }
 
-#ibkckwmzsj .gt_table {
+#xqossclppl .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2504,12 +2504,12 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-left-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_caption {
+#xqossclppl .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#ibkckwmzsj .gt_title {
+#xqossclppl .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2521,7 +2521,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-width: 0;
 }
 
-#ibkckwmzsj .gt_subtitle {
+#xqossclppl .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2533,7 +2533,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-top-width: 0;
 }
 
-#ibkckwmzsj .gt_heading {
+#xqossclppl .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2545,13 +2545,13 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_bottom_border {
+#xqossclppl .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_col_headings {
+#xqossclppl .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2566,7 +2566,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_col_heading {
+#xqossclppl .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2586,7 +2586,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   overflow-x: hidden;
 }
 
-#ibkckwmzsj .gt_column_spanner_outer {
+#xqossclppl .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2598,15 +2598,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 4px;
 }
 
-#ibkckwmzsj .gt_column_spanner_outer:first-child {
+#xqossclppl .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#ibkckwmzsj .gt_column_spanner_outer:last-child {
+#xqossclppl .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#ibkckwmzsj .gt_column_spanner {
+#xqossclppl .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2618,11 +2618,11 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   width: 100%;
 }
 
-#ibkckwmzsj .gt_spanner_row {
+#xqossclppl .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#ibkckwmzsj .gt_group_heading {
+#xqossclppl .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2648,7 +2648,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   text-align: left;
 }
 
-#ibkckwmzsj .gt_empty_group_heading {
+#xqossclppl .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2663,15 +2663,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   vertical-align: middle;
 }
 
-#ibkckwmzsj .gt_from_md > :first-child {
+#xqossclppl .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#ibkckwmzsj .gt_from_md > :last-child {
+#xqossclppl .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#ibkckwmzsj .gt_row {
+#xqossclppl .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2690,7 +2690,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   overflow-x: hidden;
 }
 
-#ibkckwmzsj .gt_stub {
+#xqossclppl .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2703,7 +2703,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#ibkckwmzsj .gt_stub_row_group {
+#xqossclppl .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2717,15 +2717,15 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   vertical-align: top;
 }
 
-#ibkckwmzsj .gt_row_group_first td {
+#xqossclppl .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#ibkckwmzsj .gt_row_group_first th {
+#xqossclppl .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#ibkckwmzsj .gt_summary_row {
+#xqossclppl .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2735,16 +2735,16 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#ibkckwmzsj .gt_first_summary_row {
+#xqossclppl .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_first_summary_row.thick {
+#xqossclppl .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#ibkckwmzsj .gt_last_summary_row {
+#xqossclppl .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2754,7 +2754,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_grand_summary_row {
+#xqossclppl .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2764,7 +2764,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#ibkckwmzsj .gt_first_grand_summary_row {
+#xqossclppl .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2774,7 +2774,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-top-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_last_grand_summary_row_top {
+#xqossclppl .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2784,11 +2784,11 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_striped {
+#xqossclppl .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#ibkckwmzsj .gt_table_body {
+#xqossclppl .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2797,7 +2797,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-bottom-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_footnotes {
+#xqossclppl .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2811,7 +2811,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_footnote {
+#xqossclppl .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2820,7 +2820,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#ibkckwmzsj .gt_sourcenotes {
+#xqossclppl .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2834,7 +2834,7 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   border-right-color: #D3D3D3;
 }
 
-#ibkckwmzsj .gt_sourcenote {
+#xqossclppl .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2842,63 +2842,63 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
   padding-right: 5px;
 }
 
-#ibkckwmzsj .gt_left {
+#xqossclppl .gt_left {
   text-align: left;
 }
 
-#ibkckwmzsj .gt_center {
+#xqossclppl .gt_center {
   text-align: center;
 }
 
-#ibkckwmzsj .gt_right {
+#xqossclppl .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#ibkckwmzsj .gt_font_normal {
+#xqossclppl .gt_font_normal {
   font-weight: normal;
 }
 
-#ibkckwmzsj .gt_font_bold {
+#xqossclppl .gt_font_bold {
   font-weight: bold;
 }
 
-#ibkckwmzsj .gt_font_italic {
+#xqossclppl .gt_font_italic {
   font-style: italic;
 }
 
-#ibkckwmzsj .gt_super {
+#xqossclppl .gt_super {
   font-size: 65%;
 }
 
-#ibkckwmzsj .gt_footnote_marks {
+#xqossclppl .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#ibkckwmzsj .gt_asterisk {
+#xqossclppl .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#ibkckwmzsj .gt_indent_1 {
+#xqossclppl .gt_indent_1 {
   text-indent: 5px;
 }
 
-#ibkckwmzsj .gt_indent_2 {
+#xqossclppl .gt_indent_2 {
   text-indent: 10px;
 }
 
-#ibkckwmzsj .gt_indent_3 {
+#xqossclppl .gt_indent_3 {
   text-indent: 15px;
 }
 
-#ibkckwmzsj .gt_indent_4 {
+#xqossclppl .gt_indent_4 {
   text-indent: 20px;
 }
 
-#ibkckwmzsj .gt_indent_5 {
+#xqossclppl .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2978,108 +2978,108 @@ <h2><span class="header-section-number">14.6</span> Mapping survey data<a href="
 </table>
 </div>
 <p>There are several ways to fix the mismatched names for a successful join. The simplest solution is to rename the data in the shape object before merging. Since only one country name in the survey data differs from the map data, we rename the map data accordingly.</p>
-<div class="sourceCode" id="cb503"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb503-1"><a href="c14-ambarom-vignette.html#cb503-1" tabindex="-1"></a>country_shape_upd <span class="ot">&lt;-</span> country_shape_crop <span class="sc">%&gt;%</span></span>
-<span id="cb503-2"><a href="c14-ambarom-vignette.html#cb503-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">geounit =</span> <span class="fu">if_else</span>(geounit <span class="sc">==</span> <span class="st">&quot;United States of America&quot;</span>, </span>
-<span id="cb503-3"><a href="c14-ambarom-vignette.html#cb503-3" tabindex="-1"></a>                           <span class="st">&quot;United States&quot;</span>, geounit))</span></code></pre></div>
-<p>Now that the country names match, we can merge the survey and map data and then plot the data. We begin with the map file and merge it with the survey estimates generated in Section <a href="c14-ambarom-vignette.html#ambarom-estimates">14.5</a> (<code>covid_worry_country_ests</code> and <code>covid_educ_ests</code>). We use the {sf} function of <code>full_join()</code>, which joins the rows in the map data and the survey estimates based on the columns <code>geounit</code> and <code>Country</code>. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an <code>NA</code> for the missing value <span class="citation">(<a href="#ref-sf2023">Pebesma and Bivand 2023</a>)</span>.</p>
-<div class="sourceCode" id="cb504"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb504-1"><a href="c14-ambarom-vignette.html#cb504-1" tabindex="-1"></a>covid_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
-<span id="cb504-2"><a href="c14-ambarom-vignette.html#cb504-2" tabindex="-1"></a>  <span class="fu">full_join</span>(covid_worry_country_ests, </span>
-<span id="cb504-3"><a href="c14-ambarom-vignette.html#cb504-3" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb504-4"><a href="c14-ambarom-vignette.html#cb504-4" tabindex="-1"></a>  <span class="fu">full_join</span>(covid_educ_ests,</span>
-<span id="cb504-5"><a href="c14-ambarom-vignette.html#cb504-5" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>))</span></code></pre></div>
-<p>After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID (Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid">14.2</a>) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed">14.3</a>). We also add a cross-hatching pattern to the countries without any data using the <code>geom_sf_pattern()</code> function from the {ggpattern} package <span class="citation">(<a href="#ref-R-ggpattern">FC, Davis, and ggplot2 authors 2022</a>)</span>.</p>
-<div class="sourceCode" id="cb505"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb505-1"><a href="c14-ambarom-vignette.html#cb505-1" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
-<span id="cb505-2"><a href="c14-ambarom-vignette.html#cb505-2" tabindex="-1"></a>  <span class="fu">geom_sf</span>(<span class="at">data =</span> covid_sf,</span>
-<span id="cb505-3"><a href="c14-ambarom-vignette.html#cb505-3" tabindex="-1"></a>          <span class="fu">aes</span>(<span class="at">fill =</span> p, <span class="at">geometry =</span> geometry),</span>
-<span id="cb505-4"><a href="c14-ambarom-vignette.html#cb505-4" tabindex="-1"></a>          <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span>) <span class="sc">+</span></span>
-<span id="cb505-5"><a href="c14-ambarom-vignette.html#cb505-5" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
-<span id="cb505-6"><a href="c14-ambarom-vignette.html#cb505-6" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
-<span id="cb505-7"><a href="c14-ambarom-vignette.html#cb505-7" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb505-8"><a href="c14-ambarom-vignette.html#cb505-8" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
-<span id="cb505-9"><a href="c14-ambarom-vignette.html#cb505-9" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
-<span id="cb505-10"><a href="c14-ambarom-vignette.html#cb505-10" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
-<span id="cb505-11"><a href="c14-ambarom-vignette.html#cb505-11" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb505-12"><a href="c14-ambarom-vignette.html#cb505-12" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
-<span id="cb505-13"><a href="c14-ambarom-vignette.html#cb505-13" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_sf, <span class="fu">is.na</span>(p)),</span>
-<span id="cb505-14"><a href="c14-ambarom-vignette.html#cb505-14" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
-<span id="cb505-15"><a href="c14-ambarom-vignette.html#cb505-15" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb505-16"><a href="c14-ambarom-vignette.html#cb505-16" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb505-17"><a href="c14-ambarom-vignette.html#cb505-17" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
-<span id="cb505-18"><a href="c14-ambarom-vignette.html#cb505-18" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb505-19"><a href="c14-ambarom-vignette.html#cb505-19" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb505-20"><a href="c14-ambarom-vignette.html#cb505-20" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb497"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb497-1"><a href="c14-ambarom-vignette.html#cb497-1" tabindex="-1"></a>country_shape_upd <span class="ot">&lt;-</span> country_shape_crop <span class="sc">%&gt;%</span></span>
+<span id="cb497-2"><a href="c14-ambarom-vignette.html#cb497-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">geounit =</span> <span class="fu">if_else</span>(geounit <span class="sc">==</span> <span class="st">&quot;United States of America&quot;</span>, </span>
+<span id="cb497-3"><a href="c14-ambarom-vignette.html#cb497-3" tabindex="-1"></a>                           <span class="st">&quot;United States&quot;</span>, geounit))</span></code></pre></div>
+<p>Now that the country names match, we can merge the survey and map data and then plot the result. We begin with the map file and merge it with the survey estimates generated in Section <a href="c14-ambarom-vignette.html#ambarom-estimates">14.5</a> (<code>covid_worry_country_ests</code> and <code>covid_educ_ests</code>). We use the <code>full_join()</code> function from {sf}, which joins the rows in the map data and the survey estimates based on the columns <code>geounit</code> and <code>Country</code>. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an <code>NA</code> for the missing value <span class="citation">(<a href="#ref-sf2023">Pebesma and Bivand 2023</a>)</span>.</p>
+<div class="sourceCode" id="cb498"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb498-1"><a href="c14-ambarom-vignette.html#cb498-1" tabindex="-1"></a>covid_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
+<span id="cb498-2"><a href="c14-ambarom-vignette.html#cb498-2" tabindex="-1"></a>  <span class="fu">full_join</span>(covid_worry_country_ests, </span>
+<span id="cb498-3"><a href="c14-ambarom-vignette.html#cb498-3" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb498-4"><a href="c14-ambarom-vignette.html#cb498-4" tabindex="-1"></a>  <span class="fu">full_join</span>(covid_educ_ests,</span>
+<span id="cb498-5"><a href="c14-ambarom-vignette.html#cb498-5" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;geounit&quot;</span> <span class="ot">=</span> <span class="st">&quot;Country&quot;</span>))</span></code></pre></div>
+<p>After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID (Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid">14.2</a>) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed">14.3</a>). We also add a cross-hatching pattern to the countries without any data using the <code>geom_sf_pattern()</code> function from the {ggpattern} package <span class="citation">(<a href="#ref-R-ggpattern">FC, Davis, and ggplot2 authors 2022</a>)</span>.</p>
+<div class="sourceCode" id="cb499"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb499-1"><a href="c14-ambarom-vignette.html#cb499-1" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
+<span id="cb499-2"><a href="c14-ambarom-vignette.html#cb499-2" tabindex="-1"></a>  <span class="fu">geom_sf</span>(<span class="at">data =</span> covid_sf,</span>
+<span id="cb499-3"><a href="c14-ambarom-vignette.html#cb499-3" tabindex="-1"></a>          <span class="fu">aes</span>(<span class="at">fill =</span> p, <span class="at">geometry =</span> geometry),</span>
+<span id="cb499-4"><a href="c14-ambarom-vignette.html#cb499-4" tabindex="-1"></a>          <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span>) <span class="sc">+</span></span>
+<span id="cb499-5"><a href="c14-ambarom-vignette.html#cb499-5" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
+<span id="cb499-6"><a href="c14-ambarom-vignette.html#cb499-6" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
+<span id="cb499-7"><a href="c14-ambarom-vignette.html#cb499-7" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb499-8"><a href="c14-ambarom-vignette.html#cb499-8" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
+<span id="cb499-9"><a href="c14-ambarom-vignette.html#cb499-9" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
+<span id="cb499-10"><a href="c14-ambarom-vignette.html#cb499-10" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
+<span id="cb499-11"><a href="c14-ambarom-vignette.html#cb499-11" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb499-12"><a href="c14-ambarom-vignette.html#cb499-12" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
+<span id="cb499-13"><a href="c14-ambarom-vignette.html#cb499-13" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_sf, <span class="fu">is.na</span>(p)),</span>
+<span id="cb499-14"><a href="c14-ambarom-vignette.html#cb499-14" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
+<span id="cb499-15"><a href="c14-ambarom-vignette.html#cb499-15" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb499-16"><a href="c14-ambarom-vignette.html#cb499-16" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb499-17"><a href="c14-ambarom-vignette.html#cb499-17" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
+<span id="cb499-18"><a href="c14-ambarom-vignette.html#cb499-18" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb499-19"><a href="c14-ambarom-vignette.html#cb499-19" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb499-20"><a href="c14-ambarom-vignette.html#cb499-20" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:ambarom-make-maps-covid"></span>
-<img src="bookdown_files/figure-html/ambarom-make-maps-covid-1.png" alt="Percent of households worried someone in their household will get COVID-19 in the next 3 months by country" width="672" />
+<img src="bookdown_files/figure-html/ambarom-make-maps-covid-1.png" alt="A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percent of households worried someone in their household will get COVID-19 in the next 3 months. The bottom of the range is 30% and the top of the range is 80%. Brazil and Chile look like the countries with the highest percentage of worry, with North America showing a lower percentage of worry. Countries without data, such as Venezuela, are displayed with a hash pattern." width="672" />
 <p class="caption">
 FIGURE 14.2: Percent of households worried someone in their household will get COVID-19 in the next 3 months by country
 </p>
 </div>
-<div class="sourceCode" id="cb506"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb506-1"><a href="c14-ambarom-vignette.html#cb506-1" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
-<span id="cb506-2"><a href="c14-ambarom-vignette.html#cb506-2" tabindex="-1"></a>  <span class="fu">geom_sf</span>(</span>
-<span id="cb506-3"><a href="c14-ambarom-vignette.html#cb506-3" tabindex="-1"></a>    <span class="at">data =</span> covid_sf,</span>
-<span id="cb506-4"><a href="c14-ambarom-vignette.html#cb506-4" tabindex="-1"></a>    <span class="fu">aes</span>(<span class="at">fill =</span> p_mediumchange, <span class="at">geometry =</span> geometry),</span>
-<span id="cb506-5"><a href="c14-ambarom-vignette.html#cb506-5" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb506-6"><a href="c14-ambarom-vignette.html#cb506-6" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb506-7"><a href="c14-ambarom-vignette.html#cb506-7" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
-<span id="cb506-8"><a href="c14-ambarom-vignette.html#cb506-8" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
-<span id="cb506-9"><a href="c14-ambarom-vignette.html#cb506-9" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb506-10"><a href="c14-ambarom-vignette.html#cb506-10" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
-<span id="cb506-11"><a href="c14-ambarom-vignette.html#cb506-11" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
-<span id="cb506-12"><a href="c14-ambarom-vignette.html#cb506-12" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
-<span id="cb506-13"><a href="c14-ambarom-vignette.html#cb506-13" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb506-14"><a href="c14-ambarom-vignette.html#cb506-14" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
-<span id="cb506-15"><a href="c14-ambarom-vignette.html#cb506-15" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_sf, <span class="fu">is.na</span>(p_mediumchange)),</span>
-<span id="cb506-16"><a href="c14-ambarom-vignette.html#cb506-16" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
-<span id="cb506-17"><a href="c14-ambarom-vignette.html#cb506-17" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb506-18"><a href="c14-ambarom-vignette.html#cb506-18" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb506-19"><a href="c14-ambarom-vignette.html#cb506-19" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
-<span id="cb506-20"><a href="c14-ambarom-vignette.html#cb506-20" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb506-21"><a href="c14-ambarom-vignette.html#cb506-21" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb506-22"><a href="c14-ambarom-vignette.html#cb506-22" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb500"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb500-1"><a href="c14-ambarom-vignette.html#cb500-1" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
+<span id="cb500-2"><a href="c14-ambarom-vignette.html#cb500-2" tabindex="-1"></a>  <span class="fu">geom_sf</span>(</span>
+<span id="cb500-3"><a href="c14-ambarom-vignette.html#cb500-3" tabindex="-1"></a>    <span class="at">data =</span> covid_sf,</span>
+<span id="cb500-4"><a href="c14-ambarom-vignette.html#cb500-4" tabindex="-1"></a>    <span class="fu">aes</span>(<span class="at">fill =</span> p_mediumchange, <span class="at">geometry =</span> geometry),</span>
+<span id="cb500-5"><a href="c14-ambarom-vignette.html#cb500-5" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb500-6"><a href="c14-ambarom-vignette.html#cb500-6" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb500-7"><a href="c14-ambarom-vignette.html#cb500-7" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
+<span id="cb500-8"><a href="c14-ambarom-vignette.html#cb500-8" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
+<span id="cb500-9"><a href="c14-ambarom-vignette.html#cb500-9" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb500-10"><a href="c14-ambarom-vignette.html#cb500-10" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
+<span id="cb500-11"><a href="c14-ambarom-vignette.html#cb500-11" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
+<span id="cb500-12"><a href="c14-ambarom-vignette.html#cb500-12" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
+<span id="cb500-13"><a href="c14-ambarom-vignette.html#cb500-13" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb500-14"><a href="c14-ambarom-vignette.html#cb500-14" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
+<span id="cb500-15"><a href="c14-ambarom-vignette.html#cb500-15" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_sf, <span class="fu">is.na</span>(p_mediumchange)),</span>
+<span id="cb500-16"><a href="c14-ambarom-vignette.html#cb500-16" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
+<span id="cb500-17"><a href="c14-ambarom-vignette.html#cb500-17" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb500-18"><a href="c14-ambarom-vignette.html#cb500-18" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb500-19"><a href="c14-ambarom-vignette.html#cb500-19" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
+<span id="cb500-20"><a href="c14-ambarom-vignette.html#cb500-20" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb500-21"><a href="c14-ambarom-vignette.html#cb500-21" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb500-22"><a href="c14-ambarom-vignette.html#cb500-22" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:ambarom-make-maps-covid-ed"></span>
-<img src="bookdown_files/figure-html/ambarom-make-maps-covid-ed-1.png" alt="Percent of households who had at least one child participate in virtual or hybrid learning" width="672" />
+<img src="bookdown_files/figure-html/ambarom-make-maps-covid-ed-1.png" alt="A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percent of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and is filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." width="672" />
 <p class="caption">
 FIGURE 14.3: Percent of households who had at least one child participate in virtual or hybrid learning
 </p>
 </div>
 <p>In Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed">14.3</a>, we observe missing data (represented by the crosshatch pattern) for Canada, Mexico, and the United States. The questionnaires indicate that these three countries did not include the education question in the survey. To focus on countries with available data, we can remove North America from the map and show only Central and South America. We do this below by restricting the shape files to Latin America and the Caribbean, as depicted in Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed-c-s">14.4</a>.</p>
-<div class="sourceCode" id="cb507"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb507-1"><a href="c14-ambarom-vignette.html#cb507-1" tabindex="-1"></a>covid_c_s <span class="ot">&lt;-</span> covid_sf <span class="sc">%&gt;%</span></span>
-<span id="cb507-2"><a href="c14-ambarom-vignette.html#cb507-2" tabindex="-1"></a>  <span class="fu">filter</span>(region_wb <span class="sc">==</span> <span class="st">&quot;Latin America &amp; Caribbean&quot;</span>)</span>
-<span id="cb507-3"><a href="c14-ambarom-vignette.html#cb507-3" tabindex="-1"></a></span>
-<span id="cb507-4"><a href="c14-ambarom-vignette.html#cb507-4" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
-<span id="cb507-5"><a href="c14-ambarom-vignette.html#cb507-5" tabindex="-1"></a>  <span class="fu">geom_sf</span>(</span>
-<span id="cb507-6"><a href="c14-ambarom-vignette.html#cb507-6" tabindex="-1"></a>    <span class="at">data =</span> covid_c_s,</span>
-<span id="cb507-7"><a href="c14-ambarom-vignette.html#cb507-7" tabindex="-1"></a>    <span class="fu">aes</span>(<span class="at">fill =</span> p_mediumchange, <span class="at">geometry =</span> geometry),</span>
-<span id="cb507-8"><a href="c14-ambarom-vignette.html#cb507-8" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb507-9"><a href="c14-ambarom-vignette.html#cb507-9" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb507-10"><a href="c14-ambarom-vignette.html#cb507-10" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
-<span id="cb507-11"><a href="c14-ambarom-vignette.html#cb507-11" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
-<span id="cb507-12"><a href="c14-ambarom-vignette.html#cb507-12" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb507-13"><a href="c14-ambarom-vignette.html#cb507-13" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
-<span id="cb507-14"><a href="c14-ambarom-vignette.html#cb507-14" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
-<span id="cb507-15"><a href="c14-ambarom-vignette.html#cb507-15" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
-<span id="cb507-16"><a href="c14-ambarom-vignette.html#cb507-16" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb507-17"><a href="c14-ambarom-vignette.html#cb507-17" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
-<span id="cb507-18"><a href="c14-ambarom-vignette.html#cb507-18" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_c_s, <span class="fu">is.na</span>(p_mediumchange)),</span>
-<span id="cb507-19"><a href="c14-ambarom-vignette.html#cb507-19" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
-<span id="cb507-20"><a href="c14-ambarom-vignette.html#cb507-20" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb507-21"><a href="c14-ambarom-vignette.html#cb507-21" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb507-22"><a href="c14-ambarom-vignette.html#cb507-22" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
-<span id="cb507-23"><a href="c14-ambarom-vignette.html#cb507-23" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb507-24"><a href="c14-ambarom-vignette.html#cb507-24" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb507-25"><a href="c14-ambarom-vignette.html#cb507-25" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb501"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb501-1"><a href="c14-ambarom-vignette.html#cb501-1" tabindex="-1"></a>covid_c_s <span class="ot">&lt;-</span> covid_sf <span class="sc">%&gt;%</span></span>
+<span id="cb501-2"><a href="c14-ambarom-vignette.html#cb501-2" tabindex="-1"></a>  <span class="fu">filter</span>(region_wb <span class="sc">==</span> <span class="st">&quot;Latin America &amp; Caribbean&quot;</span>)</span>
+<span id="cb501-3"><a href="c14-ambarom-vignette.html#cb501-3" tabindex="-1"></a></span>
+<span id="cb501-4"><a href="c14-ambarom-vignette.html#cb501-4" tabindex="-1"></a><span class="fu">ggplot</span>() <span class="sc">+</span></span>
+<span id="cb501-5"><a href="c14-ambarom-vignette.html#cb501-5" tabindex="-1"></a>  <span class="fu">geom_sf</span>(</span>
+<span id="cb501-6"><a href="c14-ambarom-vignette.html#cb501-6" tabindex="-1"></a>    <span class="at">data =</span> covid_c_s,</span>
+<span id="cb501-7"><a href="c14-ambarom-vignette.html#cb501-7" tabindex="-1"></a>    <span class="fu">aes</span>(<span class="at">fill =</span> p_mediumchange, <span class="at">geometry =</span> geometry),</span>
+<span id="cb501-8"><a href="c14-ambarom-vignette.html#cb501-8" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb501-9"><a href="c14-ambarom-vignette.html#cb501-9" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb501-10"><a href="c14-ambarom-vignette.html#cb501-10" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
+<span id="cb501-11"><a href="c14-ambarom-vignette.html#cb501-11" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
+<span id="cb501-12"><a href="c14-ambarom-vignette.html#cb501-12" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb501-13"><a href="c14-ambarom-vignette.html#cb501-13" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
+<span id="cb501-14"><a href="c14-ambarom-vignette.html#cb501-14" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087e8b&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
+<span id="cb501-15"><a href="c14-ambarom-vignette.html#cb501-15" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
+<span id="cb501-16"><a href="c14-ambarom-vignette.html#cb501-16" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb501-17"><a href="c14-ambarom-vignette.html#cb501-17" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
+<span id="cb501-18"><a href="c14-ambarom-vignette.html#cb501-18" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(covid_c_s, <span class="fu">is.na</span>(p_mediumchange)),</span>
+<span id="cb501-19"><a href="c14-ambarom-vignette.html#cb501-19" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
+<span id="cb501-20"><a href="c14-ambarom-vignette.html#cb501-20" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb501-21"><a href="c14-ambarom-vignette.html#cb501-21" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb501-22"><a href="c14-ambarom-vignette.html#cb501-22" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
+<span id="cb501-23"><a href="c14-ambarom-vignette.html#cb501-23" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb501-24"><a href="c14-ambarom-vignette.html#cb501-24" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb501-25"><a href="c14-ambarom-vignette.html#cb501-25" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:ambarom-make-maps-covid-ed-c-s"></span>
-<img src="bookdown_files/figure-html/ambarom-make-maps-covid-ed-c-s-1.png" alt="Percent of  households who had at least one child participate in virtual or hybrid learning, Central and South America" width="672" />
+<img src="bookdown_files/figure-html/ambarom-make-maps-covid-ed-c-s-1.png" alt="A choropleth map of Central and South America where the color scale filling in each country corresponds to the percent of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Countries without data are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." width="672" />
 <p class="caption">
 FIGURE 14.4: Percent of households who had at least one child participate in virtual or hybrid learning, Central and South America
 </p>
 </div>
-<p>In Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed-c-s">14.4</a>, we can see that most countries with available data have similar percentages (reflected in their similar shades). However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning.</p>
+<p>In Figure <a href="c14-ambarom-vignette.html#fig:ambarom-make-maps-covid-ed-c-s">14.4</a>, we can see that most countries with available data have similar percentages (reflected in their similar shades.) However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning.</p>
 </div>
 <div id="exercises-4" class="section level2 hasAnchor" number="14.7">
 <h2><span class="header-section-number">14.7</span> Exercises<a href="c14-ambarom-vignette.html#exercises-4" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li><p>Calculate the percentage of households with broadband internet in and those with any internet at home, including from a phone or tablet in Latin America and the Caribbean. Hint: if you come across countries with 0% internet usage, you may want to filter by something first.</p></li>
+<li><p>Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: if there are countries with 0% internet usage, try filtering by something first.</p></li>
 <li><p>Create a faceted map showing both broadband internet and any internet usage.</p></li>
 </ol>
 
@@ -3091,7 +3091,7 @@ <h2><span class="header-section-number">14.7</span> Exercises<a href="c14-ambaro
 <h3>References<a href="references.html#references" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <div id="refs" class="references csl-bib-body hanging-indent" entry-spacing="0">
 <div id="ref-R-ggpattern" class="csl-entry">
-FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. <em><span class="nocase">ggpattern</span>: Ggplot2 Pattern Geoms</em>.
+FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. <em><span class="nocase">ggpattern</span>: ’<span class="nocase">ggplot2</span>’ Pattern Geoms</em>.
 </div>
 <div id="ref-R-gt" class="csl-entry">
 Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. <em><span class="nocase">gt</span>: Easily Create Presentation-Ready Display Tables</em>.
@@ -3120,20 +3120,20 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-sf2023" class="csl-entry">
 Pebesma, Edzer, and Roger Bivand. 2023. <em><span class="nocase">Spatial Data Science: With applications in R</span></em>. <span>Chapman and Hall/CRC</span>. <a href="https://doi.org/10.1201/9780429459016">https://doi.org/10.1201/9780429459016</a>.
 </div>
-<div id="ref-ggplot22016" class="csl-entry">
-Wickham, Hadley. 2016. <em>Ggplot2: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
+<div id="ref-ggplot2wickham" class="csl-entry">
+Wickham, Hadley. 2016. <em><span class="nocase">ggplot2</span>: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
 </div>
 <div id="ref-R-forcats" class="csl-entry">
 ———. 2023a. <em><span class="nocase">forcats</span>: Tools for Working with Categorical Variables (Factors)</em>.
 </div>
 <div id="ref-R-haven" class="csl-entry">
-Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export SPSS, Stata and SAS Files</em>.
+Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files</em>.
 </div>
 </div>
 <div class="footnotes">
 <hr />
-<ol start="28">
-<li id="fn28"><p>See Table 2 in <span class="citation">LAPOP (<a href="#ref-lapop-tech">2021c</a>)</span> for dates by country<a href="c14-ambarom-vignette.html#fnref28" class="footnote-back">↩︎</a></p></li>
+<ol start="29">
+<li id="fn29"><p>See Table 2 in <span class="citation">LAPOP (<a href="#ref-lapop-tech">2021c</a>)</span> for dates by country<a href="c14-ambarom-vignette.html#fnref29" class="footnote-back">↩︎</a></p></li>
 </ol>
 </div>
             </section>
diff --git a/exercise-solutions.html b/exercise-solutions.html
index 52f42e0c..83ad6ff6 100644
--- a/exercise-solutions.html
+++ b/exercise-solutions.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -519,251 +519,252 @@ <h1>
             <section class="normal" id="section-">
 <div id="exercise-solutions" class="section level1 hasAnchor" number="18">
 <h1><span class="header-section-number">D</span> Exercise solutions<a href="exercise-solutions.html#exercise-solutions" class="anchor-section" aria-label="Anchor link to header"></a></h1>
-<p>The chapter exercises use the survey design objects and packages provided in the Prerequisites box in the beginning of the chapter. Please ensure they are loaded in your environment before running the exercise solutions. Code chunks to load these are also included below.</p>
-<div class="sourceCode" id="cb548"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb548-1"><a href="exercise-solutions.html#cb548-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
-<span id="cb548-2"><a href="exercise-solutions.html#cb548-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
-<span id="cb548-3"><a href="exercise-solutions.html#cb548-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
-<span id="cb548-4"><a href="exercise-solutions.html#cb548-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
-<span id="cb548-5"><a href="exercise-solutions.html#cb548-5" tabindex="-1"></a><span class="fu">library</span>(broom)</span>
-<span id="cb548-6"><a href="exercise-solutions.html#cb548-6" tabindex="-1"></a><span class="fu">library</span>(prettyunits)</span>
-<span id="cb548-7"><a href="exercise-solutions.html#cb548-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span></code></pre></div>
-<div class="sourceCode" id="cb549"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb549-1"><a href="exercise-solutions.html#cb549-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
-<span id="cb549-2"><a href="exercise-solutions.html#cb549-2" tabindex="-1"></a></span>
-<span id="cb549-3"><a href="exercise-solutions.html#cb549-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb549-4"><a href="exercise-solutions.html#cb549-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
-<span id="cb549-5"><a href="exercise-solutions.html#cb549-5" tabindex="-1"></a></span>
-<span id="cb549-6"><a href="exercise-solutions.html#cb549-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
-<span id="cb549-7"><a href="exercise-solutions.html#cb549-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb549-8"><a href="exercise-solutions.html#cb549-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
-<span id="cb549-9"><a href="exercise-solutions.html#cb549-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
-<span id="cb549-10"><a href="exercise-solutions.html#cb549-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
-<span id="cb549-11"><a href="exercise-solutions.html#cb549-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb549-12"><a href="exercise-solutions.html#cb549-12" tabindex="-1"></a>  )</span></code></pre></div>
-<div class="sourceCode" id="cb550"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb550-1"><a href="exercise-solutions.html#cb550-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb550-2"><a href="exercise-solutions.html#cb550-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
-<span id="cb550-3"><a href="exercise-solutions.html#cb550-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
-<span id="cb550-4"><a href="exercise-solutions.html#cb550-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
-<span id="cb550-5"><a href="exercise-solutions.html#cb550-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
-<span id="cb550-6"><a href="exercise-solutions.html#cb550-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
-<span id="cb550-7"><a href="exercise-solutions.html#cb550-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
-<span id="cb550-8"><a href="exercise-solutions.html#cb550-8" tabindex="-1"></a>  )</span></code></pre></div>
-<div class="sourceCode" id="cb551"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb551-1"><a href="exercise-solutions.html#cb551-1" tabindex="-1"></a>inc_series <span class="ot">&lt;-</span> ncvs_2021_incident <span class="sc">%&gt;%</span></span>
-<span id="cb551-2"><a href="exercise-solutions.html#cb551-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb551-3"><a href="exercise-solutions.html#cb551-3" tabindex="-1"></a>    <span class="at">series =</span> <span class="fu">case_when</span>(V4017 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
-<span id="cb551-4"><a href="exercise-solutions.html#cb551-4" tabindex="-1"></a>                       V4018 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">2</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
-<span id="cb551-5"><a href="exercise-solutions.html#cb551-5" tabindex="-1"></a>                       V4019 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
-<span id="cb551-6"><a href="exercise-solutions.html#cb551-6" tabindex="-1"></a>                       <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">2</span></span>
-<span id="cb551-7"><a href="exercise-solutions.html#cb551-7" tabindex="-1"></a>    ),</span>
-<span id="cb551-8"><a href="exercise-solutions.html#cb551-8" tabindex="-1"></a>    <span class="at">n10v4016 =</span> <span class="fu">case_when</span>(V4016 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">997</span>, <span class="dv">998</span>) <span class="sc">~</span> <span class="cn">NA_real_</span>,</span>
-<span id="cb551-9"><a href="exercise-solutions.html#cb551-9" tabindex="-1"></a>                         V4016 <span class="sc">&gt;</span> <span class="dv">10</span> <span class="sc">~</span> <span class="dv">10</span>,</span>
-<span id="cb551-10"><a href="exercise-solutions.html#cb551-10" tabindex="-1"></a>                         <span class="cn">TRUE</span> <span class="sc">~</span> V4016),</span>
-<span id="cb551-11"><a href="exercise-solutions.html#cb551-11" tabindex="-1"></a>    <span class="at">serieswgt =</span> <span class="fu">case_when</span>(series <span class="sc">==</span> <span class="dv">2</span> <span class="sc">&amp;</span> <span class="fu">is.na</span>(n10v4016) <span class="sc">~</span> <span class="dv">6</span>,</span>
-<span id="cb551-12"><a href="exercise-solutions.html#cb551-12" tabindex="-1"></a>                          series <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> n10v4016,</span>
-<span id="cb551-13"><a href="exercise-solutions.html#cb551-13" tabindex="-1"></a>                          <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">1</span>),</span>
-<span id="cb551-14"><a href="exercise-solutions.html#cb551-14" tabindex="-1"></a>    <span class="at">NEWWGT =</span> WGTVICCY <span class="sc">*</span> serieswgt</span>
-<span id="cb551-15"><a href="exercise-solutions.html#cb551-15" tabindex="-1"></a>  )</span>
-<span id="cb551-16"><a href="exercise-solutions.html#cb551-16" tabindex="-1"></a></span>
-<span id="cb551-17"><a href="exercise-solutions.html#cb551-17" tabindex="-1"></a>inc_ind <span class="ot">&lt;-</span> inc_series <span class="sc">%&gt;%</span></span>
-<span id="cb551-18"><a href="exercise-solutions.html#cb551-18" tabindex="-1"></a>  <span class="fu">filter</span>(V4022 <span class="sc">!=</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb551-19"><a href="exercise-solutions.html#cb551-19" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb551-20"><a href="exercise-solutions.html#cb551-20" tabindex="-1"></a>    <span class="at">WeapCat =</span> <span class="fu">case_when</span>(</span>
-<span id="cb551-21"><a href="exercise-solutions.html#cb551-21" tabindex="-1"></a>      <span class="fu">is.na</span>(V4049) <span class="sc">~</span> <span class="cn">NA_character_</span>,</span>
-<span id="cb551-22"><a href="exercise-solutions.html#cb551-22" tabindex="-1"></a>      V4049 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;NoWeap&quot;</span>,</span>
-<span id="cb551-23"><a href="exercise-solutions.html#cb551-23" tabindex="-1"></a>      V4049 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;UnkWeapUse&quot;</span>,</span>
-<span id="cb551-24"><a href="exercise-solutions.html#cb551-24" tabindex="-1"></a>      V4050 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>,</span>
-<span id="cb551-25"><a href="exercise-solutions.html#cb551-25" tabindex="-1"></a>      V4051 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4052 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4050 <span class="sc">==</span> <span class="dv">7</span> <span class="sc">~</span> <span class="st">&quot;Firearm&quot;</span>,</span>
-<span id="cb551-26"><a href="exercise-solutions.html#cb551-26" tabindex="-1"></a>      V4053 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4054 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Knife&quot;</span>,</span>
-<span id="cb551-27"><a href="exercise-solutions.html#cb551-27" tabindex="-1"></a>      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span></span>
-<span id="cb551-28"><a href="exercise-solutions.html#cb551-28" tabindex="-1"></a>    ),</span>
-<span id="cb551-29"><a href="exercise-solutions.html#cb551-29" tabindex="-1"></a>    <span class="at">V4529_num =</span> <span class="fu">parse_number</span>(<span class="fu">as.character</span>(V4529)),</span>
-<span id="cb551-30"><a href="exercise-solutions.html#cb551-30" tabindex="-1"></a>    <span class="at">ReportPolice =</span> V4399 <span class="sc">==</span> <span class="dv">1</span>,</span>
-<span id="cb551-31"><a href="exercise-solutions.html#cb551-31" tabindex="-1"></a>    <span class="at">Property =</span> V4529_num <span class="sc">&gt;=</span> <span class="dv">31</span>,</span>
-<span id="cb551-32"><a href="exercise-solutions.html#cb551-32" tabindex="-1"></a>    <span class="at">Violent =</span> V4529_num <span class="sc">&lt;=</span> <span class="dv">20</span>,</span>
-<span id="cb551-33"><a href="exercise-solutions.html#cb551-33" tabindex="-1"></a>    <span class="at">Property_ReportPolice =</span> Property <span class="sc">&amp;</span> ReportPolice,</span>
-<span id="cb551-34"><a href="exercise-solutions.html#cb551-34" tabindex="-1"></a>    <span class="at">Violent_ReportPolice =</span> Violent <span class="sc">&amp;</span> ReportPolice,</span>
-<span id="cb551-35"><a href="exercise-solutions.html#cb551-35" tabindex="-1"></a>    <span class="at">AAST =</span> V4529_num <span class="sc">%in%</span> <span class="dv">11</span><span class="sc">:</span><span class="dv">13</span>,</span>
-<span id="cb551-36"><a href="exercise-solutions.html#cb551-36" tabindex="-1"></a>    <span class="at">AAST_NoWeap =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;NoWeap&quot;</span>,</span>
-<span id="cb551-37"><a href="exercise-solutions.html#cb551-37" tabindex="-1"></a>    <span class="at">AAST_Firearm =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Firearm&quot;</span>,</span>
-<span id="cb551-38"><a href="exercise-solutions.html#cb551-38" tabindex="-1"></a>    <span class="at">AAST_Knife =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Knife&quot;</span>,</span>
-<span id="cb551-39"><a href="exercise-solutions.html#cb551-39" tabindex="-1"></a>    <span class="at">AAST_Other =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Other&quot;</span></span>
-<span id="cb551-40"><a href="exercise-solutions.html#cb551-40" tabindex="-1"></a>  )</span>
-<span id="cb551-41"><a href="exercise-solutions.html#cb551-41" tabindex="-1"></a>inc_hh_sums <span class="ot">&lt;-</span></span>
-<span id="cb551-42"><a href="exercise-solutions.html#cb551-42" tabindex="-1"></a>  inc_ind <span class="sc">%&gt;%</span></span>
-<span id="cb551-43"><a href="exercise-solutions.html#cb551-43" tabindex="-1"></a>  <span class="fu">filter</span>(V4529_num <span class="sc">&gt;</span> <span class="dv">23</span>) <span class="sc">%&gt;%</span> <span class="co"># restrict to household crimes</span></span>
-<span id="cb551-44"><a href="exercise-solutions.html#cb551-44" tabindex="-1"></a>  <span class="fu">group_by</span>(YEARQ, IDHH) <span class="sc">%&gt;%</span></span>
-<span id="cb551-45"><a href="exercise-solutions.html#cb551-45" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">WGTVICCY =</span> WGTVICCY[<span class="dv">1</span>],</span>
-<span id="cb551-46"><a href="exercise-solutions.html#cb551-46" tabindex="-1"></a>            <span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;Property&quot;</span>), </span>
-<span id="cb551-47"><a href="exercise-solutions.html#cb551-47" tabindex="-1"></a>                   <span class="sc">~</span> <span class="fu">sum</span>(. <span class="sc">*</span> serieswgt),</span>
-<span id="cb551-48"><a href="exercise-solutions.html#cb551-48" tabindex="-1"></a>                   <span class="at">.names =</span> <span class="st">&quot;{.col}&quot;</span>),</span>
-<span id="cb551-49"><a href="exercise-solutions.html#cb551-49" tabindex="-1"></a>            <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
-<span id="cb551-50"><a href="exercise-solutions.html#cb551-50" tabindex="-1"></a></span>
-<span id="cb551-51"><a href="exercise-solutions.html#cb551-51" tabindex="-1"></a>inc_pers_sums <span class="ot">&lt;-</span></span>
-<span id="cb551-52"><a href="exercise-solutions.html#cb551-52" tabindex="-1"></a>  inc_ind <span class="sc">%&gt;%</span></span>
-<span id="cb551-53"><a href="exercise-solutions.html#cb551-53" tabindex="-1"></a>  <span class="fu">filter</span>(V4529_num <span class="sc">&lt;=</span> <span class="dv">23</span>) <span class="sc">%&gt;%</span> <span class="co"># restrict to person crimes</span></span>
-<span id="cb551-54"><a href="exercise-solutions.html#cb551-54" tabindex="-1"></a>  <span class="fu">group_by</span>(YEARQ, IDHH, IDPER) <span class="sc">%&gt;%</span></span>
-<span id="cb551-55"><a href="exercise-solutions.html#cb551-55" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">WGTVICCY =</span> WGTVICCY[<span class="dv">1</span>],</span>
-<span id="cb551-56"><a href="exercise-solutions.html#cb551-56" tabindex="-1"></a>            <span class="fu">across</span>(<span class="fu">c</span>(<span class="fu">starts_with</span>(<span class="st">&quot;Violent&quot;</span>), <span class="fu">starts_with</span>(<span class="st">&quot;AAST&quot;</span>)),</span>
-<span id="cb551-57"><a href="exercise-solutions.html#cb551-57" tabindex="-1"></a>                   <span class="sc">~</span> <span class="fu">sum</span>(. <span class="sc">*</span> serieswgt), </span>
-<span id="cb551-58"><a href="exercise-solutions.html#cb551-58" tabindex="-1"></a>                   <span class="at">.names =</span> <span class="st">&quot;{.col}&quot;</span>),</span>
-<span id="cb551-59"><a href="exercise-solutions.html#cb551-59" tabindex="-1"></a>            <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
-<span id="cb551-60"><a href="exercise-solutions.html#cb551-60" tabindex="-1"></a></span>
-<span id="cb551-61"><a href="exercise-solutions.html#cb551-61" tabindex="-1"></a>hh_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_hh_sums) <span class="sc">-</span> <span class="dv">3</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb551-62"><a href="exercise-solutions.html#cb551-62" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_hh_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">3</span>)])</span>
-<span id="cb551-63"><a href="exercise-solutions.html#cb551-63" tabindex="-1"></a>pers_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_pers_sums) <span class="sc">-</span> <span class="dv">4</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb551-64"><a href="exercise-solutions.html#cb551-64" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_pers_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">4</span>)])</span>
-<span id="cb551-65"><a href="exercise-solutions.html#cb551-65" tabindex="-1"></a></span>
-<span id="cb551-66"><a href="exercise-solutions.html#cb551-66" tabindex="-1"></a>hh_vsum <span class="ot">&lt;-</span> ncvs_2021_household <span class="sc">%&gt;%</span></span>
-<span id="cb551-67"><a href="exercise-solutions.html#cb551-67" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_hh_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb551-68"><a href="exercise-solutions.html#cb551-68" tabindex="-1"></a>  <span class="fu">replace_na</span>(hh_z_list) <span class="sc">%&gt;%</span></span>
-<span id="cb551-69"><a href="exercise-solutions.html#cb551-69" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTHHCY))</span>
-<span id="cb551-70"><a href="exercise-solutions.html#cb551-70" tabindex="-1"></a></span>
-<span id="cb551-71"><a href="exercise-solutions.html#cb551-71" tabindex="-1"></a>pers_vsum <span class="ot">&lt;-</span> ncvs_2021_person <span class="sc">%&gt;%</span></span>
-<span id="cb551-72"><a href="exercise-solutions.html#cb551-72" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_pers_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb551-73"><a href="exercise-solutions.html#cb551-73" tabindex="-1"></a>  <span class="fu">replace_na</span>(pers_z_list) <span class="sc">%&gt;%</span></span>
-<span id="cb551-74"><a href="exercise-solutions.html#cb551-74" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTPERCY))</span>
-<span id="cb551-75"><a href="exercise-solutions.html#cb551-75" tabindex="-1"></a></span>
-<span id="cb551-76"><a href="exercise-solutions.html#cb551-76" tabindex="-1"></a>hh_vsum_der <span class="ot">&lt;-</span> hh_vsum <span class="sc">%&gt;%</span></span>
-<span id="cb551-77"><a href="exercise-solutions.html#cb551-77" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb551-78"><a href="exercise-solutions.html#cb551-78" tabindex="-1"></a>    <span class="at">Tenure =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V2015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Owned&quot;</span>, </span>
-<span id="cb551-79"><a href="exercise-solutions.html#cb551-79" tabindex="-1"></a>                              <span class="sc">!</span><span class="fu">is.na</span>(V2015) <span class="sc">~</span> <span class="st">&quot;Rented&quot;</span>),</span>
-<span id="cb551-80"><a href="exercise-solutions.html#cb551-80" tabindex="-1"></a>                    <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Owned&quot;</span>, <span class="st">&quot;Rented&quot;</span>)),</span>
-<span id="cb551-81"><a href="exercise-solutions.html#cb551-81" tabindex="-1"></a>    <span class="at">Urbanicity =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V2143 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Urban&quot;</span>,</span>
-<span id="cb551-82"><a href="exercise-solutions.html#cb551-82" tabindex="-1"></a>                                  V2143 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Suburban&quot;</span>,</span>
-<span id="cb551-83"><a href="exercise-solutions.html#cb551-83" tabindex="-1"></a>                                  V2143 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Rural&quot;</span>),</span>
-<span id="cb551-84"><a href="exercise-solutions.html#cb551-84" tabindex="-1"></a>                        <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Urban&quot;</span>, <span class="st">&quot;Suburban&quot;</span>, <span class="st">&quot;Rural&quot;</span>)),</span>
-<span id="cb551-85"><a href="exercise-solutions.html#cb551-85" tabindex="-1"></a>    <span class="at">SC214A_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(SC214A)),</span>
-<span id="cb551-86"><a href="exercise-solutions.html#cb551-86" tabindex="-1"></a>    <span class="at">Income =</span> <span class="fu">case_when</span>(SC214A_num <span class="sc">&lt;=</span> <span class="dv">8</span> <span class="sc">~</span> <span class="st">&quot;Less than $25,000&quot;</span>,</span>
-<span id="cb551-87"><a href="exercise-solutions.html#cb551-87" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">12</span> <span class="sc">~</span> <span class="st">&quot;$25,000-49,999&quot;</span>,</span>
-<span id="cb551-88"><a href="exercise-solutions.html#cb551-88" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">15</span> <span class="sc">~</span> <span class="st">&quot;$50,000-99,999&quot;</span>,</span>
-<span id="cb551-89"><a href="exercise-solutions.html#cb551-89" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;$100,000-199,999&quot;</span>,</span>
-<span id="cb551-90"><a href="exercise-solutions.html#cb551-90" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">18</span> <span class="sc">~</span> <span class="st">&quot;$200,000 or more&quot;</span>),</span>
-<span id="cb551-91"><a href="exercise-solutions.html#cb551-91" tabindex="-1"></a>    <span class="at">Income =</span> <span class="fu">fct_reorder</span>(Income, SC214A_num, <span class="at">.na_rm =</span> <span class="cn">FALSE</span>),</span>
-<span id="cb551-92"><a href="exercise-solutions.html#cb551-92" tabindex="-1"></a>    <span class="at">PlaceSize =</span> <span class="fu">case_match</span>(<span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V2126B)),</span>
-<span id="cb551-93"><a href="exercise-solutions.html#cb551-93" tabindex="-1"></a>                           <span class="dv">0</span> <span class="sc">~</span> <span class="st">&quot;Not in a place&quot;</span>,</span>
-<span id="cb551-94"><a href="exercise-solutions.html#cb551-94" tabindex="-1"></a>                           <span class="dv">13</span> <span class="sc">~</span> <span class="st">&quot;Under 10,000&quot;</span>,</span>
-<span id="cb551-95"><a href="exercise-solutions.html#cb551-95" tabindex="-1"></a>                           <span class="dv">16</span> <span class="sc">~</span> <span class="st">&quot;10,000-49,999&quot;</span>,</span>
-<span id="cb551-96"><a href="exercise-solutions.html#cb551-96" tabindex="-1"></a>                           <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;50,000-99,999&quot;</span>,</span>
-<span id="cb551-97"><a href="exercise-solutions.html#cb551-97" tabindex="-1"></a>                           <span class="dv">18</span> <span class="sc">~</span> <span class="st">&quot;100,000-249,999&quot;</span>,</span>
-<span id="cb551-98"><a href="exercise-solutions.html#cb551-98" tabindex="-1"></a>                           <span class="dv">19</span> <span class="sc">~</span> <span class="st">&quot;250,000-499,999&quot;</span>,</span>
-<span id="cb551-99"><a href="exercise-solutions.html#cb551-99" tabindex="-1"></a>                           <span class="dv">20</span> <span class="sc">~</span> <span class="st">&quot;500,000-999,999&quot;</span>,</span>
-<span id="cb551-100"><a href="exercise-solutions.html#cb551-100" tabindex="-1"></a>                           <span class="fu">c</span>(<span class="dv">21</span>, <span class="dv">22</span>, <span class="dv">23</span>) <span class="sc">~</span> <span class="st">&quot;1,000,000 or more&quot;</span>),</span>
-<span id="cb551-101"><a href="exercise-solutions.html#cb551-101" tabindex="-1"></a>    <span class="at">PlaceSize =</span> <span class="fu">fct_reorder</span>(PlaceSize, <span class="fu">as.numeric</span>(V2126B)),</span>
-<span id="cb551-102"><a href="exercise-solutions.html#cb551-102" tabindex="-1"></a>    <span class="at">Region =</span> <span class="fu">case_match</span>(<span class="fu">as.numeric</span>(V2127B),</span>
-<span id="cb551-103"><a href="exercise-solutions.html#cb551-103" tabindex="-1"></a>                        <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Northeast&quot;</span>,</span>
-<span id="cb551-104"><a href="exercise-solutions.html#cb551-104" tabindex="-1"></a>                        <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Midwest&quot;</span>,</span>
-<span id="cb551-105"><a href="exercise-solutions.html#cb551-105" tabindex="-1"></a>                        <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;South&quot;</span>,</span>
-<span id="cb551-106"><a href="exercise-solutions.html#cb551-106" tabindex="-1"></a>                        <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;West&quot;</span>),</span>
-<span id="cb551-107"><a href="exercise-solutions.html#cb551-107" tabindex="-1"></a>    <span class="at">Region =</span> <span class="fu">fct_reorder</span>(Region, <span class="fu">as.numeric</span>(V2127B))</span>
-<span id="cb551-108"><a href="exercise-solutions.html#cb551-108" tabindex="-1"></a>  )</span>
-<span id="cb551-109"><a href="exercise-solutions.html#cb551-109" tabindex="-1"></a>NHOPI <span class="ot">&lt;-</span> <span class="st">&quot;Native Hawaiian or Other Pacific Islander&quot;</span></span>
-<span id="cb551-110"><a href="exercise-solutions.html#cb551-110" tabindex="-1"></a></span>
-<span id="cb551-111"><a href="exercise-solutions.html#cb551-111" tabindex="-1"></a>pers_vsum_der <span class="ot">&lt;-</span> pers_vsum <span class="sc">%&gt;%</span></span>
-<span id="cb551-112"><a href="exercise-solutions.html#cb551-112" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb551-113"><a href="exercise-solutions.html#cb551-113" tabindex="-1"></a>    <span class="at">Sex =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3018 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Male&quot;</span>,</span>
-<span id="cb551-114"><a href="exercise-solutions.html#cb551-114" tabindex="-1"></a>                           V3018 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Female&quot;</span>)),</span>
-<span id="cb551-115"><a href="exercise-solutions.html#cb551-115" tabindex="-1"></a>    <span class="at">RaceHispOrigin =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3024 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Hispanic&quot;</span>,</span>
-<span id="cb551-116"><a href="exercise-solutions.html#cb551-116" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;White&quot;</span>,</span>
-<span id="cb551-117"><a href="exercise-solutions.html#cb551-117" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Black&quot;</span>,</span>
-<span id="cb551-118"><a href="exercise-solutions.html#cb551-118" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Asian&quot;</span>,</span>
-<span id="cb551-119"><a href="exercise-solutions.html#cb551-119" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> NHOPI,</span>
-<span id="cb551-120"><a href="exercise-solutions.html#cb551-120" tabindex="-1"></a>                                      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>),</span>
-<span id="cb551-121"><a href="exercise-solutions.html#cb551-121" tabindex="-1"></a>                            <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Hispanic&quot;</span>, </span>
-<span id="cb551-122"><a href="exercise-solutions.html#cb551-122" tabindex="-1"></a>                                       <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>)),</span>
-<span id="cb551-123"><a href="exercise-solutions.html#cb551-123" tabindex="-1"></a>    <span class="at">V3014_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V3014)),</span>
-<span id="cb551-124"><a href="exercise-solutions.html#cb551-124" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">case_when</span>(V3014_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;12-17&quot;</span>,</span>
-<span id="cb551-125"><a href="exercise-solutions.html#cb551-125" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;18-24&quot;</span>,</span>
-<span id="cb551-126"><a href="exercise-solutions.html#cb551-126" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">34</span> <span class="sc">~</span> <span class="st">&quot;25-34&quot;</span>,</span>
-<span id="cb551-127"><a href="exercise-solutions.html#cb551-127" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">49</span> <span class="sc">~</span> <span class="st">&quot;35-49&quot;</span>,</span>
-<span id="cb551-128"><a href="exercise-solutions.html#cb551-128" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">64</span> <span class="sc">~</span> <span class="st">&quot;50-64&quot;</span>,</span>
-<span id="cb551-129"><a href="exercise-solutions.html#cb551-129" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">90</span> <span class="sc">~</span> <span class="st">&quot;65 or older&quot;</span>),</span>
-<span id="cb551-130"><a href="exercise-solutions.html#cb551-130" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">fct_reorder</span>(AgeGroup, V3014_num),</span>
-<span id="cb551-131"><a href="exercise-solutions.html#cb551-131" tabindex="-1"></a>    <span class="at">MaritalStatus =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Married&quot;</span>,</span>
-<span id="cb551-132"><a href="exercise-solutions.html#cb551-132" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Widowed&quot;</span>,</span>
-<span id="cb551-133"><a href="exercise-solutions.html#cb551-133" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Divorced&quot;</span>,</span>
-<span id="cb551-134"><a href="exercise-solutions.html#cb551-134" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Separated&quot;</span>,</span>
-<span id="cb551-135"><a href="exercise-solutions.html#cb551-135" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Never married&quot;</span>),</span>
-<span id="cb551-136"><a href="exercise-solutions.html#cb551-136" tabindex="-1"></a>                           <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Never married&quot;</span>, <span class="st">&quot;Married&quot;</span>, </span>
-<span id="cb551-137"><a href="exercise-solutions.html#cb551-137" tabindex="-1"></a>                                      <span class="st">&quot;Widowed&quot;</span>,<span class="st">&quot;Divorced&quot;</span>, </span>
-<span id="cb551-138"><a href="exercise-solutions.html#cb551-138" tabindex="-1"></a>                                      <span class="st">&quot;Separated&quot;</span>))</span>
-<span id="cb551-139"><a href="exercise-solutions.html#cb551-139" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span> </span>
-<span id="cb551-140"><a href="exercise-solutions.html#cb551-140" tabindex="-1"></a>  <span class="fu">left_join</span>(hh_vsum_der <span class="sc">%&gt;%</span> <span class="fu">select</span>(YEARQ, IDHH, </span>
-<span id="cb551-141"><a href="exercise-solutions.html#cb551-141" tabindex="-1"></a>                                   V2117, V2118, Tenure<span class="sc">:</span>Region),</span>
-<span id="cb551-142"><a href="exercise-solutions.html#cb551-142" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>))</span>
-<span id="cb551-143"><a href="exercise-solutions.html#cb551-143" tabindex="-1"></a>hh_vsum_slim <span class="ot">&lt;-</span> hh_vsum_der <span class="sc">%&gt;%</span></span>
-<span id="cb551-144"><a href="exercise-solutions.html#cb551-144" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>V2118,</span>
-<span id="cb551-145"><a href="exercise-solutions.html#cb551-145" tabindex="-1"></a>         WGTVICCY<span class="sc">:</span>ADJINC_WT,</span>
-<span id="cb551-146"><a href="exercise-solutions.html#cb551-146" tabindex="-1"></a>         Tenure,</span>
-<span id="cb551-147"><a href="exercise-solutions.html#cb551-147" tabindex="-1"></a>         Urbanicity,</span>
-<span id="cb551-148"><a href="exercise-solutions.html#cb551-148" tabindex="-1"></a>         Income,</span>
-<span id="cb551-149"><a href="exercise-solutions.html#cb551-149" tabindex="-1"></a>         PlaceSize,</span>
-<span id="cb551-150"><a href="exercise-solutions.html#cb551-150" tabindex="-1"></a>         Region)</span>
-<span id="cb551-151"><a href="exercise-solutions.html#cb551-151" tabindex="-1"></a></span>
-<span id="cb551-152"><a href="exercise-solutions.html#cb551-152" tabindex="-1"></a>pers_vsum_slim <span class="ot">&lt;-</span> pers_vsum_der <span class="sc">%&gt;%</span></span>
-<span id="cb551-153"><a href="exercise-solutions.html#cb551-153" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>WGTPERCY, WGTVICCY<span class="sc">:</span>ADJINC_WT, Sex<span class="sc">:</span>Region)</span>
-<span id="cb551-154"><a href="exercise-solutions.html#cb551-154" tabindex="-1"></a></span>
-<span id="cb551-155"><a href="exercise-solutions.html#cb551-155" tabindex="-1"></a>dummy_records <span class="ot">&lt;-</span> hh_vsum_slim <span class="sc">%&gt;%</span></span>
-<span id="cb551-156"><a href="exercise-solutions.html#cb551-156" tabindex="-1"></a>  <span class="fu">distinct</span>(V2117, V2118) <span class="sc">%&gt;%</span></span>
-<span id="cb551-157"><a href="exercise-solutions.html#cb551-157" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Dummy =</span> <span class="dv">1</span>,</span>
-<span id="cb551-158"><a href="exercise-solutions.html#cb551-158" tabindex="-1"></a>         <span class="at">WGTVICCY =</span> <span class="dv">1</span>,</span>
-<span id="cb551-159"><a href="exercise-solutions.html#cb551-159" tabindex="-1"></a>         <span class="at">NEWWGT =</span> <span class="dv">1</span>)</span>
-<span id="cb551-160"><a href="exercise-solutions.html#cb551-160" tabindex="-1"></a></span>
-<span id="cb551-161"><a href="exercise-solutions.html#cb551-161" tabindex="-1"></a>inc_analysis <span class="ot">&lt;-</span> inc_ind <span class="sc">%&gt;%</span></span>
-<span id="cb551-162"><a href="exercise-solutions.html#cb551-162" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Dummy =</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb551-163"><a href="exercise-solutions.html#cb551-163" tabindex="-1"></a>  <span class="fu">left_join</span>(<span class="fu">select</span>(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex<span class="sc">:</span>Region),</span>
-<span id="cb551-164"><a href="exercise-solutions.html#cb551-164" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb551-165"><a href="exercise-solutions.html#cb551-165" tabindex="-1"></a>  <span class="fu">bind_rows</span>(dummy_records) <span class="sc">%&gt;%</span></span>
-<span id="cb551-166"><a href="exercise-solutions.html#cb551-166" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>IDPER,</span>
-<span id="cb551-167"><a href="exercise-solutions.html#cb551-167" tabindex="-1"></a>         WGTVICCY,</span>
-<span id="cb551-168"><a href="exercise-solutions.html#cb551-168" tabindex="-1"></a>         NEWWGT,</span>
-<span id="cb551-169"><a href="exercise-solutions.html#cb551-169" tabindex="-1"></a>         V4529,</span>
-<span id="cb551-170"><a href="exercise-solutions.html#cb551-170" tabindex="-1"></a>         WeapCat,</span>
-<span id="cb551-171"><a href="exercise-solutions.html#cb551-171" tabindex="-1"></a>         ReportPolice,</span>
-<span id="cb551-172"><a href="exercise-solutions.html#cb551-172" tabindex="-1"></a>         Property<span class="sc">:</span>Region)</span>
-<span id="cb551-173"><a href="exercise-solutions.html#cb551-173" tabindex="-1"></a></span>
-<span id="cb551-174"><a href="exercise-solutions.html#cb551-174" tabindex="-1"></a>inc_des <span class="ot">&lt;-</span> inc_analysis <span class="sc">%&gt;%</span></span>
-<span id="cb551-175"><a href="exercise-solutions.html#cb551-175" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
-<span id="cb551-176"><a href="exercise-solutions.html#cb551-176" tabindex="-1"></a>    <span class="at">weight =</span> NEWWGT,</span>
-<span id="cb551-177"><a href="exercise-solutions.html#cb551-177" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
-<span id="cb551-178"><a href="exercise-solutions.html#cb551-178" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
-<span id="cb551-179"><a href="exercise-solutions.html#cb551-179" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb551-180"><a href="exercise-solutions.html#cb551-180" tabindex="-1"></a>  )</span>
-<span id="cb551-181"><a href="exercise-solutions.html#cb551-181" tabindex="-1"></a></span>
-<span id="cb551-182"><a href="exercise-solutions.html#cb551-182" tabindex="-1"></a>hh_des <span class="ot">&lt;-</span> hh_vsum_slim <span class="sc">%&gt;%</span></span>
-<span id="cb551-183"><a href="exercise-solutions.html#cb551-183" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
-<span id="cb551-184"><a href="exercise-solutions.html#cb551-184" tabindex="-1"></a>    <span class="at">weight =</span> WGTHHCY,</span>
-<span id="cb551-185"><a href="exercise-solutions.html#cb551-185" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
-<span id="cb551-186"><a href="exercise-solutions.html#cb551-186" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
-<span id="cb551-187"><a href="exercise-solutions.html#cb551-187" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb551-188"><a href="exercise-solutions.html#cb551-188" tabindex="-1"></a>  )</span>
-<span id="cb551-189"><a href="exercise-solutions.html#cb551-189" tabindex="-1"></a></span>
-<span id="cb551-190"><a href="exercise-solutions.html#cb551-190" tabindex="-1"></a>pers_des <span class="ot">&lt;-</span> pers_vsum_slim <span class="sc">%&gt;%</span></span>
-<span id="cb551-191"><a href="exercise-solutions.html#cb551-191" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
-<span id="cb551-192"><a href="exercise-solutions.html#cb551-192" tabindex="-1"></a>    <span class="at">weight =</span> WGTPERCY,</span>
-<span id="cb551-193"><a href="exercise-solutions.html#cb551-193" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
-<span id="cb551-194"><a href="exercise-solutions.html#cb551-194" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
-<span id="cb551-195"><a href="exercise-solutions.html#cb551-195" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
-<span id="cb551-196"><a href="exercise-solutions.html#cb551-196" tabindex="-1"></a>  )</span></code></pre></div>
+<p>The chapter exercises use the survey design objects and packages listed in the Prerequisites box at the beginning of the chapter. Ensure they are loaded in the environment before running the exercise solutions. The code chunks that load them are also included below.</p>
+<div class="sourceCode" id="cb546"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb546-1"><a href="exercise-solutions.html#cb546-1" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb546-2"><a href="exercise-solutions.html#cb546-2" tabindex="-1"></a><span class="fu">library</span>(survey)</span>
+<span id="cb546-3"><a href="exercise-solutions.html#cb546-3" tabindex="-1"></a><span class="fu">library</span>(srvyr)</span>
+<span id="cb546-4"><a href="exercise-solutions.html#cb546-4" tabindex="-1"></a><span class="fu">library</span>(srvyrexploR)</span>
+<span id="cb546-5"><a href="exercise-solutions.html#cb546-5" tabindex="-1"></a><span class="fu">library</span>(broom)</span>
+<span id="cb546-6"><a href="exercise-solutions.html#cb546-6" tabindex="-1"></a><span class="fu">library</span>(prettyunits)</span>
+<span id="cb546-7"><a href="exercise-solutions.html#cb546-7" tabindex="-1"></a><span class="fu">library</span>(gt)</span></code></pre></div>
+<div class="sourceCode" id="cb547"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb547-1"><a href="exercise-solutions.html#cb547-1" tabindex="-1"></a>targetpop <span class="ot">&lt;-</span> <span class="dv">231592693</span></span>
+<span id="cb547-2"><a href="exercise-solutions.html#cb547-2" tabindex="-1"></a></span>
+<span id="cb547-3"><a href="exercise-solutions.html#cb547-3" tabindex="-1"></a>anes_adjwgt <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb547-4"><a href="exercise-solutions.html#cb547-4" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Weight =</span> Weight <span class="sc">/</span> <span class="fu">sum</span>(Weight) <span class="sc">*</span> targetpop)</span>
+<span id="cb547-5"><a href="exercise-solutions.html#cb547-5" tabindex="-1"></a></span>
+<span id="cb547-6"><a href="exercise-solutions.html#cb547-6" tabindex="-1"></a>anes_des <span class="ot">&lt;-</span> anes_adjwgt <span class="sc">%&gt;%</span></span>
+<span id="cb547-7"><a href="exercise-solutions.html#cb547-7" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb547-8"><a href="exercise-solutions.html#cb547-8" tabindex="-1"></a>    <span class="at">weights =</span> Weight,</span>
+<span id="cb547-9"><a href="exercise-solutions.html#cb547-9" tabindex="-1"></a>    <span class="at">strata =</span> Stratum,</span>
+<span id="cb547-10"><a href="exercise-solutions.html#cb547-10" tabindex="-1"></a>    <span class="at">ids =</span> VarUnit,</span>
+<span id="cb547-11"><a href="exercise-solutions.html#cb547-11" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb547-12"><a href="exercise-solutions.html#cb547-12" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb548"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb548-1"><a href="exercise-solutions.html#cb548-1" tabindex="-1"></a>recs_des <span class="ot">&lt;-</span> recs_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb548-2"><a href="exercise-solutions.html#cb548-2" tabindex="-1"></a>  <span class="fu">as_survey_rep</span>(</span>
+<span id="cb548-3"><a href="exercise-solutions.html#cb548-3" tabindex="-1"></a>    <span class="at">weights =</span> NWEIGHT,</span>
+<span id="cb548-4"><a href="exercise-solutions.html#cb548-4" tabindex="-1"></a>    <span class="at">repweights =</span> NWEIGHT1<span class="sc">:</span>NWEIGHT60,</span>
+<span id="cb548-5"><a href="exercise-solutions.html#cb548-5" tabindex="-1"></a>    <span class="at">type =</span> <span class="st">&quot;JK1&quot;</span>,</span>
+<span id="cb548-6"><a href="exercise-solutions.html#cb548-6" tabindex="-1"></a>    <span class="at">scale =</span> <span class="dv">59</span><span class="sc">/</span><span class="dv">60</span>,</span>
+<span id="cb548-7"><a href="exercise-solutions.html#cb548-7" tabindex="-1"></a>    <span class="at">mse =</span> <span class="cn">TRUE</span></span>
+<span id="cb548-8"><a href="exercise-solutions.html#cb548-8" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb549"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb549-1"><a href="exercise-solutions.html#cb549-1" tabindex="-1"></a>inc_series <span class="ot">&lt;-</span> ncvs_2021_incident <span class="sc">%&gt;%</span></span>
+<span id="cb549-2"><a href="exercise-solutions.html#cb549-2" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb549-3"><a href="exercise-solutions.html#cb549-3" tabindex="-1"></a>    <span class="at">series =</span> <span class="fu">case_when</span>(V4017 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
+<span id="cb549-4"><a href="exercise-solutions.html#cb549-4" tabindex="-1"></a>                       V4018 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">2</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
+<span id="cb549-5"><a href="exercise-solutions.html#cb549-5" tabindex="-1"></a>                       V4019 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">1</span>, <span class="dv">8</span>) <span class="sc">~</span> <span class="dv">1</span>,</span>
+<span id="cb549-6"><a href="exercise-solutions.html#cb549-6" tabindex="-1"></a>                       <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">2</span></span>
+<span id="cb549-7"><a href="exercise-solutions.html#cb549-7" tabindex="-1"></a>    ),</span>
+<span id="cb549-8"><a href="exercise-solutions.html#cb549-8" tabindex="-1"></a>    <span class="at">n10v4016 =</span> <span class="fu">case_when</span>(V4016 <span class="sc">%in%</span> <span class="fu">c</span>(<span class="dv">997</span>, <span class="dv">998</span>) <span class="sc">~</span> <span class="cn">NA_real_</span>,</span>
+<span id="cb549-9"><a href="exercise-solutions.html#cb549-9" tabindex="-1"></a>                         V4016 <span class="sc">&gt;</span> <span class="dv">10</span> <span class="sc">~</span> <span class="dv">10</span>,</span>
+<span id="cb549-10"><a href="exercise-solutions.html#cb549-10" tabindex="-1"></a>                         <span class="cn">TRUE</span> <span class="sc">~</span> V4016),</span>
+<span id="cb549-11"><a href="exercise-solutions.html#cb549-11" tabindex="-1"></a>    <span class="at">serieswgt =</span> <span class="fu">case_when</span>(series <span class="sc">==</span> <span class="dv">2</span> <span class="sc">&amp;</span> <span class="fu">is.na</span>(n10v4016) <span class="sc">~</span> <span class="dv">6</span>,</span>
+<span id="cb549-12"><a href="exercise-solutions.html#cb549-12" tabindex="-1"></a>                          series <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> n10v4016,</span>
+<span id="cb549-13"><a href="exercise-solutions.html#cb549-13" tabindex="-1"></a>                          <span class="cn">TRUE</span> <span class="sc">~</span> <span class="dv">1</span>),</span>
+<span id="cb549-14"><a href="exercise-solutions.html#cb549-14" tabindex="-1"></a>    <span class="at">NEWWGT =</span> WGTVICCY <span class="sc">*</span> serieswgt</span>
+<span id="cb549-15"><a href="exercise-solutions.html#cb549-15" tabindex="-1"></a>  )</span>
+<span id="cb549-16"><a href="exercise-solutions.html#cb549-16" tabindex="-1"></a></span>
+<span id="cb549-17"><a href="exercise-solutions.html#cb549-17" tabindex="-1"></a>inc_ind <span class="ot">&lt;-</span> inc_series <span class="sc">%&gt;%</span></span>
+<span id="cb549-18"><a href="exercise-solutions.html#cb549-18" tabindex="-1"></a>  <span class="fu">filter</span>(V4022 <span class="sc">!=</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb549-19"><a href="exercise-solutions.html#cb549-19" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb549-20"><a href="exercise-solutions.html#cb549-20" tabindex="-1"></a>    <span class="at">WeapCat =</span> <span class="fu">case_when</span>(</span>
+<span id="cb549-21"><a href="exercise-solutions.html#cb549-21" tabindex="-1"></a>      <span class="fu">is.na</span>(V4049) <span class="sc">~</span> <span class="cn">NA_character_</span>,</span>
+<span id="cb549-22"><a href="exercise-solutions.html#cb549-22" tabindex="-1"></a>      V4049 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;NoWeap&quot;</span>,</span>
+<span id="cb549-23"><a href="exercise-solutions.html#cb549-23" tabindex="-1"></a>      V4049 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;UnkWeapUse&quot;</span>,</span>
+<span id="cb549-24"><a href="exercise-solutions.html#cb549-24" tabindex="-1"></a>      V4050 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>,</span>
+<span id="cb549-25"><a href="exercise-solutions.html#cb549-25" tabindex="-1"></a>      V4051 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4052 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4050 <span class="sc">==</span> <span class="dv">7</span> <span class="sc">~</span> <span class="st">&quot;Firearm&quot;</span>,</span>
+<span id="cb549-26"><a href="exercise-solutions.html#cb549-26" tabindex="-1"></a>      V4053 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">|</span> V4054 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Knife&quot;</span>,</span>
+<span id="cb549-27"><a href="exercise-solutions.html#cb549-27" tabindex="-1"></a>      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span></span>
+<span id="cb549-28"><a href="exercise-solutions.html#cb549-28" tabindex="-1"></a>    ),</span>
+<span id="cb549-29"><a href="exercise-solutions.html#cb549-29" tabindex="-1"></a>    <span class="at">V4529_num =</span> <span class="fu">parse_number</span>(<span class="fu">as.character</span>(V4529)),</span>
+<span id="cb549-30"><a href="exercise-solutions.html#cb549-30" tabindex="-1"></a>    <span class="at">ReportPolice =</span> V4399 <span class="sc">==</span> <span class="dv">1</span>,</span>
+<span id="cb549-31"><a href="exercise-solutions.html#cb549-31" tabindex="-1"></a>    <span class="at">Property =</span> V4529_num <span class="sc">&gt;=</span> <span class="dv">31</span>,</span>
+<span id="cb549-32"><a href="exercise-solutions.html#cb549-32" tabindex="-1"></a>    <span class="at">Violent =</span> V4529_num <span class="sc">&lt;=</span> <span class="dv">20</span>,</span>
+<span id="cb549-33"><a href="exercise-solutions.html#cb549-33" tabindex="-1"></a>    <span class="at">Property_ReportPolice =</span> Property <span class="sc">&amp;</span> ReportPolice,</span>
+<span id="cb549-34"><a href="exercise-solutions.html#cb549-34" tabindex="-1"></a>    <span class="at">Violent_ReportPolice =</span> Violent <span class="sc">&amp;</span> ReportPolice,</span>
+<span id="cb549-35"><a href="exercise-solutions.html#cb549-35" tabindex="-1"></a>    <span class="at">AAST =</span> V4529_num <span class="sc">%in%</span> <span class="dv">11</span><span class="sc">:</span><span class="dv">13</span>,</span>
+<span id="cb549-36"><a href="exercise-solutions.html#cb549-36" tabindex="-1"></a>    <span class="at">AAST_NoWeap =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;NoWeap&quot;</span>,</span>
+<span id="cb549-37"><a href="exercise-solutions.html#cb549-37" tabindex="-1"></a>    <span class="at">AAST_Firearm =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Firearm&quot;</span>,</span>
+<span id="cb549-38"><a href="exercise-solutions.html#cb549-38" tabindex="-1"></a>    <span class="at">AAST_Knife =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Knife&quot;</span>,</span>
+<span id="cb549-39"><a href="exercise-solutions.html#cb549-39" tabindex="-1"></a>    <span class="at">AAST_Other =</span> AAST <span class="sc">&amp;</span> WeapCat <span class="sc">==</span> <span class="st">&quot;Other&quot;</span></span>
+<span id="cb549-40"><a href="exercise-solutions.html#cb549-40" tabindex="-1"></a>  )</span>
+<span id="cb549-41"><a href="exercise-solutions.html#cb549-41" tabindex="-1"></a>inc_hh_sums <span class="ot">&lt;-</span></span>
+<span id="cb549-42"><a href="exercise-solutions.html#cb549-42" tabindex="-1"></a>  inc_ind <span class="sc">%&gt;%</span></span>
+<span id="cb549-43"><a href="exercise-solutions.html#cb549-43" tabindex="-1"></a>  <span class="fu">filter</span>(V4529_num <span class="sc">&gt;</span> <span class="dv">23</span>) <span class="sc">%&gt;%</span> <span class="co"># restrict to household crimes</span></span>
+<span id="cb549-44"><a href="exercise-solutions.html#cb549-44" tabindex="-1"></a>  <span class="fu">group_by</span>(YEARQ, IDHH) <span class="sc">%&gt;%</span></span>
+<span id="cb549-45"><a href="exercise-solutions.html#cb549-45" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">WGTVICCY =</span> WGTVICCY[<span class="dv">1</span>],</span>
+<span id="cb549-46"><a href="exercise-solutions.html#cb549-46" tabindex="-1"></a>            <span class="fu">across</span>(<span class="fu">starts_with</span>(<span class="st">&quot;Property&quot;</span>), </span>
+<span id="cb549-47"><a href="exercise-solutions.html#cb549-47" tabindex="-1"></a>                   <span class="sc">~</span> <span class="fu">sum</span>(. <span class="sc">*</span> serieswgt),</span>
+<span id="cb549-48"><a href="exercise-solutions.html#cb549-48" tabindex="-1"></a>                   <span class="at">.names =</span> <span class="st">&quot;{.col}&quot;</span>),</span>
+<span id="cb549-49"><a href="exercise-solutions.html#cb549-49" tabindex="-1"></a>            <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
+<span id="cb549-50"><a href="exercise-solutions.html#cb549-50" tabindex="-1"></a></span>
+<span id="cb549-51"><a href="exercise-solutions.html#cb549-51" tabindex="-1"></a>inc_pers_sums <span class="ot">&lt;-</span></span>
+<span id="cb549-52"><a href="exercise-solutions.html#cb549-52" tabindex="-1"></a>  inc_ind <span class="sc">%&gt;%</span></span>
+<span id="cb549-53"><a href="exercise-solutions.html#cb549-53" tabindex="-1"></a>  <span class="fu">filter</span>(V4529_num <span class="sc">&lt;=</span> <span class="dv">23</span>) <span class="sc">%&gt;%</span> <span class="co"># restrict to person crimes</span></span>
+<span id="cb549-54"><a href="exercise-solutions.html#cb549-54" tabindex="-1"></a>  <span class="fu">group_by</span>(YEARQ, IDHH, IDPER) <span class="sc">%&gt;%</span></span>
+<span id="cb549-55"><a href="exercise-solutions.html#cb549-55" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">WGTVICCY =</span> WGTVICCY[<span class="dv">1</span>],</span>
+<span id="cb549-56"><a href="exercise-solutions.html#cb549-56" tabindex="-1"></a>            <span class="fu">across</span>(<span class="fu">c</span>(<span class="fu">starts_with</span>(<span class="st">&quot;Violent&quot;</span>), <span class="fu">starts_with</span>(<span class="st">&quot;AAST&quot;</span>)),</span>
+<span id="cb549-57"><a href="exercise-solutions.html#cb549-57" tabindex="-1"></a>                   <span class="sc">~</span> <span class="fu">sum</span>(. <span class="sc">*</span> serieswgt), </span>
+<span id="cb549-58"><a href="exercise-solutions.html#cb549-58" tabindex="-1"></a>                   <span class="at">.names =</span> <span class="st">&quot;{.col}&quot;</span>),</span>
+<span id="cb549-59"><a href="exercise-solutions.html#cb549-59" tabindex="-1"></a>            <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
+<span id="cb549-60"><a href="exercise-solutions.html#cb549-60" tabindex="-1"></a></span>
+<span id="cb549-61"><a href="exercise-solutions.html#cb549-61" tabindex="-1"></a>hh_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_hh_sums) <span class="sc">-</span> <span class="dv">3</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb549-62"><a href="exercise-solutions.html#cb549-62" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_hh_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">3</span>)])</span>
+<span id="cb549-63"><a href="exercise-solutions.html#cb549-63" tabindex="-1"></a>pers_z_list <span class="ot">&lt;-</span> <span class="fu">rep</span>(<span class="dv">0</span>, <span class="fu">ncol</span>(inc_pers_sums) <span class="sc">-</span> <span class="dv">4</span>) <span class="sc">%&gt;%</span> <span class="fu">as.list</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb549-64"><a href="exercise-solutions.html#cb549-64" tabindex="-1"></a>  <span class="fu">setNames</span>(<span class="fu">names</span>(inc_pers_sums)[<span class="sc">-</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">4</span>)])</span>
+<span id="cb549-65"><a href="exercise-solutions.html#cb549-65" tabindex="-1"></a></span>
+<span id="cb549-66"><a href="exercise-solutions.html#cb549-66" tabindex="-1"></a>hh_vsum <span class="ot">&lt;-</span> ncvs_2021_household <span class="sc">%&gt;%</span></span>
+<span id="cb549-67"><a href="exercise-solutions.html#cb549-67" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_hh_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb549-68"><a href="exercise-solutions.html#cb549-68" tabindex="-1"></a>  <span class="fu">replace_na</span>(hh_z_list) <span class="sc">%&gt;%</span></span>
+<span id="cb549-69"><a href="exercise-solutions.html#cb549-69" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTHHCY))</span>
+<span id="cb549-70"><a href="exercise-solutions.html#cb549-70" tabindex="-1"></a></span>
+<span id="cb549-71"><a href="exercise-solutions.html#cb549-71" tabindex="-1"></a>pers_vsum <span class="ot">&lt;-</span> ncvs_2021_person <span class="sc">%&gt;%</span></span>
+<span id="cb549-72"><a href="exercise-solutions.html#cb549-72" tabindex="-1"></a>  <span class="fu">full_join</span>(inc_pers_sums, <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb549-73"><a href="exercise-solutions.html#cb549-73" tabindex="-1"></a>  <span class="fu">replace_na</span>(pers_z_list) <span class="sc">%&gt;%</span></span>
+<span id="cb549-74"><a href="exercise-solutions.html#cb549-74" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">ADJINC_WT =</span> <span class="fu">if_else</span>(<span class="fu">is.na</span>(WGTVICCY), <span class="dv">0</span>, WGTVICCY <span class="sc">/</span> WGTPERCY))</span>
+<span id="cb549-75"><a href="exercise-solutions.html#cb549-75" tabindex="-1"></a></span>
+<span id="cb549-76"><a href="exercise-solutions.html#cb549-76" tabindex="-1"></a>hh_vsum_der <span class="ot">&lt;-</span> hh_vsum <span class="sc">%&gt;%</span></span>
+<span id="cb549-77"><a href="exercise-solutions.html#cb549-77" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb549-78"><a href="exercise-solutions.html#cb549-78" tabindex="-1"></a>    <span class="at">Tenure =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V2015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Owned&quot;</span>, </span>
+<span id="cb549-79"><a href="exercise-solutions.html#cb549-79" tabindex="-1"></a>                              <span class="sc">!</span><span class="fu">is.na</span>(V2015) <span class="sc">~</span> <span class="st">&quot;Rented&quot;</span>),</span>
+<span id="cb549-80"><a href="exercise-solutions.html#cb549-80" tabindex="-1"></a>                    <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Owned&quot;</span>, <span class="st">&quot;Rented&quot;</span>)),</span>
+<span id="cb549-81"><a href="exercise-solutions.html#cb549-81" tabindex="-1"></a>    <span class="at">Urbanicity =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V2143 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Urban&quot;</span>,</span>
+<span id="cb549-82"><a href="exercise-solutions.html#cb549-82" tabindex="-1"></a>                                  V2143 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Suburban&quot;</span>,</span>
+<span id="cb549-83"><a href="exercise-solutions.html#cb549-83" tabindex="-1"></a>                                  V2143 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Rural&quot;</span>),</span>
+<span id="cb549-84"><a href="exercise-solutions.html#cb549-84" tabindex="-1"></a>                        <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Urban&quot;</span>, <span class="st">&quot;Suburban&quot;</span>, <span class="st">&quot;Rural&quot;</span>)),</span>
+<span id="cb549-85"><a href="exercise-solutions.html#cb549-85" tabindex="-1"></a>    <span class="at">SC214A_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(SC214A)),</span>
+<span id="cb549-86"><a href="exercise-solutions.html#cb549-86" tabindex="-1"></a>    <span class="at">Income =</span> <span class="fu">case_when</span>(SC214A_num <span class="sc">&lt;=</span> <span class="dv">8</span> <span class="sc">~</span> <span class="st">&quot;Less than $25,000&quot;</span>,</span>
+<span id="cb549-87"><a href="exercise-solutions.html#cb549-87" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">12</span> <span class="sc">~</span> <span class="st">&quot;$25,000-49,999&quot;</span>,</span>
+<span id="cb549-88"><a href="exercise-solutions.html#cb549-88" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">15</span> <span class="sc">~</span> <span class="st">&quot;$50,000-99,999&quot;</span>,</span>
+<span id="cb549-89"><a href="exercise-solutions.html#cb549-89" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;$100,000-199,999&quot;</span>,</span>
+<span id="cb549-90"><a href="exercise-solutions.html#cb549-90" tabindex="-1"></a>                       SC214A_num <span class="sc">&lt;=</span> <span class="dv">18</span> <span class="sc">~</span> <span class="st">&quot;$200,000 or more&quot;</span>),</span>
+<span id="cb549-91"><a href="exercise-solutions.html#cb549-91" tabindex="-1"></a>    <span class="at">Income =</span> <span class="fu">fct_reorder</span>(Income, SC214A_num, <span class="at">.na_rm =</span> <span class="cn">FALSE</span>),</span>
+<span id="cb549-92"><a href="exercise-solutions.html#cb549-92" tabindex="-1"></a>    <span class="at">PlaceSize =</span> <span class="fu">case_match</span>(<span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V2126B)),</span>
+<span id="cb549-93"><a href="exercise-solutions.html#cb549-93" tabindex="-1"></a>                           <span class="dv">0</span> <span class="sc">~</span> <span class="st">&quot;Not in a place&quot;</span>,</span>
+<span id="cb549-94"><a href="exercise-solutions.html#cb549-94" tabindex="-1"></a>                           <span class="dv">13</span> <span class="sc">~</span> <span class="st">&quot;Under 10,000&quot;</span>,</span>
+<span id="cb549-95"><a href="exercise-solutions.html#cb549-95" tabindex="-1"></a>                           <span class="dv">16</span> <span class="sc">~</span> <span class="st">&quot;10,000-49,999&quot;</span>,</span>
+<span id="cb549-96"><a href="exercise-solutions.html#cb549-96" tabindex="-1"></a>                           <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;50,000-99,999&quot;</span>,</span>
+<span id="cb549-97"><a href="exercise-solutions.html#cb549-97" tabindex="-1"></a>                           <span class="dv">18</span> <span class="sc">~</span> <span class="st">&quot;100,000-249,999&quot;</span>,</span>
+<span id="cb549-98"><a href="exercise-solutions.html#cb549-98" tabindex="-1"></a>                           <span class="dv">19</span> <span class="sc">~</span> <span class="st">&quot;250,000-499,999&quot;</span>,</span>
+<span id="cb549-99"><a href="exercise-solutions.html#cb549-99" tabindex="-1"></a>                           <span class="dv">20</span> <span class="sc">~</span> <span class="st">&quot;500,000-999,999&quot;</span>,</span>
+<span id="cb549-100"><a href="exercise-solutions.html#cb549-100" tabindex="-1"></a>                           <span class="fu">c</span>(<span class="dv">21</span>, <span class="dv">22</span>, <span class="dv">23</span>) <span class="sc">~</span> <span class="st">&quot;1,000,000 or more&quot;</span>),</span>
+<span id="cb549-101"><a href="exercise-solutions.html#cb549-101" tabindex="-1"></a>    <span class="at">PlaceSize =</span> <span class="fu">fct_reorder</span>(PlaceSize, <span class="fu">as.numeric</span>(V2126B)),</span>
+<span id="cb549-102"><a href="exercise-solutions.html#cb549-102" tabindex="-1"></a>    <span class="at">Region =</span> <span class="fu">case_match</span>(<span class="fu">as.numeric</span>(V2127B),</span>
+<span id="cb549-103"><a href="exercise-solutions.html#cb549-103" tabindex="-1"></a>                        <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Northeast&quot;</span>,</span>
+<span id="cb549-104"><a href="exercise-solutions.html#cb549-104" tabindex="-1"></a>                        <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Midwest&quot;</span>,</span>
+<span id="cb549-105"><a href="exercise-solutions.html#cb549-105" tabindex="-1"></a>                        <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;South&quot;</span>,</span>
+<span id="cb549-106"><a href="exercise-solutions.html#cb549-106" tabindex="-1"></a>                        <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;West&quot;</span>),</span>
+<span id="cb549-107"><a href="exercise-solutions.html#cb549-107" tabindex="-1"></a>    <span class="at">Region =</span> <span class="fu">fct_reorder</span>(Region, <span class="fu">as.numeric</span>(V2127B))</span>
+<span id="cb549-108"><a href="exercise-solutions.html#cb549-108" tabindex="-1"></a>  )</span>
+<span id="cb549-109"><a href="exercise-solutions.html#cb549-109" tabindex="-1"></a>NHOPI <span class="ot">&lt;-</span> <span class="st">&quot;Native Hawaiian or Other Pacific Islander&quot;</span></span>
+<span id="cb549-110"><a href="exercise-solutions.html#cb549-110" tabindex="-1"></a></span>
+<span id="cb549-111"><a href="exercise-solutions.html#cb549-111" tabindex="-1"></a>pers_vsum_der <span class="ot">&lt;-</span> pers_vsum <span class="sc">%&gt;%</span></span>
+<span id="cb549-112"><a href="exercise-solutions.html#cb549-112" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb549-113"><a href="exercise-solutions.html#cb549-113" tabindex="-1"></a>    <span class="at">Sex =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3018 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Male&quot;</span>,</span>
+<span id="cb549-114"><a href="exercise-solutions.html#cb549-114" tabindex="-1"></a>                           V3018 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Female&quot;</span>)),</span>
+<span id="cb549-115"><a href="exercise-solutions.html#cb549-115" tabindex="-1"></a>    <span class="at">RaceHispOrigin =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3024 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Hispanic&quot;</span>,</span>
+<span id="cb549-116"><a href="exercise-solutions.html#cb549-116" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;White&quot;</span>,</span>
+<span id="cb549-117"><a href="exercise-solutions.html#cb549-117" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Black&quot;</span>,</span>
+<span id="cb549-118"><a href="exercise-solutions.html#cb549-118" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Asian&quot;</span>,</span>
+<span id="cb549-119"><a href="exercise-solutions.html#cb549-119" tabindex="-1"></a>                                      V3023A <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> NHOPI,</span>
+<span id="cb549-120"><a href="exercise-solutions.html#cb549-120" tabindex="-1"></a>                                      <span class="cn">TRUE</span> <span class="sc">~</span> <span class="st">&quot;Other&quot;</span>),</span>
+<span id="cb549-121"><a href="exercise-solutions.html#cb549-121" tabindex="-1"></a>                            <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;White&quot;</span>, <span class="st">&quot;Black&quot;</span>, <span class="st">&quot;Hispanic&quot;</span>, </span>
+<span id="cb549-122"><a href="exercise-solutions.html#cb549-122" tabindex="-1"></a>                                       <span class="st">&quot;Asian&quot;</span>, NHOPI, <span class="st">&quot;Other&quot;</span>)),</span>
+<span id="cb549-123"><a href="exercise-solutions.html#cb549-123" tabindex="-1"></a>    <span class="at">V3014_num =</span> <span class="fu">as.numeric</span>(<span class="fu">as.character</span>(V3014)),</span>
+<span id="cb549-124"><a href="exercise-solutions.html#cb549-124" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">case_when</span>(V3014_num <span class="sc">&lt;=</span> <span class="dv">17</span> <span class="sc">~</span> <span class="st">&quot;12-17&quot;</span>,</span>
+<span id="cb549-125"><a href="exercise-solutions.html#cb549-125" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">24</span> <span class="sc">~</span> <span class="st">&quot;18-24&quot;</span>,</span>
+<span id="cb549-126"><a href="exercise-solutions.html#cb549-126" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">34</span> <span class="sc">~</span> <span class="st">&quot;25-34&quot;</span>,</span>
+<span id="cb549-127"><a href="exercise-solutions.html#cb549-127" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">49</span> <span class="sc">~</span> <span class="st">&quot;35-49&quot;</span>,</span>
+<span id="cb549-128"><a href="exercise-solutions.html#cb549-128" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">64</span> <span class="sc">~</span> <span class="st">&quot;50-64&quot;</span>,</span>
+<span id="cb549-129"><a href="exercise-solutions.html#cb549-129" tabindex="-1"></a>                         V3014_num <span class="sc">&lt;=</span> <span class="dv">90</span> <span class="sc">~</span> <span class="st">&quot;65 or older&quot;</span>),</span>
+<span id="cb549-130"><a href="exercise-solutions.html#cb549-130" tabindex="-1"></a>    <span class="at">AgeGroup =</span> <span class="fu">fct_reorder</span>(AgeGroup, V3014_num),</span>
+<span id="cb549-131"><a href="exercise-solutions.html#cb549-131" tabindex="-1"></a>    <span class="at">MaritalStatus =</span> <span class="fu">factor</span>(<span class="fu">case_when</span>(V3015 <span class="sc">==</span> <span class="dv">1</span> <span class="sc">~</span> <span class="st">&quot;Married&quot;</span>,</span>
+<span id="cb549-132"><a href="exercise-solutions.html#cb549-132" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">2</span> <span class="sc">~</span> <span class="st">&quot;Widowed&quot;</span>,</span>
+<span id="cb549-133"><a href="exercise-solutions.html#cb549-133" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">3</span> <span class="sc">~</span> <span class="st">&quot;Divorced&quot;</span>,</span>
+<span id="cb549-134"><a href="exercise-solutions.html#cb549-134" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">4</span> <span class="sc">~</span> <span class="st">&quot;Separated&quot;</span>,</span>
+<span id="cb549-135"><a href="exercise-solutions.html#cb549-135" tabindex="-1"></a>                                     V3015 <span class="sc">==</span> <span class="dv">5</span> <span class="sc">~</span> <span class="st">&quot;Never married&quot;</span>),</span>
+<span id="cb549-136"><a href="exercise-solutions.html#cb549-136" tabindex="-1"></a>                           <span class="at">levels =</span> <span class="fu">c</span>(<span class="st">&quot;Never married&quot;</span>, <span class="st">&quot;Married&quot;</span>, </span>
+<span id="cb549-137"><a href="exercise-solutions.html#cb549-137" tabindex="-1"></a>                                      <span class="st">&quot;Widowed&quot;</span>, <span class="st">&quot;Divorced&quot;</span>, </span>
+<span id="cb549-138"><a href="exercise-solutions.html#cb549-138" tabindex="-1"></a>                                      <span class="st">&quot;Separated&quot;</span>))</span>
+<span id="cb549-139"><a href="exercise-solutions.html#cb549-139" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span> </span>
+<span id="cb549-140"><a href="exercise-solutions.html#cb549-140" tabindex="-1"></a>  <span class="fu">left_join</span>(hh_vsum_der <span class="sc">%&gt;%</span> <span class="fu">select</span>(YEARQ, IDHH, </span>
+<span id="cb549-141"><a href="exercise-solutions.html#cb549-141" tabindex="-1"></a>                                   V2117, V2118, Tenure<span class="sc">:</span>Region),</span>
+<span id="cb549-142"><a href="exercise-solutions.html#cb549-142" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>))</span>
+<span id="cb549-143"><a href="exercise-solutions.html#cb549-143" tabindex="-1"></a>hh_vsum_slim <span class="ot">&lt;-</span> hh_vsum_der <span class="sc">%&gt;%</span></span>
+<span id="cb549-144"><a href="exercise-solutions.html#cb549-144" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>V2118,</span>
+<span id="cb549-145"><a href="exercise-solutions.html#cb549-145" tabindex="-1"></a>         WGTVICCY<span class="sc">:</span>ADJINC_WT,</span>
+<span id="cb549-146"><a href="exercise-solutions.html#cb549-146" tabindex="-1"></a>         Tenure,</span>
+<span id="cb549-147"><a href="exercise-solutions.html#cb549-147" tabindex="-1"></a>         Urbanicity,</span>
+<span id="cb549-148"><a href="exercise-solutions.html#cb549-148" tabindex="-1"></a>         Income,</span>
+<span id="cb549-149"><a href="exercise-solutions.html#cb549-149" tabindex="-1"></a>         PlaceSize,</span>
+<span id="cb549-150"><a href="exercise-solutions.html#cb549-150" tabindex="-1"></a>         Region)</span>
+<span id="cb549-151"><a href="exercise-solutions.html#cb549-151" tabindex="-1"></a></span>
+<span id="cb549-152"><a href="exercise-solutions.html#cb549-152" tabindex="-1"></a>pers_vsum_slim <span class="ot">&lt;-</span> pers_vsum_der <span class="sc">%&gt;%</span></span>
+<span id="cb549-153"><a href="exercise-solutions.html#cb549-153" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>WGTPERCY, WGTVICCY<span class="sc">:</span>ADJINC_WT, Sex<span class="sc">:</span>Region)</span>
+<span id="cb549-154"><a href="exercise-solutions.html#cb549-154" tabindex="-1"></a></span>
+<span id="cb549-155"><a href="exercise-solutions.html#cb549-155" tabindex="-1"></a>dummy_records <span class="ot">&lt;-</span> hh_vsum_slim <span class="sc">%&gt;%</span></span>
+<span id="cb549-156"><a href="exercise-solutions.html#cb549-156" tabindex="-1"></a>  <span class="fu">distinct</span>(V2117, V2118) <span class="sc">%&gt;%</span></span>
+<span id="cb549-157"><a href="exercise-solutions.html#cb549-157" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Dummy =</span> <span class="dv">1</span>,</span>
+<span id="cb549-158"><a href="exercise-solutions.html#cb549-158" tabindex="-1"></a>         <span class="at">WGTVICCY =</span> <span class="dv">1</span>,</span>
+<span id="cb549-159"><a href="exercise-solutions.html#cb549-159" tabindex="-1"></a>         <span class="at">NEWWGT =</span> <span class="dv">1</span>)</span>
+<span id="cb549-160"><a href="exercise-solutions.html#cb549-160" tabindex="-1"></a></span>
+<span id="cb549-161"><a href="exercise-solutions.html#cb549-161" tabindex="-1"></a>inc_analysis <span class="ot">&lt;-</span> inc_ind <span class="sc">%&gt;%</span></span>
+<span id="cb549-162"><a href="exercise-solutions.html#cb549-162" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Dummy =</span> <span class="dv">0</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb549-163"><a href="exercise-solutions.html#cb549-163" tabindex="-1"></a>  <span class="fu">left_join</span>(<span class="fu">select</span>(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex<span class="sc">:</span>Region),</span>
+<span id="cb549-164"><a href="exercise-solutions.html#cb549-164" tabindex="-1"></a>            <span class="at">by =</span> <span class="fu">c</span>(<span class="st">&quot;YEARQ&quot;</span>, <span class="st">&quot;IDHH&quot;</span>, <span class="st">&quot;IDPER&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb549-165"><a href="exercise-solutions.html#cb549-165" tabindex="-1"></a>  <span class="fu">bind_rows</span>(dummy_records) <span class="sc">%&gt;%</span></span>
+<span id="cb549-166"><a href="exercise-solutions.html#cb549-166" tabindex="-1"></a>  <span class="fu">select</span>(YEARQ<span class="sc">:</span>IDPER,</span>
+<span id="cb549-167"><a href="exercise-solutions.html#cb549-167" tabindex="-1"></a>         WGTVICCY,</span>
+<span id="cb549-168"><a href="exercise-solutions.html#cb549-168" tabindex="-1"></a>         NEWWGT,</span>
+<span id="cb549-169"><a href="exercise-solutions.html#cb549-169" tabindex="-1"></a>         V4529,</span>
+<span id="cb549-170"><a href="exercise-solutions.html#cb549-170" tabindex="-1"></a>         WeapCat,</span>
+<span id="cb549-171"><a href="exercise-solutions.html#cb549-171" tabindex="-1"></a>         ReportPolice,</span>
+<span id="cb549-172"><a href="exercise-solutions.html#cb549-172" tabindex="-1"></a>         Property<span class="sc">:</span>Region)</span>
+<span id="cb549-173"><a href="exercise-solutions.html#cb549-173" tabindex="-1"></a></span>
+<span id="cb549-174"><a href="exercise-solutions.html#cb549-174" tabindex="-1"></a>inc_des <span class="ot">&lt;-</span> inc_analysis <span class="sc">%&gt;%</span></span>
+<span id="cb549-175"><a href="exercise-solutions.html#cb549-175" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
+<span id="cb549-176"><a href="exercise-solutions.html#cb549-176" tabindex="-1"></a>    <span class="at">weight =</span> NEWWGT,</span>
+<span id="cb549-177"><a href="exercise-solutions.html#cb549-177" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
+<span id="cb549-178"><a href="exercise-solutions.html#cb549-178" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
+<span id="cb549-179"><a href="exercise-solutions.html#cb549-179" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb549-180"><a href="exercise-solutions.html#cb549-180" tabindex="-1"></a>  )</span>
+<span id="cb549-181"><a href="exercise-solutions.html#cb549-181" tabindex="-1"></a></span>
+<span id="cb549-182"><a href="exercise-solutions.html#cb549-182" tabindex="-1"></a>hh_des <span class="ot">&lt;-</span> hh_vsum_slim <span class="sc">%&gt;%</span></span>
+<span id="cb549-183"><a href="exercise-solutions.html#cb549-183" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
+<span id="cb549-184"><a href="exercise-solutions.html#cb549-184" tabindex="-1"></a>    <span class="at">weight =</span> WGTHHCY,</span>
+<span id="cb549-185"><a href="exercise-solutions.html#cb549-185" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
+<span id="cb549-186"><a href="exercise-solutions.html#cb549-186" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
+<span id="cb549-187"><a href="exercise-solutions.html#cb549-187" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb549-188"><a href="exercise-solutions.html#cb549-188" tabindex="-1"></a>  )</span>
+<span id="cb549-189"><a href="exercise-solutions.html#cb549-189" tabindex="-1"></a></span>
+<span id="cb549-190"><a href="exercise-solutions.html#cb549-190" tabindex="-1"></a>pers_des <span class="ot">&lt;-</span> pers_vsum_slim <span class="sc">%&gt;%</span></span>
+<span id="cb549-191"><a href="exercise-solutions.html#cb549-191" tabindex="-1"></a>  <span class="fu">as_survey</span>(</span>
+<span id="cb549-192"><a href="exercise-solutions.html#cb549-192" tabindex="-1"></a>    <span class="at">weight =</span> WGTPERCY,</span>
+<span id="cb549-193"><a href="exercise-solutions.html#cb549-193" tabindex="-1"></a>    <span class="at">strata =</span> V2117,</span>
+<span id="cb549-194"><a href="exercise-solutions.html#cb549-194" tabindex="-1"></a>    <span class="at">ids =</span> V2118,</span>
+<span id="cb549-195"><a href="exercise-solutions.html#cb549-195" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span></span>
+<span id="cb549-196"><a href="exercise-solutions.html#cb549-196" tabindex="-1"></a>  )</span></code></pre></div>
+<p>The chapter exercises use the survey design objects and packages listed in the Prerequisites box at the beginning of the chapter. Make sure they are loaded in your environment before running the exercise solutions.</p>
 <div id="descriptive-analysis" class="section level2 unnumbered hasAnchor">
 <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysis" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
 <li>How many females have a graduate degree? Hint: the variables <code>Gender</code> and <code>Education</code> will be useful.</li>
 </ol>
-<div class="sourceCode" id="cb552"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb552-1"><a href="exercise-solutions.html#cb552-1" tabindex="-1"></a><span class="co"># Option 1:</span></span>
-<span id="cb552-2"><a href="exercise-solutions.html#cb552-2" tabindex="-1"></a>femgd_option1 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb552-3"><a href="exercise-solutions.html#cb552-3" tabindex="-1"></a>  <span class="fu">filter</span>(Gender <span class="sc">==</span> <span class="st">&quot;Female&quot;</span>, Education <span class="sc">==</span> <span class="st">&quot;Graduate&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb552-4"><a href="exercise-solutions.html#cb552-4" tabindex="-1"></a>  <span class="fu">survey_count</span>(<span class="at">name =</span> <span class="st">&quot;n&quot;</span>)</span>
-<span id="cb552-5"><a href="exercise-solutions.html#cb552-5" tabindex="-1"></a></span>
-<span id="cb552-6"><a href="exercise-solutions.html#cb552-6" tabindex="-1"></a>femgd_option1</span></code></pre></div>
+<div class="sourceCode" id="cb550"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb550-1"><a href="exercise-solutions.html#cb550-1" tabindex="-1"></a><span class="co"># Option 1:</span></span>
+<span id="cb550-2"><a href="exercise-solutions.html#cb550-2" tabindex="-1"></a>femgd_option1 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb550-3"><a href="exercise-solutions.html#cb550-3" tabindex="-1"></a>  <span class="fu">filter</span>(Gender <span class="sc">==</span> <span class="st">&quot;Female&quot;</span>, Education <span class="sc">==</span> <span class="st">&quot;Graduate&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb550-4"><a href="exercise-solutions.html#cb550-4" tabindex="-1"></a>  <span class="fu">survey_count</span>(<span class="at">name =</span> <span class="st">&quot;n&quot;</span>)</span>
+<span id="cb550-5"><a href="exercise-solutions.html#cb550-5" tabindex="-1"></a></span>
+<span id="cb550-6"><a href="exercise-solutions.html#cb550-6" tabindex="-1"></a>femgd_option1</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##           n    n_se
 ##       &lt;dbl&gt;   &lt;dbl&gt;
 ## 1 15072196. 837872.</code></pre>
-<div class="sourceCode" id="cb554"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb554-1"><a href="exercise-solutions.html#cb554-1" tabindex="-1"></a><span class="co"># Option 2:</span></span>
-<span id="cb554-2"><a href="exercise-solutions.html#cb554-2" tabindex="-1"></a>femgd_option2 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb554-3"><a href="exercise-solutions.html#cb554-3" tabindex="-1"></a>  <span class="fu">filter</span>(Gender <span class="sc">==</span> <span class="st">&quot;Female&quot;</span>, Education <span class="sc">==</span> <span class="st">&quot;Graduate&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb554-4"><a href="exercise-solutions.html#cb554-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">N =</span> <span class="fu">survey_total</span>(), <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
-<span id="cb554-5"><a href="exercise-solutions.html#cb554-5" tabindex="-1"></a></span>
-<span id="cb554-6"><a href="exercise-solutions.html#cb554-6" tabindex="-1"></a>femgd_option2</span></code></pre></div>
+<div class="sourceCode" id="cb552"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb552-1"><a href="exercise-solutions.html#cb552-1" tabindex="-1"></a><span class="co"># Option 2:</span></span>
+<span id="cb552-2"><a href="exercise-solutions.html#cb552-2" tabindex="-1"></a>femgd_option2 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb552-3"><a href="exercise-solutions.html#cb552-3" tabindex="-1"></a>  <span class="fu">filter</span>(Gender <span class="sc">==</span> <span class="st">&quot;Female&quot;</span>, Education <span class="sc">==</span> <span class="st">&quot;Graduate&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb552-4"><a href="exercise-solutions.html#cb552-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">N =</span> <span class="fu">survey_total</span>(), <span class="at">.groups =</span> <span class="st">&quot;drop&quot;</span>)</span>
+<span id="cb552-5"><a href="exercise-solutions.html#cb552-5" tabindex="-1"></a></span>
+<span id="cb552-6"><a href="exercise-solutions.html#cb552-6" tabindex="-1"></a>femgd_option2</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##           N    N_se
 ##       &lt;dbl&gt;   &lt;dbl&gt;
@@ -772,12 +773,12 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="2" style="list-style-type: decimal">
 <li>What percentage of people identify as “Strong Democrat”? Hint: The variable <code>PartyID</code> indicates someone’s party affiliation.</li>
 </ol>
-<div class="sourceCode" id="cb556"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb556-1"><a href="exercise-solutions.html#cb556-1" tabindex="-1"></a>psd <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb556-2"><a href="exercise-solutions.html#cb556-2" tabindex="-1"></a>  <span class="fu">group_by</span>(PartyID) <span class="sc">%&gt;%</span></span>
-<span id="cb556-3"><a href="exercise-solutions.html#cb556-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>()) <span class="sc">%&gt;%</span></span>
-<span id="cb556-4"><a href="exercise-solutions.html#cb556-4" tabindex="-1"></a>  <span class="fu">filter</span>(PartyID <span class="sc">==</span> <span class="st">&quot;Strong democrat&quot;</span>)</span>
-<span id="cb556-5"><a href="exercise-solutions.html#cb556-5" tabindex="-1"></a></span>
-<span id="cb556-6"><a href="exercise-solutions.html#cb556-6" tabindex="-1"></a>psd</span></code></pre></div>
+<div class="sourceCode" id="cb554"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb554-1"><a href="exercise-solutions.html#cb554-1" tabindex="-1"></a>psd <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb554-2"><a href="exercise-solutions.html#cb554-2" tabindex="-1"></a>  <span class="fu">group_by</span>(PartyID) <span class="sc">%&gt;%</span></span>
+<span id="cb554-3"><a href="exercise-solutions.html#cb554-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>()) <span class="sc">%&gt;%</span></span>
+<span id="cb554-4"><a href="exercise-solutions.html#cb554-4" tabindex="-1"></a>  <span class="fu">filter</span>(PartyID <span class="sc">==</span> <span class="st">&quot;Strong democrat&quot;</span>)</span>
+<span id="cb554-5"><a href="exercise-solutions.html#cb554-5" tabindex="-1"></a></span>
+<span id="cb554-6"><a href="exercise-solutions.html#cb554-6" tabindex="-1"></a>psd</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 3
 ##   PartyID             p    p_se
 ##   &lt;fct&gt;           &lt;dbl&gt;   &lt;dbl&gt;
@@ -786,13 +787,13 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="3" style="list-style-type: decimal">
 <li>What percentage of people who voted in the 2020 election identify as “Strong Republican”? Hint: The variable <code>VotedPres2020</code> indicates whether someone voted in 2020.</li>
 </ol>
-<div class="sourceCode" id="cb558"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb558-1"><a href="exercise-solutions.html#cb558-1" tabindex="-1"></a>psr <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb558-2"><a href="exercise-solutions.html#cb558-2" tabindex="-1"></a>  <span class="fu">filter</span>(VotedPres2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb558-3"><a href="exercise-solutions.html#cb558-3" tabindex="-1"></a>  <span class="fu">group_by</span>(PartyID) <span class="sc">%&gt;%</span></span>
-<span id="cb558-4"><a href="exercise-solutions.html#cb558-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>()) <span class="sc">%&gt;%</span></span>
-<span id="cb558-5"><a href="exercise-solutions.html#cb558-5" tabindex="-1"></a>  <span class="fu">filter</span>(PartyID <span class="sc">==</span> <span class="st">&quot;Strong republican&quot;</span>)</span>
-<span id="cb558-6"><a href="exercise-solutions.html#cb558-6" tabindex="-1"></a></span>
-<span id="cb558-7"><a href="exercise-solutions.html#cb558-7" tabindex="-1"></a>psr</span></code></pre></div>
+<div class="sourceCode" id="cb556"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb556-1"><a href="exercise-solutions.html#cb556-1" tabindex="-1"></a>psr <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb556-2"><a href="exercise-solutions.html#cb556-2" tabindex="-1"></a>  <span class="fu">filter</span>(VotedPres2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb556-3"><a href="exercise-solutions.html#cb556-3" tabindex="-1"></a>  <span class="fu">group_by</span>(PartyID) <span class="sc">%&gt;%</span></span>
+<span id="cb556-4"><a href="exercise-solutions.html#cb556-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>()) <span class="sc">%&gt;%</span></span>
+<span id="cb556-5"><a href="exercise-solutions.html#cb556-5" tabindex="-1"></a>  <span class="fu">filter</span>(PartyID <span class="sc">==</span> <span class="st">&quot;Strong republican&quot;</span>)</span>
+<span id="cb556-6"><a href="exercise-solutions.html#cb556-6" tabindex="-1"></a></span>
+<span id="cb556-7"><a href="exercise-solutions.html#cb556-7" tabindex="-1"></a>psr</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 3
 ##   PartyID               p    p_se
 ##   &lt;fct&gt;             &lt;dbl&gt;   &lt;dbl&gt;
@@ -801,13 +802,13 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="4" style="list-style-type: decimal">
 <li>What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable <code>VotedPres2016</code> indicates whether someone voted in 2016.</li>
 </ol>
-<div class="sourceCode" id="cb560"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb560-1"><a href="exercise-solutions.html#cb560-1" tabindex="-1"></a>pvb <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb560-2"><a href="exercise-solutions.html#cb560-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2016),<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2020)) <span class="sc">%&gt;%</span></span>
-<span id="cb560-3"><a href="exercise-solutions.html#cb560-3" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(VotedPres2016, VotedPres2020)) <span class="sc">%&gt;%</span></span>
-<span id="cb560-4"><a href="exercise-solutions.html#cb560-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>(<span class="at">var =</span> <span class="st">&quot;ci&quot;</span>, <span class="at">method =</span> <span class="st">&quot;logit&quot;</span>),) <span class="sc">%&gt;%</span></span>
-<span id="cb560-5"><a href="exercise-solutions.html#cb560-5" tabindex="-1"></a>  <span class="fu">filter</span>(VotedPres2016 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>, VotedPres2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>)</span>
-<span id="cb560-6"><a href="exercise-solutions.html#cb560-6" tabindex="-1"></a></span>
-<span id="cb560-7"><a href="exercise-solutions.html#cb560-7" tabindex="-1"></a>pvb</span></code></pre></div>
+<div class="sourceCode" id="cb558"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb558-1"><a href="exercise-solutions.html#cb558-1" tabindex="-1"></a>pvb <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb558-2"><a href="exercise-solutions.html#cb558-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2016),<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2020)) <span class="sc">%&gt;%</span></span>
+<span id="cb558-3"><a href="exercise-solutions.html#cb558-3" tabindex="-1"></a>  <span class="fu">group_by</span>(<span class="fu">interact</span>(VotedPres2016, VotedPres2020)) <span class="sc">%&gt;%</span></span>
+<span id="cb558-4"><a href="exercise-solutions.html#cb558-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_prop</span>(<span class="at">vartype =</span> <span class="st">&quot;ci&quot;</span>, <span class="at">prop_method =</span> <span class="st">&quot;logit&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb558-5"><a href="exercise-solutions.html#cb558-5" tabindex="-1"></a>  <span class="fu">filter</span>(VotedPres2016 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>, VotedPres2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>)</span>
+<span id="cb558-6"><a href="exercise-solutions.html#cb558-6" tabindex="-1"></a></span>
+<span id="cb558-7"><a href="exercise-solutions.html#cb558-7" tabindex="-1"></a>pvb</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 5
 ##   VotedPres2016 VotedPres2020     p p_low p_upp
 ##   &lt;fct&gt;         &lt;fct&gt;         &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
@@ -816,13 +817,13 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="5" style="list-style-type: decimal">
 <li>What is the design effect for the proportion of people who voted early? Hint: The variable <code>EarlyVote2020</code> indicates whether someone voted early in 2020.</li>
 </ol>
-<div class="sourceCode" id="cb562"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb562-1"><a href="exercise-solutions.html#cb562-1" tabindex="-1"></a>pdeff <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb562-2"><a href="exercise-solutions.html#cb562-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(EarlyVote2020)) <span class="sc">%&gt;%</span></span>
-<span id="cb562-3"><a href="exercise-solutions.html#cb562-3" tabindex="-1"></a>  <span class="fu">group_by</span>(EarlyVote2020) <span class="sc">%&gt;%</span></span>
-<span id="cb562-4"><a href="exercise-solutions.html#cb562-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>(<span class="at">deff =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb562-5"><a href="exercise-solutions.html#cb562-5" tabindex="-1"></a>  <span class="fu">filter</span>(EarlyVote2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>)</span>
-<span id="cb562-6"><a href="exercise-solutions.html#cb562-6" tabindex="-1"></a></span>
-<span id="cb562-7"><a href="exercise-solutions.html#cb562-7" tabindex="-1"></a>pdeff</span></code></pre></div>
+<div class="sourceCode" id="cb560"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb560-1"><a href="exercise-solutions.html#cb560-1" tabindex="-1"></a>pdeff <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb560-2"><a href="exercise-solutions.html#cb560-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(EarlyVote2020)) <span class="sc">%&gt;%</span></span>
+<span id="cb560-3"><a href="exercise-solutions.html#cb560-3" tabindex="-1"></a>  <span class="fu">group_by</span>(EarlyVote2020) <span class="sc">%&gt;%</span></span>
+<span id="cb560-4"><a href="exercise-solutions.html#cb560-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">p =</span> <span class="fu">survey_mean</span>(<span class="at">deff =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb560-5"><a href="exercise-solutions.html#cb560-5" tabindex="-1"></a>  <span class="fu">filter</span>(EarlyVote2020 <span class="sc">==</span> <span class="st">&quot;Yes&quot;</span>)</span>
+<span id="cb560-6"><a href="exercise-solutions.html#cb560-6" tabindex="-1"></a></span>
+<span id="cb560-7"><a href="exercise-solutions.html#cb560-7" tabindex="-1"></a>pdeff</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 4
 ##   EarlyVote2020     p   p_se p_deff
 ##   &lt;fct&gt;         &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;
@@ -831,11 +832,11 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="6" style="list-style-type: decimal">
 <li>What is the median temperature people set their thermostats to at night during the winter? Hint: The variable <code>WinterTempNight</code> indicates the temperature people set their thermostat to at night in the winter.</li>
 </ol>
-<div class="sourceCode" id="cb564"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb564-1"><a href="exercise-solutions.html#cb564-1" tabindex="-1"></a>med_wintertempnight <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb564-2"><a href="exercise-solutions.html#cb564-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">wtn_med =</span> <span class="fu">survey_median</span>(<span class="at">x =</span> WinterTempNight,</span>
-<span id="cb564-3"><a href="exercise-solutions.html#cb564-3" tabindex="-1"></a>                                   <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
-<span id="cb564-4"><a href="exercise-solutions.html#cb564-4" tabindex="-1"></a></span>
-<span id="cb564-5"><a href="exercise-solutions.html#cb564-5" tabindex="-1"></a>med_wintertempnight</span></code></pre></div>
+<div class="sourceCode" id="cb562"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb562-1"><a href="exercise-solutions.html#cb562-1" tabindex="-1"></a>med_wintertempnight <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb562-2"><a href="exercise-solutions.html#cb562-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">wtn_med =</span> <span class="fu">survey_median</span>(<span class="at">x =</span> WinterTempNight,</span>
+<span id="cb562-3"><a href="exercise-solutions.html#cb562-3" tabindex="-1"></a>                                   <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
+<span id="cb562-4"><a href="exercise-solutions.html#cb562-4" tabindex="-1"></a></span>
+<span id="cb562-5"><a href="exercise-solutions.html#cb562-5" tabindex="-1"></a>med_wintertempnight</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   wtn_med wtn_med_se
 ##     &lt;dbl&gt;      &lt;dbl&gt;
@@ -844,28 +845,28 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="7" style="list-style-type: decimal">
 <li>People sometimes set their thermostats differently across seasons and times of day. What median temperatures do people set their thermostats to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables <code>WinterTempDay</code>, <code>WinterTempNight</code>, <code>SummerTempDay</code>, and <code>SummerTempNight</code>.</li>
 </ol>
-<div class="sourceCode" id="cb566"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb566-1"><a href="exercise-solutions.html#cb566-1" tabindex="-1"></a><span class="co"># Option 1</span></span>
-<span id="cb566-2"><a href="exercise-solutions.html#cb566-2" tabindex="-1"></a></span>
-<span id="cb566-3"><a href="exercise-solutions.html#cb566-3" tabindex="-1"></a>med_temps <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb566-4"><a href="exercise-solutions.html#cb566-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb566-5"><a href="exercise-solutions.html#cb566-5" tabindex="-1"></a>    <span class="fu">across</span>(<span class="fu">c</span>(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), <span class="sc">~</span><span class="fu">survey_median</span>(.x, <span class="at">na.rm=</span><span class="cn">TRUE</span>))</span>
-<span id="cb566-6"><a href="exercise-solutions.html#cb566-6" tabindex="-1"></a>  )</span>
-<span id="cb566-7"><a href="exercise-solutions.html#cb566-7" tabindex="-1"></a></span>
-<span id="cb566-8"><a href="exercise-solutions.html#cb566-8" tabindex="-1"></a>med_temps</span></code></pre></div>
+<div class="sourceCode" id="cb564"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb564-1"><a href="exercise-solutions.html#cb564-1" tabindex="-1"></a><span class="co"># Option 1</span></span>
+<span id="cb564-2"><a href="exercise-solutions.html#cb564-2" tabindex="-1"></a></span>
+<span id="cb564-3"><a href="exercise-solutions.html#cb564-3" tabindex="-1"></a>med_temps <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb564-4"><a href="exercise-solutions.html#cb564-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb564-5"><a href="exercise-solutions.html#cb564-5" tabindex="-1"></a>    <span class="fu">across</span>(<span class="fu">c</span>(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), <span class="sc">~</span><span class="fu">survey_median</span>(.x, <span class="at">na.rm=</span><span class="cn">TRUE</span>))</span>
+<span id="cb564-6"><a href="exercise-solutions.html#cb564-6" tabindex="-1"></a>  )</span>
+<span id="cb564-7"><a href="exercise-solutions.html#cb564-7" tabindex="-1"></a></span>
+<span id="cb564-8"><a href="exercise-solutions.html#cb564-8" tabindex="-1"></a>med_temps</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   WinterTempDay WinterTempDay_se WinterTempNight WinterTempNight_se
 ##           &lt;dbl&gt;            &lt;dbl&gt;           &lt;dbl&gt;              &lt;dbl&gt;
 ## 1            70            0.250              68              0.250
 ## # ℹ 4 more variables: SummerTempDay &lt;dbl&gt;, SummerTempDay_se &lt;dbl&gt;,
 ## #   SummerTempNight &lt;dbl&gt;, SummerTempNight_se &lt;dbl&gt;</code></pre>
-<div class="sourceCode" id="cb568"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb568-1"><a href="exercise-solutions.html#cb568-1" tabindex="-1"></a><span class="co"># Alternatively, could use `survey_quantile()` as shown below for WinterTempNight:</span></span>
-<span id="cb568-2"><a href="exercise-solutions.html#cb568-2" tabindex="-1"></a></span>
-<span id="cb568-3"><a href="exercise-solutions.html#cb568-3" tabindex="-1"></a>quant_temps <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb568-4"><a href="exercise-solutions.html#cb568-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb568-5"><a href="exercise-solutions.html#cb568-5" tabindex="-1"></a>    <span class="fu">across</span>(<span class="fu">c</span>(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), <span class="sc">~</span><span class="fu">survey_quantile</span>(.x, <span class="at">quantiles=</span><span class="fl">0.5</span>, <span class="at">na.rm=</span><span class="cn">TRUE</span>))</span>
-<span id="cb568-6"><a href="exercise-solutions.html#cb568-6" tabindex="-1"></a>  )</span>
-<span id="cb568-7"><a href="exercise-solutions.html#cb568-7" tabindex="-1"></a></span>
-<span id="cb568-8"><a href="exercise-solutions.html#cb568-8" tabindex="-1"></a>quant_temps</span></code></pre></div>
+<div class="sourceCode" id="cb566"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb566-1"><a href="exercise-solutions.html#cb566-1" tabindex="-1"></a><span class="co"># Option 2: alternatively, use `survey_quantile()` with `quantiles = 0.5`, as shown below:</span></span>
+<span id="cb566-2"><a href="exercise-solutions.html#cb566-2" tabindex="-1"></a></span>
+<span id="cb566-3"><a href="exercise-solutions.html#cb566-3" tabindex="-1"></a>quant_temps <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb566-4"><a href="exercise-solutions.html#cb566-4" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb566-5"><a href="exercise-solutions.html#cb566-5" tabindex="-1"></a>    <span class="fu">across</span>(<span class="fu">c</span>(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), <span class="sc">~</span><span class="fu">survey_quantile</span>(.x, <span class="at">quantiles=</span><span class="fl">0.5</span>, <span class="at">na.rm=</span><span class="cn">TRUE</span>))</span>
+<span id="cb566-6"><a href="exercise-solutions.html#cb566-6" tabindex="-1"></a>  )</span>
+<span id="cb566-7"><a href="exercise-solutions.html#cb566-7" tabindex="-1"></a></span>
+<span id="cb566-8"><a href="exercise-solutions.html#cb566-8" tabindex="-1"></a>quant_temps</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   WinterTempDay_q50 WinterTempDay_q50_se WinterTempNight_q50
 ##               &lt;dbl&gt;                &lt;dbl&gt;               &lt;dbl&gt;
@@ -881,11 +882,11 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="8" style="list-style-type: decimal">
 <li>What is the correlation between the temperatures people set their thermostats to at night and during the day in the summer?</li>
 </ol>
-<div class="sourceCode" id="cb570"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb570-1"><a href="exercise-solutions.html#cb570-1" tabindex="-1"></a>corr_summer_temp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb570-2"><a href="exercise-solutions.html#cb570-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">summer_corr =</span> <span class="fu">survey_corr</span>(SummerTempNight, SummerTempDay,</span>
-<span id="cb570-3"><a href="exercise-solutions.html#cb570-3" tabindex="-1"></a>                                      <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
-<span id="cb570-4"><a href="exercise-solutions.html#cb570-4" tabindex="-1"></a></span>
-<span id="cb570-5"><a href="exercise-solutions.html#cb570-5" tabindex="-1"></a>corr_summer_temp</span></code></pre></div>
+<div class="sourceCode" id="cb568"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb568-1"><a href="exercise-solutions.html#cb568-1" tabindex="-1"></a>corr_summer_temp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb568-2"><a href="exercise-solutions.html#cb568-2" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">summer_corr =</span> <span class="fu">survey_corr</span>(SummerTempNight, SummerTempDay,</span>
+<span id="cb568-3"><a href="exercise-solutions.html#cb568-3" tabindex="-1"></a>                                      <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
+<span id="cb568-4"><a href="exercise-solutions.html#cb568-4" tabindex="-1"></a></span>
+<span id="cb568-5"><a href="exercise-solutions.html#cb568-5" tabindex="-1"></a>corr_summer_temp</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 2
 ##   summer_corr summer_corr_se
 ##         &lt;dbl&gt;          &lt;dbl&gt;
@@ -894,16 +895,16 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <ol start="9" style="list-style-type: decimal">
 <li>What are the 1st, 2nd, and 3rd quartiles of the amount of money spent on energy by Building America (BA) climate zone? Hint: <code>TOTALDOL</code> indicates the total amount spent on all fuel, and <code>ClimateRegion_BA</code> indicates the BA climate zones.</li>
 </ol>
-<div class="sourceCode" id="cb572"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb572-1"><a href="exercise-solutions.html#cb572-1" tabindex="-1"></a>quant_baenergyexp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb572-2"><a href="exercise-solutions.html#cb572-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ClimateRegion_BA) <span class="sc">%&gt;%</span></span>
-<span id="cb572-3"><a href="exercise-solutions.html#cb572-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">dol_quant =</span> <span class="fu">survey_quantile</span>(</span>
-<span id="cb572-4"><a href="exercise-solutions.html#cb572-4" tabindex="-1"></a>    TOTALDOL,</span>
-<span id="cb572-5"><a href="exercise-solutions.html#cb572-5" tabindex="-1"></a>    <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>),</span>
-<span id="cb572-6"><a href="exercise-solutions.html#cb572-6" tabindex="-1"></a>    <span class="at">vartype =</span> <span class="st">&quot;se&quot;</span>,</span>
-<span id="cb572-7"><a href="exercise-solutions.html#cb572-7" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb572-8"><a href="exercise-solutions.html#cb572-8" tabindex="-1"></a>  ))</span>
-<span id="cb572-9"><a href="exercise-solutions.html#cb572-9" tabindex="-1"></a></span>
-<span id="cb572-10"><a href="exercise-solutions.html#cb572-10" tabindex="-1"></a>quant_baenergyexp</span></code></pre></div>
+<div class="sourceCode" id="cb570"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb570-1"><a href="exercise-solutions.html#cb570-1" tabindex="-1"></a>quant_baenergyexp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb570-2"><a href="exercise-solutions.html#cb570-2" tabindex="-1"></a>  <span class="fu">group_by</span>(ClimateRegion_BA) <span class="sc">%&gt;%</span></span>
+<span id="cb570-3"><a href="exercise-solutions.html#cb570-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">dol_quant =</span> <span class="fu">survey_quantile</span>(</span>
+<span id="cb570-4"><a href="exercise-solutions.html#cb570-4" tabindex="-1"></a>    TOTALDOL,</span>
+<span id="cb570-5"><a href="exercise-solutions.html#cb570-5" tabindex="-1"></a>    <span class="at">quantiles =</span> <span class="fu">c</span>(<span class="fl">0.25</span>, <span class="fl">0.5</span>, <span class="fl">0.75</span>),</span>
+<span id="cb570-6"><a href="exercise-solutions.html#cb570-6" tabindex="-1"></a>    <span class="at">vartype =</span> <span class="st">&quot;se&quot;</span>,</span>
+<span id="cb570-7"><a href="exercise-solutions.html#cb570-7" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb570-8"><a href="exercise-solutions.html#cb570-8" tabindex="-1"></a>  ))</span>
+<span id="cb570-9"><a href="exercise-solutions.html#cb570-9" tabindex="-1"></a></span>
+<span id="cb570-10"><a href="exercise-solutions.html#cb570-10" tabindex="-1"></a>quant_baenergyexp</span></code></pre></div>
 <pre><code>## # A tibble: 8 × 7
 ##   ClimateRegion_BA dol_quant_q25 dol_quant_q50 dol_quant_q75
 ##   &lt;fct&gt;                    &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;
@@ -918,23 +919,23 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 ## # ℹ 3 more variables: dol_quant_q25_se &lt;dbl&gt;, dol_quant_q50_se &lt;dbl&gt;,
 ## #   dol_quant_q75_se &lt;dbl&gt;</code></pre>
 <p>Answer:</p>
-<div id="hrwokkyhya" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#hrwokkyhya table {
+<div id="rwvnyhqfyu" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#rwvnyhqfyu table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#hrwokkyhya thead, #hrwokkyhya tbody, #hrwokkyhya tfoot, #hrwokkyhya tr, #hrwokkyhya td, #hrwokkyhya th {
+#rwvnyhqfyu thead, #rwvnyhqfyu tbody, #rwvnyhqfyu tfoot, #rwvnyhqfyu tr, #rwvnyhqfyu td, #rwvnyhqfyu th {
   border-style: none;
 }
 
-#hrwokkyhya p {
+#rwvnyhqfyu p {
   margin: 0;
   padding: 0;
 }
 
-#hrwokkyhya .gt_table {
+#rwvnyhqfyu .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -960,12 +961,12 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-left-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_caption {
+#rwvnyhqfyu .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#hrwokkyhya .gt_title {
+#rwvnyhqfyu .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -977,7 +978,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-bottom-width: 0;
 }
 
-#hrwokkyhya .gt_subtitle {
+#rwvnyhqfyu .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -989,7 +990,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-top-width: 0;
 }
 
-#hrwokkyhya .gt_heading {
+#rwvnyhqfyu .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1001,13 +1002,13 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-right-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_bottom_border {
+#rwvnyhqfyu .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_col_headings {
+#rwvnyhqfyu .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1022,7 +1023,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-right-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_col_heading {
+#rwvnyhqfyu .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1042,7 +1043,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   overflow-x: hidden;
 }
 
-#hrwokkyhya .gt_column_spanner_outer {
+#rwvnyhqfyu .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1054,15 +1055,15 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 4px;
 }
 
-#hrwokkyhya .gt_column_spanner_outer:first-child {
+#rwvnyhqfyu .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#hrwokkyhya .gt_column_spanner_outer:last-child {
+#rwvnyhqfyu .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#hrwokkyhya .gt_column_spanner {
+#rwvnyhqfyu .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1074,11 +1075,11 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   width: 100%;
 }
 
-#hrwokkyhya .gt_spanner_row {
+#rwvnyhqfyu .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#hrwokkyhya .gt_group_heading {
+#rwvnyhqfyu .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1104,7 +1105,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   text-align: left;
 }
 
-#hrwokkyhya .gt_empty_group_heading {
+#rwvnyhqfyu .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1119,15 +1120,15 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   vertical-align: middle;
 }
 
-#hrwokkyhya .gt_from_md > :first-child {
+#rwvnyhqfyu .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#hrwokkyhya .gt_from_md > :last-child {
+#rwvnyhqfyu .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#hrwokkyhya .gt_row {
+#rwvnyhqfyu .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1146,7 +1147,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   overflow-x: hidden;
 }
 
-#hrwokkyhya .gt_stub {
+#rwvnyhqfyu .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1159,7 +1160,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 5px;
 }
 
-#hrwokkyhya .gt_stub_row_group {
+#rwvnyhqfyu .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1173,15 +1174,15 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   vertical-align: top;
 }
 
-#hrwokkyhya .gt_row_group_first td {
+#rwvnyhqfyu .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#hrwokkyhya .gt_row_group_first th {
+#rwvnyhqfyu .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#hrwokkyhya .gt_summary_row {
+#rwvnyhqfyu .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1191,16 +1192,16 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 5px;
 }
 
-#hrwokkyhya .gt_first_summary_row {
+#rwvnyhqfyu .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_first_summary_row.thick {
+#rwvnyhqfyu .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#hrwokkyhya .gt_last_summary_row {
+#rwvnyhqfyu .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1210,7 +1211,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-bottom-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_grand_summary_row {
+#rwvnyhqfyu .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -1220,7 +1221,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 5px;
 }
 
-#hrwokkyhya .gt_first_grand_summary_row {
+#rwvnyhqfyu .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1230,7 +1231,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-top-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_last_grand_summary_row_top {
+#rwvnyhqfyu .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1240,11 +1241,11 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-bottom-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_striped {
+#rwvnyhqfyu .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#hrwokkyhya .gt_table_body {
+#rwvnyhqfyu .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1253,7 +1254,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-bottom-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_footnotes {
+#rwvnyhqfyu .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1267,7 +1268,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-right-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_footnote {
+#rwvnyhqfyu .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -1276,7 +1277,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 5px;
 }
 
-#hrwokkyhya .gt_sourcenotes {
+#rwvnyhqfyu .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -1290,7 +1291,7 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   border-right-color: #D3D3D3;
 }
 
-#hrwokkyhya .gt_sourcenote {
+#rwvnyhqfyu .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -1298,63 +1299,63 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
   padding-right: 5px;
 }
 
-#hrwokkyhya .gt_left {
+#rwvnyhqfyu .gt_left {
   text-align: left;
 }
 
-#hrwokkyhya .gt_center {
+#rwvnyhqfyu .gt_center {
   text-align: center;
 }
 
-#hrwokkyhya .gt_right {
+#rwvnyhqfyu .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#hrwokkyhya .gt_font_normal {
+#rwvnyhqfyu .gt_font_normal {
   font-weight: normal;
 }
 
-#hrwokkyhya .gt_font_bold {
+#rwvnyhqfyu .gt_font_bold {
   font-weight: bold;
 }
 
-#hrwokkyhya .gt_font_italic {
+#rwvnyhqfyu .gt_font_italic {
   font-style: italic;
 }
 
-#hrwokkyhya .gt_super {
+#rwvnyhqfyu .gt_super {
   font-size: 65%;
 }
 
-#hrwokkyhya .gt_footnote_marks {
+#rwvnyhqfyu .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#hrwokkyhya .gt_asterisk {
+#rwvnyhqfyu .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#hrwokkyhya .gt_indent_1 {
+#rwvnyhqfyu .gt_indent_1 {
   text-indent: 5px;
 }
 
-#hrwokkyhya .gt_indent_2 {
+#rwvnyhqfyu .gt_indent_2 {
   text-indent: 10px;
 }
 
-#hrwokkyhya .gt_indent_3 {
+#rwvnyhqfyu .gt_indent_3 {
   text-indent: 15px;
 }
 
-#hrwokkyhya .gt_indent_4 {
+#rwvnyhqfyu .gt_indent_4 {
   text-indent: 20px;
 }
 
-#hrwokkyhya .gt_indent_5 {
+#rwvnyhqfyu .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -1413,16 +1414,16 @@ <h2>5 - Descriptive analysis<a href="exercise-solutions.html#descriptive-analysi
 <div id="statistical-testing-1" class="section level2 unnumbered hasAnchor">
 <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-1" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li>Using the RECS data, do more than 50% of U.S. households use AC (<code>ACUsed</code>)?</li>
+<li>Using the RECS data, do more than 50% of U.S. households use A/C (<code>ACUsed</code>)?</li>
 </ol>
-<div class="sourceCode" id="cb574"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb574-1"><a href="exercise-solutions.html#cb574-1" tabindex="-1"></a>ttest_solution1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb574-2"><a href="exercise-solutions.html#cb574-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(<span class="at">design =</span> .,</span>
-<span id="cb574-3"><a href="exercise-solutions.html#cb574-3" tabindex="-1"></a>           <span class="at">formula =</span> ((ACUsed <span class="sc">==</span> <span class="cn">TRUE</span>) <span class="sc">-</span> <span class="fl">0.5</span>) <span class="sc">~</span> <span class="dv">0</span>,</span>
-<span id="cb574-4"><a href="exercise-solutions.html#cb574-4" tabindex="-1"></a>           <span class="at">na.rm =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb574-5"><a href="exercise-solutions.html#cb574-5" tabindex="-1"></a>           <span class="at">alternative=</span><span class="st">&quot;greater&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb574-6"><a href="exercise-solutions.html#cb574-6" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
-<span id="cb574-7"><a href="exercise-solutions.html#cb574-7" tabindex="-1"></a></span>
-<span id="cb574-8"><a href="exercise-solutions.html#cb574-8" tabindex="-1"></a>ttest_solution1</span></code></pre></div>
+<div class="sourceCode" id="cb572"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb572-1"><a href="exercise-solutions.html#cb572-1" tabindex="-1"></a>ttest_solution1 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb572-2"><a href="exercise-solutions.html#cb572-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(<span class="at">design =</span> .,</span>
+<span id="cb572-3"><a href="exercise-solutions.html#cb572-3" tabindex="-1"></a>           <span class="at">formula =</span> ((ACUsed <span class="sc">==</span> <span class="cn">TRUE</span>) <span class="sc">-</span> <span class="fl">0.5</span>) <span class="sc">~</span> <span class="dv">0</span>,</span>
+<span id="cb572-4"><a href="exercise-solutions.html#cb572-4" tabindex="-1"></a>           <span class="at">na.rm =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb572-5"><a href="exercise-solutions.html#cb572-5" tabindex="-1"></a>           <span class="at">alternative=</span><span class="st">&quot;greater&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb572-6"><a href="exercise-solutions.html#cb572-6" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
+<span id="cb572-7"><a href="exercise-solutions.html#cb572-7" tabindex="-1"></a></span>
+<span id="cb572-8"><a href="exercise-solutions.html#cb572-8" tabindex="-1"></a>ttest_solution1</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   estimate statistic  p.value parameter conf.low conf.high method       
 ##      &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;        
@@ -1432,15 +1433,15 @@ <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-
 <ol start="2" style="list-style-type: decimal">
 <li>Using the RECS data, does the average temperature that U.S. households set their thermostats to differ between the day and night in the winter (<code>WinterTempDay</code> and <code>WinterTempNight</code>)?</li>
 </ol>
-<div class="sourceCode" id="cb576"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb576-1"><a href="exercise-solutions.html#cb576-1" tabindex="-1"></a>ttest_solution2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb576-2"><a href="exercise-solutions.html#cb576-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
-<span id="cb576-3"><a href="exercise-solutions.html#cb576-3" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb576-4"><a href="exercise-solutions.html#cb576-4" tabindex="-1"></a>    <span class="at">formula =</span> WinterTempDay <span class="sc">-</span> WinterTempNight <span class="sc">~</span> <span class="dv">0</span>,</span>
-<span id="cb576-5"><a href="exercise-solutions.html#cb576-5" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb576-6"><a href="exercise-solutions.html#cb576-6" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb576-7"><a href="exercise-solutions.html#cb576-7" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
-<span id="cb576-8"><a href="exercise-solutions.html#cb576-8" tabindex="-1"></a></span>
-<span id="cb576-9"><a href="exercise-solutions.html#cb576-9" tabindex="-1"></a>ttest_solution2</span></code></pre></div>
+<div class="sourceCode" id="cb574"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb574-1"><a href="exercise-solutions.html#cb574-1" tabindex="-1"></a>ttest_solution2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb574-2"><a href="exercise-solutions.html#cb574-2" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
+<span id="cb574-3"><a href="exercise-solutions.html#cb574-3" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb574-4"><a href="exercise-solutions.html#cb574-4" tabindex="-1"></a>    <span class="at">formula =</span> WinterTempDay <span class="sc">-</span> WinterTempNight <span class="sc">~</span> <span class="dv">0</span>,</span>
+<span id="cb574-5"><a href="exercise-solutions.html#cb574-5" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb574-6"><a href="exercise-solutions.html#cb574-6" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb574-7"><a href="exercise-solutions.html#cb574-7" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
+<span id="cb574-8"><a href="exercise-solutions.html#cb574-8" tabindex="-1"></a></span>
+<span id="cb574-9"><a href="exercise-solutions.html#cb574-9" tabindex="-1"></a>ttest_solution2</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   estimate statistic  p.value parameter conf.low conf.high method       
 ##      &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;        
@@ -1450,16 +1451,16 @@ <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-
 <ol start="3" style="list-style-type: decimal">
 <li>Using the ANES data, does the average age (<code>Age</code>) of those who voted for Joseph Biden in 2020 (<code>VotedPres2020_selection</code>) differ from those who voted for another candidate?</li>
 </ol>
-<div class="sourceCode" id="cb578"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb578-1"><a href="exercise-solutions.html#cb578-1" tabindex="-1"></a>ttest_solution3 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb578-2"><a href="exercise-solutions.html#cb578-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2020_selection)) <span class="sc">%&gt;%</span></span>
-<span id="cb578-3"><a href="exercise-solutions.html#cb578-3" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
-<span id="cb578-4"><a href="exercise-solutions.html#cb578-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb578-5"><a href="exercise-solutions.html#cb578-5" tabindex="-1"></a>    <span class="at">formula =</span> Age <span class="sc">~</span> VotedPres2020_selection <span class="sc">==</span> <span class="st">&quot;Biden&quot;</span>,</span>
-<span id="cb578-6"><a href="exercise-solutions.html#cb578-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb578-7"><a href="exercise-solutions.html#cb578-7" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb578-8"><a href="exercise-solutions.html#cb578-8" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
-<span id="cb578-9"><a href="exercise-solutions.html#cb578-9" tabindex="-1"></a></span>
-<span id="cb578-10"><a href="exercise-solutions.html#cb578-10" tabindex="-1"></a>ttest_solution3</span></code></pre></div>
+<div class="sourceCode" id="cb576"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb576-1"><a href="exercise-solutions.html#cb576-1" tabindex="-1"></a>ttest_solution3 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb576-2"><a href="exercise-solutions.html#cb576-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(VotedPres2020_selection)) <span class="sc">%&gt;%</span></span>
+<span id="cb576-3"><a href="exercise-solutions.html#cb576-3" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
+<span id="cb576-4"><a href="exercise-solutions.html#cb576-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb576-5"><a href="exercise-solutions.html#cb576-5" tabindex="-1"></a>    <span class="at">formula =</span> Age <span class="sc">~</span> VotedPres2020_selection <span class="sc">==</span> <span class="st">&quot;Biden&quot;</span>,</span>
+<span id="cb576-6"><a href="exercise-solutions.html#cb576-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb576-7"><a href="exercise-solutions.html#cb576-7" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb576-8"><a href="exercise-solutions.html#cb576-8" tabindex="-1"></a>  <span class="fu">tidy</span>()</span>
+<span id="cb576-9"><a href="exercise-solutions.html#cb576-9" tabindex="-1"></a></span>
+<span id="cb576-10"><a href="exercise-solutions.html#cb576-10" tabindex="-1"></a>ttest_solution3</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 8
 ##   estimate statistic     p.value parameter conf.low conf.high method    
 ##      &lt;dbl&gt;     &lt;dbl&gt;       &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;     
@@ -1467,7 +1468,7 @@ <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-
 ## # ℹ 1 more variable: alternative &lt;chr&gt;</code></pre>
 <p>On average, those who voted for Joseph Biden in 2020 were 3.6 years younger than voters for other candidates, and this difference is statistically significant (p &lt; 0.0001).</p>
 <ol start="4" style="list-style-type: decimal">
-<li>If you wanted to determine if the political party affiliation differed for males and females, what test would you use?</li>
+<li>If we wanted to determine if the political party affiliation differed for males and females, what test would we use?</li>
 </ol>
 <ol style="list-style-type: lower-alpha">
 <li>Goodness of fit test (<code>svygofchisq()</code>)</li>
@@ -1478,15 +1479,15 @@ <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-
 <ol start="5" style="list-style-type: decimal">
 <li>In the RECS data, is there a relationship between the type of housing unit (<code>HousingUnitType</code>) and the year the house was built (<code>YearMade</code>)?</li>
 </ol>
-<div class="sourceCode" id="cb580"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb580-1"><a href="exercise-solutions.html#cb580-1" tabindex="-1"></a>chisq_solution2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb580-2"><a href="exercise-solutions.html#cb580-2" tabindex="-1"></a>  <span class="fu">svychisq</span>(</span>
-<span id="cb580-3"><a href="exercise-solutions.html#cb580-3" tabindex="-1"></a>    <span class="at">formula =</span>  <span class="sc">~</span> HousingUnitType <span class="sc">+</span> YearMade,</span>
-<span id="cb580-4"><a href="exercise-solutions.html#cb580-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb580-5"><a href="exercise-solutions.html#cb580-5" tabindex="-1"></a>    <span class="at">statistic =</span> <span class="st">&quot;Wald&quot;</span>,</span>
-<span id="cb580-6"><a href="exercise-solutions.html#cb580-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb580-7"><a href="exercise-solutions.html#cb580-7" tabindex="-1"></a>  )</span>
-<span id="cb580-8"><a href="exercise-solutions.html#cb580-8" tabindex="-1"></a></span>
-<span id="cb580-9"><a href="exercise-solutions.html#cb580-9" tabindex="-1"></a>chisq_solution2 <span class="sc">%&gt;%</span> <span class="fu">tidy</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb578"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb578-1"><a href="exercise-solutions.html#cb578-1" tabindex="-1"></a>chisq_solution2 <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb578-2"><a href="exercise-solutions.html#cb578-2" tabindex="-1"></a>  <span class="fu">svychisq</span>(</span>
+<span id="cb578-3"><a href="exercise-solutions.html#cb578-3" tabindex="-1"></a>    <span class="at">formula =</span>  <span class="sc">~</span> HousingUnitType <span class="sc">+</span> YearMade,</span>
+<span id="cb578-4"><a href="exercise-solutions.html#cb578-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb578-5"><a href="exercise-solutions.html#cb578-5" tabindex="-1"></a>    <span class="at">statistic =</span> <span class="st">&quot;Wald&quot;</span>,</span>
+<span id="cb578-6"><a href="exercise-solutions.html#cb578-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb578-7"><a href="exercise-solutions.html#cb578-7" tabindex="-1"></a>  )</span>
+<span id="cb578-8"><a href="exercise-solutions.html#cb578-8" tabindex="-1"></a></span>
+<span id="cb578-9"><a href="exercise-solutions.html#cb578-9" tabindex="-1"></a>chisq_solution2 <span class="sc">%&gt;%</span> <span class="fu">tidy</span>()</span></code></pre></div>
 <pre><code>## Multiple parameters; naming those columns ndf, ddf</code></pre>
 <pre><code>## # A tibble: 1 × 5
 ##     ndf   ddf statistic  p.value method                               
@@ -1496,16 +1497,16 @@ <h2>6 - Statistical testing<a href="exercise-solutions.html#statistical-testing-
 <ol start="6" style="list-style-type: decimal">
 <li>In the ANES data, is there a difference in the distribution of gender (<code>Gender</code>) across early voting status in 2020 (<code>EarlyVote2020</code>)?</li>
 </ol>
-<div class="sourceCode" id="cb583"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb583-1"><a href="exercise-solutions.html#cb583-1" tabindex="-1"></a>chisq_solution3 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb583-2"><a href="exercise-solutions.html#cb583-2" tabindex="-1"></a>  <span class="fu">svychisq</span>(</span>
-<span id="cb583-3"><a href="exercise-solutions.html#cb583-3" tabindex="-1"></a>    <span class="at">formula =</span>  <span class="sc">~</span> Gender <span class="sc">+</span> EarlyVote2020,</span>
-<span id="cb583-4"><a href="exercise-solutions.html#cb583-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb583-5"><a href="exercise-solutions.html#cb583-5" tabindex="-1"></a>    <span class="at">statistic =</span> <span class="st">&quot;F&quot;</span>,</span>
-<span id="cb583-6"><a href="exercise-solutions.html#cb583-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb583-7"><a href="exercise-solutions.html#cb583-7" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb583-8"><a href="exercise-solutions.html#cb583-8" tabindex="-1"></a>  <span class="fu">tidy</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb581"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb581-1"><a href="exercise-solutions.html#cb581-1" tabindex="-1"></a>chisq_solution3 <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb581-2"><a href="exercise-solutions.html#cb581-2" tabindex="-1"></a>  <span class="fu">svychisq</span>(</span>
+<span id="cb581-3"><a href="exercise-solutions.html#cb581-3" tabindex="-1"></a>    <span class="at">formula =</span>  <span class="sc">~</span> Gender <span class="sc">+</span> EarlyVote2020,</span>
+<span id="cb581-4"><a href="exercise-solutions.html#cb581-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb581-5"><a href="exercise-solutions.html#cb581-5" tabindex="-1"></a>    <span class="at">statistic =</span> <span class="st">&quot;F&quot;</span>,</span>
+<span id="cb581-6"><a href="exercise-solutions.html#cb581-6" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb581-7"><a href="exercise-solutions.html#cb581-7" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb581-8"><a href="exercise-solutions.html#cb581-8" tabindex="-1"></a>  <span class="fu">tidy</span>()</span></code></pre></div>
 <pre><code>## Multiple parameters; naming those columns ndf, ddf</code></pre>
-<div class="sourceCode" id="cb585"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb585-1"><a href="exercise-solutions.html#cb585-1" tabindex="-1"></a>chisq_solution3</span></code></pre></div>
+<div class="sourceCode" id="cb583"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb583-1"><a href="exercise-solutions.html#cb583-1" tabindex="-1"></a>chisq_solution3</span></code></pre></div>
 <pre><code>## # A tibble: 1 × 5
 ##     ndf   ddf statistic p.value method                               
 ##   &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;   &lt;dbl&gt; &lt;chr&gt;                                
@@ -1517,13 +1518,13 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 <ol style="list-style-type: decimal">
 <li>The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (<code>HousingUnitType</code>) and total energy expenditure (<code>TOTALDOL</code>)? First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common.</li>
 </ol>
-<div class="sourceCode" id="cb587"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb587-1"><a href="exercise-solutions.html#cb587-1" tabindex="-1"></a>expense_by_hut <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb587-2"><a href="exercise-solutions.html#cb587-2" tabindex="-1"></a>  <span class="fu">group_by</span>(HousingUnitType) <span class="sc">%&gt;%</span></span>
-<span id="cb587-3"><a href="exercise-solutions.html#cb587-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Expense =</span> <span class="fu">survey_mean</span>(TOTALDOL, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
-<span id="cb587-4"><a href="exercise-solutions.html#cb587-4" tabindex="-1"></a>            <span class="at">HUs =</span> <span class="fu">survey_total</span>()) <span class="sc">%&gt;%</span></span>
-<span id="cb587-5"><a href="exercise-solutions.html#cb587-5" tabindex="-1"></a>  <span class="fu">arrange</span>(<span class="fu">desc</span>(HUs))</span>
-<span id="cb587-6"><a href="exercise-solutions.html#cb587-6" tabindex="-1"></a></span>
-<span id="cb587-7"><a href="exercise-solutions.html#cb587-7" tabindex="-1"></a>expense_by_hut</span></code></pre></div>
+<div class="sourceCode" id="cb585"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb585-1"><a href="exercise-solutions.html#cb585-1" tabindex="-1"></a>expense_by_hut <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb585-2"><a href="exercise-solutions.html#cb585-2" tabindex="-1"></a>  <span class="fu">group_by</span>(HousingUnitType) <span class="sc">%&gt;%</span></span>
+<span id="cb585-3"><a href="exercise-solutions.html#cb585-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Expense =</span> <span class="fu">survey_mean</span>(TOTALDOL, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
+<span id="cb585-4"><a href="exercise-solutions.html#cb585-4" tabindex="-1"></a>            <span class="at">HUs =</span> <span class="fu">survey_total</span>()) <span class="sc">%&gt;%</span></span>
+<span id="cb585-5"><a href="exercise-solutions.html#cb585-5" tabindex="-1"></a>  <span class="fu">arrange</span>(<span class="fu">desc</span>(HUs))</span>
+<span id="cb585-6"><a href="exercise-solutions.html#cb585-6" tabindex="-1"></a></span>
+<span id="cb585-7"><a href="exercise-solutions.html#cb585-7" tabindex="-1"></a>expense_by_hut</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 5
 ##   HousingUnitType            Expense Expense_se       HUs       HUs_se
 ##   &lt;fct&gt;                        &lt;dbl&gt;      &lt;dbl&gt;     &lt;dbl&gt;        &lt;dbl&gt;
@@ -1532,15 +1533,15 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 ## 3 Apartment: 2-4 Units         1407.      24.2   9341795. 0.119       
 ## 4 Single-family attached       1653.      22.3   7451177. 0.114       
 ## 5 Mobile home                  1773.      26.2   6832499. 0.0000000927</code></pre>
-<div class="sourceCode" id="cb589"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb589-1"><a href="exercise-solutions.html#cb589-1" tabindex="-1"></a>exp_unit_out <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb589-2"><a href="exercise-solutions.html#cb589-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">HousingUnitType =</span> <span class="fu">fct_infreq</span>(HousingUnitType, NWEIGHT)) <span class="sc">%&gt;%</span></span>
-<span id="cb589-3"><a href="exercise-solutions.html#cb589-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
-<span id="cb589-4"><a href="exercise-solutions.html#cb589-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb589-5"><a href="exercise-solutions.html#cb589-5" tabindex="-1"></a>    <span class="at">formula =</span> TOTALDOL <span class="sc">~</span> HousingUnitType,</span>
-<span id="cb589-6"><a href="exercise-solutions.html#cb589-6" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
-<span id="cb589-7"><a href="exercise-solutions.html#cb589-7" tabindex="-1"></a>  )</span>
-<span id="cb589-8"><a href="exercise-solutions.html#cb589-8" tabindex="-1"></a></span>
-<span id="cb589-9"><a href="exercise-solutions.html#cb589-9" tabindex="-1"></a><span class="fu">tidy</span>(exp_unit_out)</span></code></pre></div>
+<div class="sourceCode" id="cb587"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb587-1"><a href="exercise-solutions.html#cb587-1" tabindex="-1"></a>exp_unit_out <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb587-2"><a href="exercise-solutions.html#cb587-2" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">HousingUnitType =</span> <span class="fu">fct_infreq</span>(HousingUnitType, NWEIGHT)) <span class="sc">%&gt;%</span></span>
+<span id="cb587-3"><a href="exercise-solutions.html#cb587-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
+<span id="cb587-4"><a href="exercise-solutions.html#cb587-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb587-5"><a href="exercise-solutions.html#cb587-5" tabindex="-1"></a>    <span class="at">formula =</span> TOTALDOL <span class="sc">~</span> HousingUnitType,</span>
+<span id="cb587-6"><a href="exercise-solutions.html#cb587-6" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
+<span id="cb587-7"><a href="exercise-solutions.html#cb587-7" tabindex="-1"></a>  )</span>
+<span id="cb587-8"><a href="exercise-solutions.html#cb587-8" tabindex="-1"></a></span>
+<span id="cb587-9"><a href="exercise-solutions.html#cb587-9" tabindex="-1"></a><span class="fu">tidy</span>(exp_unit_out)</span></code></pre></div>
 <pre><code>## # A tibble: 5 × 5
 ##   term                             estimate std.error statistic  p.value
 ##   &lt;chr&gt;                               &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;
@@ -1551,17 +1552,17 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 ## 5 HousingUnitTypeMobile home          -431.     27.4      -15.7 5.36e-22</code></pre>
 <p>Answer: The reference level should be Single-family detached. All p-values are very small, indicating a significant relationship between housing unit type and total energy expenditure.</p>
 <ol start="2" style="list-style-type: decimal">
-<li>Does temperature play a role in electricity expenditure (<code>DOLLAREL</code>)? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0. While a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F<a href="#fn29" class="footnote-ref" id="fnref29"><sup>29</sup></a>. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.</li>
+<li>Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F or below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer <span class="citation">(<a href="#ref-eia-cdd">U.S. Energy Information Administration 2023d</a>)</span>. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.</li>
 </ol>
-<div class="sourceCode" id="cb591"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb591-1"><a href="exercise-solutions.html#cb591-1" tabindex="-1"></a>temps_sqft_exp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
-<span id="cb591-2"><a href="exercise-solutions.html#cb591-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
-<span id="cb591-3"><a href="exercise-solutions.html#cb591-3" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb591-4"><a href="exercise-solutions.html#cb591-4" tabindex="-1"></a>    <span class="at">formula =</span> DOLLAREL <span class="sc">~</span> (TOTSQFT_EN <span class="sc">+</span> CDD65 <span class="sc">+</span> HDD65) <span class="sc">^</span> <span class="dv">2</span>,</span>
-<span id="cb591-5"><a href="exercise-solutions.html#cb591-5" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
-<span id="cb591-6"><a href="exercise-solutions.html#cb591-6" tabindex="-1"></a>  )</span>
-<span id="cb591-7"><a href="exercise-solutions.html#cb591-7" tabindex="-1"></a></span>
-<span id="cb591-8"><a href="exercise-solutions.html#cb591-8" tabindex="-1"></a><span class="fu">tidy</span>(temps_sqft_exp) <span class="sc">%&gt;%</span></span>
-<span id="cb591-9"><a href="exercise-solutions.html#cb591-9" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value=</span><span class="fu">pretty_p_value</span>(p.value) <span class="sc">%&gt;%</span> <span class="fu">str_pad</span>(<span class="dv">7</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb589"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb589-1"><a href="exercise-solutions.html#cb589-1" tabindex="-1"></a>temps_sqft_exp <span class="ot">&lt;-</span> recs_des <span class="sc">%&gt;%</span></span>
+<span id="cb589-2"><a href="exercise-solutions.html#cb589-2" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
+<span id="cb589-3"><a href="exercise-solutions.html#cb589-3" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb589-4"><a href="exercise-solutions.html#cb589-4" tabindex="-1"></a>    <span class="at">formula =</span> DOLLAREL <span class="sc">~</span> (TOTSQFT_EN <span class="sc">+</span> CDD65 <span class="sc">+</span> HDD65) <span class="sc">^</span> <span class="dv">2</span>,</span>
+<span id="cb589-5"><a href="exercise-solutions.html#cb589-5" tabindex="-1"></a>    <span class="at">na.action =</span> na.omit</span>
+<span id="cb589-6"><a href="exercise-solutions.html#cb589-6" tabindex="-1"></a>  )</span>
+<span id="cb589-7"><a href="exercise-solutions.html#cb589-7" tabindex="-1"></a></span>
+<span id="cb589-8"><a href="exercise-solutions.html#cb589-8" tabindex="-1"></a><span class="fu">tidy</span>(temps_sqft_exp) <span class="sc">%&gt;%</span></span>
+<span id="cb589-9"><a href="exercise-solutions.html#cb589-9" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">p.value=</span><span class="fu">pretty_p_value</span>(p.value) <span class="sc">%&gt;%</span> <span class="fu">str_pad</span>(<span class="dv">7</span>))</span></code></pre></div>
 <pre><code>## # A tibble: 7 × 5
 ##   term                 estimate   std.error statistic p.value  
 ##   &lt;chr&gt;                   &lt;dbl&gt;       &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;    
@@ -1577,33 +1578,33 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 <li>Continuing with our results from question 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures.</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb593"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb593-1"><a href="exercise-solutions.html#cb593-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="ot">&lt;-</span> temps_sqft_exp <span class="sc">%&gt;%</span></span>
-<span id="cb593-2"><a href="exercise-solutions.html#cb593-2" tabindex="-1"></a>  <span class="fu">augment</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb593-3"><a href="exercise-solutions.html#cb593-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)), </span>
-<span id="cb593-4"><a href="exercise-solutions.html#cb593-4" tabindex="-1"></a>         <span class="co"># extract the variance of the fitted value</span></span>
-<span id="cb593-5"><a href="exercise-solutions.html#cb593-5" tabindex="-1"></a>         <span class="at">.fitted =</span> <span class="fu">as.numeric</span>(.fitted)) </span></code></pre></div>
-<div class="sourceCode" id="cb594"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb594-1"><a href="exercise-solutions.html#cb594-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="sc">%&gt;%</span></span>
-<span id="cb594-2"><a href="exercise-solutions.html#cb594-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> DOLLAREL, <span class="at">y =</span> .fitted)) <span class="sc">+</span></span>
-<span id="cb594-3"><a href="exercise-solutions.html#cb594-3" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
-<span id="cb594-4"><a href="exercise-solutions.html#cb594-4" tabindex="-1"></a>  <span class="fu">geom_abline</span>(<span class="at">intercept =</span> <span class="dv">0</span>,</span>
-<span id="cb594-5"><a href="exercise-solutions.html#cb594-5" tabindex="-1"></a>              <span class="at">slope =</span> <span class="dv">1</span>,</span>
-<span id="cb594-6"><a href="exercise-solutions.html#cb594-6" tabindex="-1"></a>              <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
-<span id="cb594-7"><a href="exercise-solutions.html#cb594-7" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Actual expenditures&quot;</span>) <span class="sc">+</span></span>
-<span id="cb594-8"><a href="exercise-solutions.html#cb594-8" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Predicted expenditures&quot;</span>) <span class="sc">+</span></span>
-<span id="cb594-9"><a href="exercise-solutions.html#cb594-9" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb591"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb591-1"><a href="exercise-solutions.html#cb591-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="ot">&lt;-</span> temps_sqft_exp <span class="sc">%&gt;%</span></span>
+<span id="cb591-2"><a href="exercise-solutions.html#cb591-2" tabindex="-1"></a>  <span class="fu">augment</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb591-3"><a href="exercise-solutions.html#cb591-3" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)), </span>
+<span id="cb591-4"><a href="exercise-solutions.html#cb591-4" tabindex="-1"></a>         <span class="co"># extract the variance of the fitted value</span></span>
+<span id="cb591-5"><a href="exercise-solutions.html#cb591-5" tabindex="-1"></a>         <span class="at">.fitted =</span> <span class="fu">as.numeric</span>(.fitted)) </span></code></pre></div>
+<div class="sourceCode" id="cb592"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb592-1"><a href="exercise-solutions.html#cb592-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="sc">%&gt;%</span></span>
+<span id="cb592-2"><a href="exercise-solutions.html#cb592-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> DOLLAREL, <span class="at">y =</span> .fitted)) <span class="sc">+</span></span>
+<span id="cb592-3"><a href="exercise-solutions.html#cb592-3" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
+<span id="cb592-4"><a href="exercise-solutions.html#cb592-4" tabindex="-1"></a>  <span class="fu">geom_abline</span>(<span class="at">intercept =</span> <span class="dv">0</span>,</span>
+<span id="cb592-5"><a href="exercise-solutions.html#cb592-5" tabindex="-1"></a>              <span class="at">slope =</span> <span class="dv">1</span>,</span>
+<span id="cb592-6"><a href="exercise-solutions.html#cb592-6" tabindex="-1"></a>              <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
+<span id="cb592-7"><a href="exercise-solutions.html#cb592-7" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Actual expenditures&quot;</span>) <span class="sc">+</span></span>
+<span id="cb592-8"><a href="exercise-solutions.html#cb592-8" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Predicted expenditures&quot;</span>) <span class="sc">+</span></span>
+<span id="cb592-9"><a href="exercise-solutions.html#cb592-9" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:model-ex-solution4"></span>
 <img src="bookdown_files/figure-html/model-ex-solution4-1.png" alt="Actual and predicted electricity expenditures" width="672" />
 <p class="caption">
 FIGURE D.1: Actual and predicted electricity expenditures
 </p>
 </div>
-<div class="sourceCode" id="cb595"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb595-1"><a href="exercise-solutions.html#cb595-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="sc">%&gt;%</span></span>
-<span id="cb595-2"><a href="exercise-solutions.html#cb595-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> .fitted, <span class="at">y =</span> .resid)) <span class="sc">+</span></span>
-<span id="cb595-3"><a href="exercise-solutions.html#cb595-3" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
-<span id="cb595-4"><a href="exercise-solutions.html#cb595-4" tabindex="-1"></a>  <span class="fu">geom_hline</span>(<span class="at">yintercept =</span> <span class="dv">0</span>, <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
-<span id="cb595-5"><a href="exercise-solutions.html#cb595-5" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Predicted expenditure&quot;</span>) <span class="sc">+</span></span>
-<span id="cb595-6"><a href="exercise-solutions.html#cb595-6" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Residual value of expenditure&quot;</span>) <span class="sc">+</span></span>
-<span id="cb595-7"><a href="exercise-solutions.html#cb595-7" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb593"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb593-1"><a href="exercise-solutions.html#cb593-1" tabindex="-1"></a>temps_sqft_exp_fit <span class="sc">%&gt;%</span></span>
+<span id="cb593-2"><a href="exercise-solutions.html#cb593-2" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> .fitted, <span class="at">y =</span> .resid)) <span class="sc">+</span></span>
+<span id="cb593-3"><a href="exercise-solutions.html#cb593-3" tabindex="-1"></a>  <span class="fu">geom_point</span>() <span class="sc">+</span></span>
+<span id="cb593-4"><a href="exercise-solutions.html#cb593-4" tabindex="-1"></a>  <span class="fu">geom_hline</span>(<span class="at">yintercept =</span> <span class="dv">0</span>, <span class="at">color =</span> <span class="st">&quot;red&quot;</span>) <span class="sc">+</span></span>
+<span id="cb593-5"><a href="exercise-solutions.html#cb593-5" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">&quot;Predicted expenditure&quot;</span>) <span class="sc">+</span></span>
+<span id="cb593-6"><a href="exercise-solutions.html#cb593-6" tabindex="-1"></a>  <span class="fu">ylab</span>(<span class="st">&quot;Residual value of expenditure&quot;</span>) <span class="sc">+</span></span>
+<span id="cb593-7"><a href="exercise-solutions.html#cb593-7" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:model-ex-solution5"></span>
 <img src="bookdown_files/figure-html/model-ex-solution5-1.png" alt="Residual plot of electric cost model with covariates TOTSQFT_EN, CDD65, and HDD65" width="672" />
 <p class="caption">
@@ -1611,18 +1612,18 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 </p>
 </div>
 <ol start="4" style="list-style-type: decimal">
-<li>Early voting expanded in 2020<a href="#fn30" class="footnote-ref" id="fnref30"><sup>30</sup></a>. Build a logistic model predicting early voting in 2020 (<code>EarlyVote2020</code>) using age (<code>Age</code>), education (<code>Education</code>), and party identification (<code>PartyID</code>). Include two-way interactions.</li>
+<li>Early voting expanded in 2020 <span class="citation">(<a href="#ref-npr-voting-trend">Sprunt 2020</a>)</span>. Build a logistic model predicting early voting in 2020 (<code>EarlyVote2020</code>) using age (<code>Age</code>), education (<code>Education</code>), and party identification (<code>PartyID</code>). Include two-way interactions.</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb596"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb596-1"><a href="exercise-solutions.html#cb596-1" tabindex="-1"></a>earlyvote_mod <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
-<span id="cb596-2"><a href="exercise-solutions.html#cb596-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(EarlyVote2020)) <span class="sc">%&gt;%</span></span>
-<span id="cb596-3"><a href="exercise-solutions.html#cb596-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
-<span id="cb596-4"><a href="exercise-solutions.html#cb596-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb596-5"><a href="exercise-solutions.html#cb596-5" tabindex="-1"></a>    <span class="at">formula =</span> EarlyVote2020 <span class="sc">~</span> (Age <span class="sc">+</span> Education <span class="sc">+</span> PartyID) <span class="sc">^</span> <span class="dv">2</span> ,</span>
-<span id="cb596-6"><a href="exercise-solutions.html#cb596-6" tabindex="-1"></a>    <span class="at">family =</span> quasibinomial</span>
-<span id="cb596-7"><a href="exercise-solutions.html#cb596-7" tabindex="-1"></a>  )</span>
-<span id="cb596-8"><a href="exercise-solutions.html#cb596-8" tabindex="-1"></a></span>
-<span id="cb596-9"><a href="exercise-solutions.html#cb596-9" tabindex="-1"></a><span class="fu">tidy</span>(earlyvote_mod) <span class="sc">%&gt;%</span> <span class="fu">print</span>(<span class="at">n=</span><span class="dv">50</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb594"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb594-1"><a href="exercise-solutions.html#cb594-1" tabindex="-1"></a>earlyvote_mod <span class="ot">&lt;-</span> anes_des <span class="sc">%&gt;%</span></span>
+<span id="cb594-2"><a href="exercise-solutions.html#cb594-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(EarlyVote2020)) <span class="sc">%&gt;%</span></span>
+<span id="cb594-3"><a href="exercise-solutions.html#cb594-3" tabindex="-1"></a>  <span class="fu">svyglm</span>(</span>
+<span id="cb594-4"><a href="exercise-solutions.html#cb594-4" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb594-5"><a href="exercise-solutions.html#cb594-5" tabindex="-1"></a>    <span class="at">formula =</span> EarlyVote2020 <span class="sc">~</span> (Age <span class="sc">+</span> Education <span class="sc">+</span> PartyID) <span class="sc">^</span> <span class="dv">2</span> ,</span>
+<span id="cb594-6"><a href="exercise-solutions.html#cb594-6" tabindex="-1"></a>    <span class="at">family =</span> quasibinomial</span>
+<span id="cb594-7"><a href="exercise-solutions.html#cb594-7" tabindex="-1"></a>  )</span>
+<span id="cb594-8"><a href="exercise-solutions.html#cb594-8" tabindex="-1"></a></span>
+<span id="cb594-9"><a href="exercise-solutions.html#cb594-9" tabindex="-1"></a><span class="fu">tidy</span>(earlyvote_mod) <span class="sc">%&gt;%</span> <span class="fu">print</span>(<span class="at">n=</span><span class="dv">50</span>)</span></code></pre></div>
 <pre><code>## # A tibble: 46 × 5
 ##    term                             estimate std.error statistic p.value
 ##    &lt;chr&gt;                               &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;   &lt;dbl&gt;
@@ -1675,23 +1676,23 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 <ol start="5" style="list-style-type: decimal">
 <li>Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican.</li>
 </ol>
-<div class="sourceCode" id="cb598"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb598-1"><a href="exercise-solutions.html#cb598-1" tabindex="-1"></a>add_vote_dat <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
-<span id="cb598-2"><a href="exercise-solutions.html#cb598-2" tabindex="-1"></a>  <span class="fu">select</span>(EarlyVote2020, Age, Education, PartyID) <span class="sc">%&gt;%</span></span>
-<span id="cb598-3"><a href="exercise-solutions.html#cb598-3" tabindex="-1"></a>  <span class="fu">rbind</span>(<span class="fu">tibble</span>(</span>
-<span id="cb598-4"><a href="exercise-solutions.html#cb598-4" tabindex="-1"></a>    <span class="at">EarlyVote2020 =</span> <span class="cn">NA</span>,</span>
-<span id="cb598-5"><a href="exercise-solutions.html#cb598-5" tabindex="-1"></a>    <span class="at">Age =</span> <span class="dv">28</span>,</span>
-<span id="cb598-6"><a href="exercise-solutions.html#cb598-6" tabindex="-1"></a>    <span class="at">Education =</span> <span class="st">&quot;Graduate&quot;</span>,</span>
-<span id="cb598-7"><a href="exercise-solutions.html#cb598-7" tabindex="-1"></a>    <span class="at">PartyID =</span> <span class="fu">c</span>(<span class="st">&quot;Strong democrat&quot;</span>, <span class="st">&quot;Strong republican&quot;</span>)</span>
-<span id="cb598-8"><a href="exercise-solutions.html#cb598-8" tabindex="-1"></a>  )) <span class="sc">%&gt;%</span></span>
-<span id="cb598-9"><a href="exercise-solutions.html#cb598-9" tabindex="-1"></a>  <span class="fu">tail</span>(<span class="dv">2</span>)</span>
-<span id="cb598-10"><a href="exercise-solutions.html#cb598-10" tabindex="-1"></a></span>
-<span id="cb598-11"><a href="exercise-solutions.html#cb598-11" tabindex="-1"></a>log_ex_2_out <span class="ot">&lt;-</span> earlyvote_mod <span class="sc">%&gt;%</span></span>
-<span id="cb598-12"><a href="exercise-solutions.html#cb598-12" tabindex="-1"></a>  <span class="fu">augment</span>(<span class="at">newdata =</span> add_vote_dat, <span class="at">type.predict =</span> <span class="st">&quot;response&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb598-13"><a href="exercise-solutions.html#cb598-13" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)),</span>
-<span id="cb598-14"><a href="exercise-solutions.html#cb598-14" tabindex="-1"></a>         <span class="co"># extract the variance of the fitted value</span></span>
-<span id="cb598-15"><a href="exercise-solutions.html#cb598-15" tabindex="-1"></a>         <span class="at">.fitted =</span> <span class="fu">as.numeric</span>(.fitted))</span>
-<span id="cb598-16"><a href="exercise-solutions.html#cb598-16" tabindex="-1"></a></span>
-<span id="cb598-17"><a href="exercise-solutions.html#cb598-17" tabindex="-1"></a>log_ex_2_out</span></code></pre></div>
+<div class="sourceCode" id="cb596"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb596-1"><a href="exercise-solutions.html#cb596-1" tabindex="-1"></a>add_vote_dat <span class="ot">&lt;-</span> anes_2020 <span class="sc">%&gt;%</span></span>
+<span id="cb596-2"><a href="exercise-solutions.html#cb596-2" tabindex="-1"></a>  <span class="fu">select</span>(EarlyVote2020, Age, Education, PartyID) <span class="sc">%&gt;%</span></span>
+<span id="cb596-3"><a href="exercise-solutions.html#cb596-3" tabindex="-1"></a>  <span class="fu">rbind</span>(<span class="fu">tibble</span>(</span>
+<span id="cb596-4"><a href="exercise-solutions.html#cb596-4" tabindex="-1"></a>    <span class="at">EarlyVote2020 =</span> <span class="cn">NA</span>,</span>
+<span id="cb596-5"><a href="exercise-solutions.html#cb596-5" tabindex="-1"></a>    <span class="at">Age =</span> <span class="dv">28</span>,</span>
+<span id="cb596-6"><a href="exercise-solutions.html#cb596-6" tabindex="-1"></a>    <span class="at">Education =</span> <span class="st">&quot;Graduate&quot;</span>,</span>
+<span id="cb596-7"><a href="exercise-solutions.html#cb596-7" tabindex="-1"></a>    <span class="at">PartyID =</span> <span class="fu">c</span>(<span class="st">&quot;Strong democrat&quot;</span>, <span class="st">&quot;Strong republican&quot;</span>)</span>
+<span id="cb596-8"><a href="exercise-solutions.html#cb596-8" tabindex="-1"></a>  )) <span class="sc">%&gt;%</span></span>
+<span id="cb596-9"><a href="exercise-solutions.html#cb596-9" tabindex="-1"></a>  <span class="fu">tail</span>(<span class="dv">2</span>)</span>
+<span id="cb596-10"><a href="exercise-solutions.html#cb596-10" tabindex="-1"></a></span>
+<span id="cb596-11"><a href="exercise-solutions.html#cb596-11" tabindex="-1"></a>log_ex_2_out <span class="ot">&lt;-</span> earlyvote_mod <span class="sc">%&gt;%</span></span>
+<span id="cb596-12"><a href="exercise-solutions.html#cb596-12" tabindex="-1"></a>  <span class="fu">augment</span>(<span class="at">newdata =</span> add_vote_dat, <span class="at">type.predict =</span> <span class="st">&quot;response&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb596-13"><a href="exercise-solutions.html#cb596-13" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">.se.fit =</span> <span class="fu">sqrt</span>(<span class="fu">attr</span>(.fitted, <span class="st">&quot;var&quot;</span>)),</span>
+<span id="cb596-14"><a href="exercise-solutions.html#cb596-14" tabindex="-1"></a>         <span class="co"># extract the variance of the fitted value</span></span>
+<span id="cb596-15"><a href="exercise-solutions.html#cb596-15" tabindex="-1"></a>         <span class="at">.fitted =</span> <span class="fu">as.numeric</span>(.fitted))</span>
+<span id="cb596-16"><a href="exercise-solutions.html#cb596-16" tabindex="-1"></a></span>
+<span id="cb596-17"><a href="exercise-solutions.html#cb596-17" tabindex="-1"></a>log_ex_2_out</span></code></pre></div>
 <pre><code>## # A tibble: 2 × 6
 ##   EarlyVote2020   Age Education PartyID           .fitted .se.fit
 ##   &lt;fct&gt;         &lt;dbl&gt; &lt;fct&gt;     &lt;fct&gt;               &lt;dbl&gt;   &lt;dbl&gt;
@@ -1702,65 +1703,65 @@ <h2>7 - Modeling<a href="exercise-solutions.html#modeling" class="anchor-section
 <div id="specifying-sample-designs-and-replicate-weights-in-srvyr" class="section level2 unnumbered hasAnchor">
 <h2>10 - Specifying sample designs and replicate weights in {srvyr}<a href="exercise-solutions.html#specifying-sample-designs-and-replicate-weights-in-srvyr" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li>The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description <span class="citation">(<a href="#ref-nhis-svy-des">National Center for Health Statistics 2023</a>)</span>. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). You have imported the data and the variable containing the data is: <code>nhis_adult_data</code>. How would you specify the design using {srvyr} using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</li>
+<li>The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults, including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description <span class="citation">(<a href="#ref-nhis-svy-des">National Center for Health Statistics 2023</a>)</span>. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). We have imported the data, and the variable containing the data is <code>nhis_adult_data</code>. How would we specify the design using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb600"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb600-1"><a href="exercise-solutions.html#cb600-1" tabindex="-1"></a>nhis_adult_des <span class="ot">&lt;-</span> nhis_adult_data <span class="sc">%&gt;%</span></span>
-<span id="cb600-2"><a href="exercise-solutions.html#cb600-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
-<span id="cb600-3"><a href="exercise-solutions.html#cb600-3" tabindex="-1"></a>    <span class="at">ids =</span> PPSU,</span>
-<span id="cb600-4"><a href="exercise-solutions.html#cb600-4" tabindex="-1"></a>    <span class="at">strata =</span> PSTRAT,</span>
-<span id="cb600-5"><a href="exercise-solutions.html#cb600-5" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span>,</span>
-<span id="cb600-6"><a href="exercise-solutions.html#cb600-6" tabindex="-1"></a>    <span class="at">weights =</span> WTFA_A</span>
-<span id="cb600-7"><a href="exercise-solutions.html#cb600-7" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb598"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb598-1"><a href="exercise-solutions.html#cb598-1" tabindex="-1"></a>nhis_adult_des <span class="ot">&lt;-</span> nhis_adult_data <span class="sc">%&gt;%</span></span>
+<span id="cb598-2"><a href="exercise-solutions.html#cb598-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(</span>
+<span id="cb598-3"><a href="exercise-solutions.html#cb598-3" tabindex="-1"></a>    <span class="at">ids =</span> PPSU,</span>
+<span id="cb598-4"><a href="exercise-solutions.html#cb598-4" tabindex="-1"></a>    <span class="at">strata =</span> PSTRAT,</span>
+<span id="cb598-5"><a href="exercise-solutions.html#cb598-5" tabindex="-1"></a>    <span class="at">nest =</span> <span class="cn">TRUE</span>,</span>
+<span id="cb598-6"><a href="exercise-solutions.html#cb598-6" tabindex="-1"></a>    <span class="at">weights =</span> WTFA_A</span>
+<span id="cb598-7"><a href="exercise-solutions.html#cb598-7" tabindex="-1"></a>  )</span></code></pre></div>
 <ol start="2" style="list-style-type: decimal">
-<li>The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R <span class="citation">(<a href="#ref-gss-codebook">Davern et al. 2021</a>)</span>. You have imported the data and the variable containing the data is: <code>gss_data</code>. How would you specify the design in R using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</li>
+<li>The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R <span class="citation">(<a href="#ref-gss-codebook">Davern et al. 2021</a>)</span>. We have imported the data, and the variable containing the data is <code>gss_data</code>. How would we specify the design in R using either <code>as_survey_design()</code> or <code>as_survey_rep()</code>?</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb601"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb601-1"><a href="exercise-solutions.html#cb601-1" tabindex="-1"></a>gss_des <span class="ot">&lt;-</span> gss_data <span class="sc">%&gt;%</span></span>
-<span id="cb601-2"><a href="exercise-solutions.html#cb601-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> VPSU_2,</span>
-<span id="cb601-3"><a href="exercise-solutions.html#cb601-3" tabindex="-1"></a>                   <span class="at">strata =</span> VSTRAT_2,</span>
-<span id="cb601-4"><a href="exercise-solutions.html#cb601-4" tabindex="-1"></a>                   <span class="at">weights =</span> WTSSNR_2)</span></code></pre></div>
+<div class="sourceCode" id="cb599"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb599-1"><a href="exercise-solutions.html#cb599-1" tabindex="-1"></a>gss_des <span class="ot">&lt;-</span> gss_data <span class="sc">%&gt;%</span></span>
+<span id="cb599-2"><a href="exercise-solutions.html#cb599-2" tabindex="-1"></a>  <span class="fu">as_survey_design</span>(<span class="at">ids =</span> VPSU_2,</span>
+<span id="cb599-3"><a href="exercise-solutions.html#cb599-3" tabindex="-1"></a>                   <span class="at">strata =</span> VSTRAT_2,</span>
+<span id="cb599-4"><a href="exercise-solutions.html#cb599-4" tabindex="-1"></a>                   <span class="at">weights =</span> WTSSNR_2)</span></code></pre></div>
 </div>
 <div id="national-crime-victimization-survey-vignette" class="section level2 unnumbered hasAnchor">
 <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions.html#national-crime-victimization-survey-vignette" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li>What proportion of completed motor vehicle thefts are <strong>not</strong> reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).</li>
+<li>What proportion of completed motor vehicle thefts are <strong>not</strong> reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).</li>
 </ol>
-<div class="sourceCode" id="cb602"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb602-1"><a href="exercise-solutions.html#cb602-1" tabindex="-1"></a>ans1 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
-<span id="cb602-2"><a href="exercise-solutions.html#cb602-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="fu">str_detect</span>(V4529, <span class="st">&quot;40|41&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb602-3"><a href="exercise-solutions.html#cb602-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(<span class="sc">!</span>ReportPolice, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb600"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb600-1"><a href="exercise-solutions.html#cb600-1" tabindex="-1"></a>ans1 <span class="ot">&lt;-</span> inc_des <span class="sc">%&gt;%</span></span>
+<span id="cb600-2"><a href="exercise-solutions.html#cb600-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="fu">str_detect</span>(V4529, <span class="st">&quot;40|41&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb600-3"><a href="exercise-solutions.html#cb600-3" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Pct =</span> <span class="fu">survey_mean</span>(<span class="sc">!</span>ReportPolice, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>)</span></code></pre></div>
 <p>Answer: It is estimated that 23.1% of motor vehicle thefts are not reported to the police.</p>
 <ol start="2" style="list-style-type: decimal">
 <li>How many violent crimes occur in each region?</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb603"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb603-1"><a href="exercise-solutions.html#cb603-1" tabindex="-1"></a>inc_des <span class="sc">%&gt;%</span></span>
-<span id="cb603-2"><a href="exercise-solutions.html#cb603-2" tabindex="-1"></a>  <span class="fu">filter</span>(Violent) <span class="sc">%&gt;%</span></span>
-<span id="cb603-3"><a href="exercise-solutions.html#cb603-3" tabindex="-1"></a>  <span class="fu">survey_count</span>(Region) <span class="sc">%&gt;%</span></span>
-<span id="cb603-4"><a href="exercise-solutions.html#cb603-4" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span>n_se) <span class="sc">%&gt;%</span></span>
-<span id="cb603-5"><a href="exercise-solutions.html#cb603-5" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col=</span><span class="st">&quot;Region&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb603-6"><a href="exercise-solutions.html#cb603-6" tabindex="-1"></a>  <span class="fu">fmt_integer</span>() <span class="sc">%&gt;%</span></span>
-<span id="cb603-7"><a href="exercise-solutions.html#cb603-7" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
-<span id="cb603-8"><a href="exercise-solutions.html#cb603-8" tabindex="-1"></a>    <span class="at">n =</span><span class="st">&quot;Violent victimizations&quot;</span>,</span>
-<span id="cb603-9"><a href="exercise-solutions.html#cb603-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb603-10"><a href="exercise-solutions.html#cb603-10" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;Estimated number of violent crimes by region&quot;</span>)</span></code></pre></div>
-<div id="uqkclffrjq" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#uqkclffrjq table {
+<div class="sourceCode" id="cb601"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb601-1"><a href="exercise-solutions.html#cb601-1" tabindex="-1"></a>inc_des <span class="sc">%&gt;%</span></span>
+<span id="cb601-2"><a href="exercise-solutions.html#cb601-2" tabindex="-1"></a>  <span class="fu">filter</span>(Violent) <span class="sc">%&gt;%</span></span>
+<span id="cb601-3"><a href="exercise-solutions.html#cb601-3" tabindex="-1"></a>  <span class="fu">survey_count</span>(Region) <span class="sc">%&gt;%</span></span>
+<span id="cb601-4"><a href="exercise-solutions.html#cb601-4" tabindex="-1"></a>  <span class="fu">select</span>(<span class="sc">-</span>n_se) <span class="sc">%&gt;%</span></span>
+<span id="cb601-5"><a href="exercise-solutions.html#cb601-5" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col=</span><span class="st">&quot;Region&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb601-6"><a href="exercise-solutions.html#cb601-6" tabindex="-1"></a>  <span class="fu">fmt_integer</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb601-7"><a href="exercise-solutions.html#cb601-7" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb601-8"><a href="exercise-solutions.html#cb601-8" tabindex="-1"></a>    <span class="at">n =</span><span class="st">&quot;Violent victimizations&quot;</span>,</span>
+<span id="cb601-9"><a href="exercise-solutions.html#cb601-9" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb601-10"><a href="exercise-solutions.html#cb601-10" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;Estimated number of violent crimes by region&quot;</span>)</span></code></pre></div>
+<div id="kqivcbyivm" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#kqivcbyivm table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#uqkclffrjq thead, #uqkclffrjq tbody, #uqkclffrjq tfoot, #uqkclffrjq tr, #uqkclffrjq td, #uqkclffrjq th {
+#kqivcbyivm thead, #kqivcbyivm tbody, #kqivcbyivm tfoot, #kqivcbyivm tr, #kqivcbyivm td, #kqivcbyivm th {
   border-style: none;
 }
 
-#uqkclffrjq p {
+#kqivcbyivm p {
   margin: 0;
   padding: 0;
 }
 
-#uqkclffrjq .gt_table {
+#kqivcbyivm .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -1786,12 +1787,12 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-left-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_caption {
+#kqivcbyivm .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#uqkclffrjq .gt_title {
+#kqivcbyivm .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -1803,7 +1804,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-width: 0;
 }
 
-#uqkclffrjq .gt_subtitle {
+#kqivcbyivm .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -1815,7 +1816,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-top-width: 0;
 }
 
-#uqkclffrjq .gt_heading {
+#kqivcbyivm .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -1827,13 +1828,13 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_bottom_border {
+#kqivcbyivm .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_col_headings {
+#kqivcbyivm .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -1848,7 +1849,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_col_heading {
+#kqivcbyivm .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1868,7 +1869,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   overflow-x: hidden;
 }
 
-#uqkclffrjq .gt_column_spanner_outer {
+#kqivcbyivm .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1880,15 +1881,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 4px;
 }
 
-#uqkclffrjq .gt_column_spanner_outer:first-child {
+#kqivcbyivm .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#uqkclffrjq .gt_column_spanner_outer:last-child {
+#kqivcbyivm .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#uqkclffrjq .gt_column_spanner {
+#kqivcbyivm .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -1900,11 +1901,11 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   width: 100%;
 }
 
-#uqkclffrjq .gt_spanner_row {
+#kqivcbyivm .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#uqkclffrjq .gt_group_heading {
+#kqivcbyivm .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1930,7 +1931,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   text-align: left;
 }
 
-#uqkclffrjq .gt_empty_group_heading {
+#kqivcbyivm .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -1945,15 +1946,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   vertical-align: middle;
 }
 
-#uqkclffrjq .gt_from_md > :first-child {
+#kqivcbyivm .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#uqkclffrjq .gt_from_md > :last-child {
+#kqivcbyivm .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#uqkclffrjq .gt_row {
+#kqivcbyivm .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -1972,7 +1973,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   overflow-x: hidden;
 }
 
-#uqkclffrjq .gt_stub {
+#kqivcbyivm .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1985,7 +1986,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#uqkclffrjq .gt_stub_row_group {
+#kqivcbyivm .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -1999,15 +2000,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   vertical-align: top;
 }
 
-#uqkclffrjq .gt_row_group_first td {
+#kqivcbyivm .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#uqkclffrjq .gt_row_group_first th {
+#kqivcbyivm .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#uqkclffrjq .gt_summary_row {
+#kqivcbyivm .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2017,16 +2018,16 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#uqkclffrjq .gt_first_summary_row {
+#kqivcbyivm .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_first_summary_row.thick {
+#kqivcbyivm .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#uqkclffrjq .gt_last_summary_row {
+#kqivcbyivm .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2036,7 +2037,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_grand_summary_row {
+#kqivcbyivm .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2046,7 +2047,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#uqkclffrjq .gt_first_grand_summary_row {
+#kqivcbyivm .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2056,7 +2057,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-top-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_last_grand_summary_row_top {
+#kqivcbyivm .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2066,11 +2067,11 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_striped {
+#kqivcbyivm .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#uqkclffrjq .gt_table_body {
+#kqivcbyivm .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2079,7 +2080,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_footnotes {
+#kqivcbyivm .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2093,7 +2094,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_footnote {
+#kqivcbyivm .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2102,7 +2103,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#uqkclffrjq .gt_sourcenotes {
+#kqivcbyivm .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2116,7 +2117,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#uqkclffrjq .gt_sourcenote {
+#kqivcbyivm .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2124,63 +2125,63 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#uqkclffrjq .gt_left {
+#kqivcbyivm .gt_left {
   text-align: left;
 }
 
-#uqkclffrjq .gt_center {
+#kqivcbyivm .gt_center {
   text-align: center;
 }
 
-#uqkclffrjq .gt_right {
+#kqivcbyivm .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#uqkclffrjq .gt_font_normal {
+#kqivcbyivm .gt_font_normal {
   font-weight: normal;
 }
 
-#uqkclffrjq .gt_font_bold {
+#kqivcbyivm .gt_font_bold {
   font-weight: bold;
 }
 
-#uqkclffrjq .gt_font_italic {
+#kqivcbyivm .gt_font_italic {
   font-style: italic;
 }
 
-#uqkclffrjq .gt_super {
+#kqivcbyivm .gt_super {
   font-size: 65%;
 }
 
-#uqkclffrjq .gt_footnote_marks {
+#kqivcbyivm .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#uqkclffrjq .gt_asterisk {
+#kqivcbyivm .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#uqkclffrjq .gt_indent_1 {
+#kqivcbyivm .gt_indent_1 {
   text-indent: 5px;
 }
 
-#uqkclffrjq .gt_indent_2 {
+#kqivcbyivm .gt_indent_2 {
   text-indent: 10px;
 }
 
-#uqkclffrjq .gt_indent_3 {
+#kqivcbyivm .gt_indent_3 {
   text-indent: 15px;
 }
 
-#uqkclffrjq .gt_indent_4 {
+#kqivcbyivm .gt_indent_4 {
   text-indent: 20px;
 }
 
-#uqkclffrjq .gt_indent_5 {
+#kqivcbyivm .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2213,35 +2214,35 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
 <li>What is the property victimization rate among each income level?</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb604"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb604-1"><a href="exercise-solutions.html#cb604-1" tabindex="-1"></a>hh_des <span class="sc">%&gt;%</span></span>
-<span id="cb604-2"><a href="exercise-solutions.html#cb604-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Income)) <span class="sc">%&gt;%</span></span>
-<span id="cb604-3"><a href="exercise-solutions.html#cb604-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Income) <span class="sc">%&gt;%</span></span>
-<span id="cb604-4"><a href="exercise-solutions.html#cb604-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, </span>
-<span id="cb604-5"><a href="exercise-solutions.html#cb604-5" tabindex="-1"></a>                                        <span class="at">na.rm =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb604-6"><a href="exercise-solutions.html#cb604-6" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col=</span><span class="st">&quot;Income&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb604-7"><a href="exercise-solutions.html#cb604-7" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
-<span id="cb604-8"><a href="exercise-solutions.html#cb604-8" tabindex="-1"></a>    <span class="at">Property_Rate=</span><span class="st">&quot;Rate&quot;</span>,</span>
-<span id="cb604-9"><a href="exercise-solutions.html#cb604-9" tabindex="-1"></a>    <span class="at">Property_Rate_se=</span><span class="st">&quot;Standard Error&quot;</span></span>
-<span id="cb604-10"><a href="exercise-solutions.html#cb604-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb604-11"><a href="exercise-solutions.html#cb604-11" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb604-12"><a href="exercise-solutions.html#cb604-12" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;Estimated property victimization rate by income level&quot;</span>)</span></code></pre></div>
-<div id="xqossclppl" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#xqossclppl table {
+<div class="sourceCode" id="cb602"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb602-1"><a href="exercise-solutions.html#cb602-1" tabindex="-1"></a>hh_des <span class="sc">%&gt;%</span></span>
+<span id="cb602-2"><a href="exercise-solutions.html#cb602-2" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Income)) <span class="sc">%&gt;%</span></span>
+<span id="cb602-3"><a href="exercise-solutions.html#cb602-3" tabindex="-1"></a>  <span class="fu">group_by</span>(Income) <span class="sc">%&gt;%</span></span>
+<span id="cb602-4"><a href="exercise-solutions.html#cb602-4" tabindex="-1"></a>  <span class="fu">summarize</span>(<span class="at">Property_Rate =</span> <span class="fu">survey_mean</span>(Property <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, </span>
+<span id="cb602-5"><a href="exercise-solutions.html#cb602-5" tabindex="-1"></a>                                        <span class="at">na.rm =</span> <span class="cn">TRUE</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb602-6"><a href="exercise-solutions.html#cb602-6" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col=</span><span class="st">&quot;Income&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb602-7"><a href="exercise-solutions.html#cb602-7" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb602-8"><a href="exercise-solutions.html#cb602-8" tabindex="-1"></a>    <span class="at">Property_Rate=</span><span class="st">&quot;Rate&quot;</span>,</span>
+<span id="cb602-9"><a href="exercise-solutions.html#cb602-9" tabindex="-1"></a>    <span class="at">Property_Rate_se=</span><span class="st">&quot;Standard Error&quot;</span></span>
+<span id="cb602-10"><a href="exercise-solutions.html#cb602-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb602-11"><a href="exercise-solutions.html#cb602-11" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb602-12"><a href="exercise-solutions.html#cb602-12" tabindex="-1"></a>  <span class="fu">tab_header</span>(<span class="st">&quot;Estimated property victimization rate by income level&quot;</span>)</span></code></pre></div>
+<div id="mdvpqxzjwg" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#mdvpqxzjwg table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#xqossclppl thead, #xqossclppl tbody, #xqossclppl tfoot, #xqossclppl tr, #xqossclppl td, #xqossclppl th {
+#mdvpqxzjwg thead, #mdvpqxzjwg tbody, #mdvpqxzjwg tfoot, #mdvpqxzjwg tr, #mdvpqxzjwg td, #mdvpqxzjwg th {
   border-style: none;
 }
 
-#xqossclppl p {
+#mdvpqxzjwg p {
   margin: 0;
   padding: 0;
 }
 
-#xqossclppl .gt_table {
+#mdvpqxzjwg .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2267,12 +2268,12 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-left-color: #D3D3D3;
 }
 
-#xqossclppl .gt_caption {
+#mdvpqxzjwg .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#xqossclppl .gt_title {
+#mdvpqxzjwg .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2284,7 +2285,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-width: 0;
 }
 
-#xqossclppl .gt_subtitle {
+#mdvpqxzjwg .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2296,7 +2297,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-top-width: 0;
 }
 
-#xqossclppl .gt_heading {
+#mdvpqxzjwg .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2308,13 +2309,13 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#xqossclppl .gt_bottom_border {
+#mdvpqxzjwg .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#xqossclppl .gt_col_headings {
+#mdvpqxzjwg .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2329,7 +2330,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#xqossclppl .gt_col_heading {
+#mdvpqxzjwg .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2349,7 +2350,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   overflow-x: hidden;
 }
 
-#xqossclppl .gt_column_spanner_outer {
+#mdvpqxzjwg .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2361,15 +2362,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 4px;
 }
 
-#xqossclppl .gt_column_spanner_outer:first-child {
+#mdvpqxzjwg .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#xqossclppl .gt_column_spanner_outer:last-child {
+#mdvpqxzjwg .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#xqossclppl .gt_column_spanner {
+#mdvpqxzjwg .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2381,11 +2382,11 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   width: 100%;
 }
 
-#xqossclppl .gt_spanner_row {
+#mdvpqxzjwg .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#xqossclppl .gt_group_heading {
+#mdvpqxzjwg .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2411,7 +2412,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   text-align: left;
 }
 
-#xqossclppl .gt_empty_group_heading {
+#mdvpqxzjwg .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2426,15 +2427,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   vertical-align: middle;
 }
 
-#xqossclppl .gt_from_md > :first-child {
+#mdvpqxzjwg .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#xqossclppl .gt_from_md > :last-child {
+#mdvpqxzjwg .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#xqossclppl .gt_row {
+#mdvpqxzjwg .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2453,7 +2454,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   overflow-x: hidden;
 }
 
-#xqossclppl .gt_stub {
+#mdvpqxzjwg .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2466,7 +2467,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#xqossclppl .gt_stub_row_group {
+#mdvpqxzjwg .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2480,15 +2481,15 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   vertical-align: top;
 }
 
-#xqossclppl .gt_row_group_first td {
+#mdvpqxzjwg .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#xqossclppl .gt_row_group_first th {
+#mdvpqxzjwg .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#xqossclppl .gt_summary_row {
+#mdvpqxzjwg .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2498,16 +2499,16 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#xqossclppl .gt_first_summary_row {
+#mdvpqxzjwg .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#xqossclppl .gt_first_summary_row.thick {
+#mdvpqxzjwg .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#xqossclppl .gt_last_summary_row {
+#mdvpqxzjwg .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2517,7 +2518,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#xqossclppl .gt_grand_summary_row {
+#mdvpqxzjwg .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -2527,7 +2528,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#xqossclppl .gt_first_grand_summary_row {
+#mdvpqxzjwg .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2537,7 +2538,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-top-color: #D3D3D3;
 }
 
-#xqossclppl .gt_last_grand_summary_row_top {
+#mdvpqxzjwg .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2547,11 +2548,11 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#xqossclppl .gt_striped {
+#mdvpqxzjwg .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#xqossclppl .gt_table_body {
+#mdvpqxzjwg .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2560,7 +2561,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-bottom-color: #D3D3D3;
 }
 
-#xqossclppl .gt_footnotes {
+#mdvpqxzjwg .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2574,7 +2575,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#xqossclppl .gt_footnote {
+#mdvpqxzjwg .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -2583,7 +2584,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#xqossclppl .gt_sourcenotes {
+#mdvpqxzjwg .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -2597,7 +2598,7 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   border-right-color: #D3D3D3;
 }
 
-#xqossclppl .gt_sourcenote {
+#mdvpqxzjwg .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -2605,63 +2606,63 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
   padding-right: 5px;
 }
 
-#xqossclppl .gt_left {
+#mdvpqxzjwg .gt_left {
   text-align: left;
 }
 
-#xqossclppl .gt_center {
+#mdvpqxzjwg .gt_center {
   text-align: center;
 }
 
-#xqossclppl .gt_right {
+#mdvpqxzjwg .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#xqossclppl .gt_font_normal {
+#mdvpqxzjwg .gt_font_normal {
   font-weight: normal;
 }
 
-#xqossclppl .gt_font_bold {
+#mdvpqxzjwg .gt_font_bold {
   font-weight: bold;
 }
 
-#xqossclppl .gt_font_italic {
+#mdvpqxzjwg .gt_font_italic {
   font-style: italic;
 }
 
-#xqossclppl .gt_super {
+#mdvpqxzjwg .gt_super {
   font-size: 65%;
 }
 
-#xqossclppl .gt_footnote_marks {
+#mdvpqxzjwg .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#xqossclppl .gt_asterisk {
+#mdvpqxzjwg .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#xqossclppl .gt_indent_1 {
+#mdvpqxzjwg .gt_indent_1 {
   text-indent: 5px;
 }
 
-#xqossclppl .gt_indent_2 {
+#mdvpqxzjwg .gt_indent_2 {
   text-indent: 10px;
 }
 
-#xqossclppl .gt_indent_3 {
+#mdvpqxzjwg .gt_indent_3 {
   text-indent: 15px;
 }
 
-#xqossclppl .gt_indent_4 {
+#mdvpqxzjwg .gt_indent_4 {
   text-indent: 20px;
 }
 
-#xqossclppl .gt_indent_5 {
+#mdvpqxzjwg .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -2701,22 +2702,22 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
 <ol start="4" style="list-style-type: decimal">
 <li>What is the difference in the violent victimization rate between males and females? Is the difference statistically significant?</li>
 </ol>
-<div class="sourceCode" id="cb605"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb605-1"><a href="exercise-solutions.html#cb605-1" tabindex="-1"></a>vr_gender <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
-<span id="cb605-2"><a href="exercise-solutions.html#cb605-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Sex) <span class="sc">%&gt;%</span></span>
-<span id="cb605-3"><a href="exercise-solutions.html#cb605-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb605-4"><a href="exercise-solutions.html#cb605-4" tabindex="-1"></a>    <span class="at">Violent_rate=</span><span class="fu">survey_mean</span>(Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span>
-<span id="cb605-5"><a href="exercise-solutions.html#cb605-5" tabindex="-1"></a>  )</span>
-<span id="cb605-6"><a href="exercise-solutions.html#cb605-6" tabindex="-1"></a></span>
-<span id="cb605-7"><a href="exercise-solutions.html#cb605-7" tabindex="-1"></a>vr_gender_test <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
-<span id="cb605-8"><a href="exercise-solutions.html#cb605-8" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
-<span id="cb605-9"><a href="exercise-solutions.html#cb605-9" tabindex="-1"></a>    <span class="at">Violent_Adj=</span>Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span></span>
-<span id="cb605-10"><a href="exercise-solutions.html#cb605-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb605-11"><a href="exercise-solutions.html#cb605-11" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
-<span id="cb605-12"><a href="exercise-solutions.html#cb605-12" tabindex="-1"></a>    <span class="at">formula =</span> Violent_Adj <span class="sc">~</span> Sex,</span>
-<span id="cb605-13"><a href="exercise-solutions.html#cb605-13" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
-<span id="cb605-14"><a href="exercise-solutions.html#cb605-14" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
-<span id="cb605-15"><a href="exercise-solutions.html#cb605-15" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb605-16"><a href="exercise-solutions.html#cb605-16" tabindex="-1"></a>  broom<span class="sc">::</span><span class="fu">tidy</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb603"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb603-1"><a href="exercise-solutions.html#cb603-1" tabindex="-1"></a>vr_gender <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
+<span id="cb603-2"><a href="exercise-solutions.html#cb603-2" tabindex="-1"></a>  <span class="fu">group_by</span>(Sex) <span class="sc">%&gt;%</span></span>
+<span id="cb603-3"><a href="exercise-solutions.html#cb603-3" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb603-4"><a href="exercise-solutions.html#cb603-4" tabindex="-1"></a>    <span class="at">Violent_rate=</span><span class="fu">survey_mean</span>(Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span>, <span class="at">na.rm=</span><span class="cn">TRUE</span>)</span>
+<span id="cb603-5"><a href="exercise-solutions.html#cb603-5" tabindex="-1"></a>  )</span>
+<span id="cb603-6"><a href="exercise-solutions.html#cb603-6" tabindex="-1"></a></span>
+<span id="cb603-7"><a href="exercise-solutions.html#cb603-7" tabindex="-1"></a>vr_gender_test <span class="ot">&lt;-</span> pers_des <span class="sc">%&gt;%</span></span>
+<span id="cb603-8"><a href="exercise-solutions.html#cb603-8" tabindex="-1"></a>  <span class="fu">mutate</span>(</span>
+<span id="cb603-9"><a href="exercise-solutions.html#cb603-9" tabindex="-1"></a>    <span class="at">Violent_Adj=</span>Violent <span class="sc">*</span> ADJINC_WT <span class="sc">*</span> <span class="dv">1000</span></span>
+<span id="cb603-10"><a href="exercise-solutions.html#cb603-10" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb603-11"><a href="exercise-solutions.html#cb603-11" tabindex="-1"></a>  <span class="fu">svyttest</span>(</span>
+<span id="cb603-12"><a href="exercise-solutions.html#cb603-12" tabindex="-1"></a>    <span class="at">formula =</span> Violent_Adj <span class="sc">~</span> Sex,</span>
+<span id="cb603-13"><a href="exercise-solutions.html#cb603-13" tabindex="-1"></a>    <span class="at">design =</span> .,</span>
+<span id="cb603-14"><a href="exercise-solutions.html#cb603-14" tabindex="-1"></a>    <span class="at">na.rm =</span> <span class="cn">TRUE</span></span>
+<span id="cb603-15"><a href="exercise-solutions.html#cb603-15" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb603-16"><a href="exercise-solutions.html#cb603-16" tabindex="-1"></a>  broom<span class="sc">::</span><span class="fu">tidy</span>()</span></code></pre></div>
 <pre><code>## Warning in summary.glm(g): observations with zero weight not used for
 ## calculating dispersion</code></pre>
 <pre><code>## Warning in summary.glm(glm.object): observations with zero weight not
@@ -2726,52 +2727,52 @@ <h2>13 - National Crime Victimization Survey Vignette<a href="exercise-solutions
 <div id="americasbarometer-vignette" class="section level2 unnumbered hasAnchor">
 <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbarometer-vignette" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <ol style="list-style-type: decimal">
-<li>Calculate the percentage of households with broadband internet in and those with any internet at home, including from a phone or tablet in Latin America and the Caribbean. Hint: if you come across countries with 0% internet usage, you may want to filter by something first.</li>
+<li>Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: if there are countries with 0% internet usage, try filtering by something first.
+Answer:</li>
 </ol>
-<p>Answer:</p>
-<div class="sourceCode" id="cb608"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb608-1"><a href="exercise-solutions.html#cb608-1" tabindex="-1"></a>int_ests <span class="ot">&lt;-</span></span>
-<span id="cb608-2"><a href="exercise-solutions.html#cb608-2" tabindex="-1"></a>  ambarom_des <span class="sc">%&gt;%</span></span>
-<span id="cb608-3"><a href="exercise-solutions.html#cb608-3" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Internet) <span class="sc">|</span> <span class="sc">!</span><span class="fu">is.na</span>(BroadbandInternet)) <span class="sc">%&gt;%</span></span>
-<span id="cb608-4"><a href="exercise-solutions.html#cb608-4" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
-<span id="cb608-5"><a href="exercise-solutions.html#cb608-5" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
-<span id="cb608-6"><a href="exercise-solutions.html#cb608-6" tabindex="-1"></a>    <span class="at">p_broadband =</span> <span class="fu">survey_mean</span>(BroadbandInternet, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
-<span id="cb608-7"><a href="exercise-solutions.html#cb608-7" tabindex="-1"></a>    <span class="at">p_internet =</span> <span class="fu">survey_mean</span>(Internet, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span></span>
-<span id="cb608-8"><a href="exercise-solutions.html#cb608-8" tabindex="-1"></a>  ) </span>
-<span id="cb608-9"><a href="exercise-solutions.html#cb608-9" tabindex="-1"></a></span>
-<span id="cb608-10"><a href="exercise-solutions.html#cb608-10" tabindex="-1"></a>int_ests <span class="sc">%&gt;%</span></span>
-<span id="cb608-11"><a href="exercise-solutions.html#cb608-11" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb608-12"><a href="exercise-solutions.html#cb608-12" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb608-13"><a href="exercise-solutions.html#cb608-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
-<span id="cb608-14"><a href="exercise-solutions.html#cb608-14" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Broadband at home&quot;</span>,</span>
-<span id="cb608-15"><a href="exercise-solutions.html#cb608-15" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">c</span>(p_broadband, p_broadband_se)</span>
-<span id="cb608-16"><a href="exercise-solutions.html#cb608-16" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb608-17"><a href="exercise-solutions.html#cb608-17" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
-<span id="cb608-18"><a href="exercise-solutions.html#cb608-18" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Internet at home&quot;</span>,</span>
-<span id="cb608-19"><a href="exercise-solutions.html#cb608-19" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">c</span>(p_internet, p_internet_se)</span>
-<span id="cb608-20"><a href="exercise-solutions.html#cb608-20" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
-<span id="cb608-21"><a href="exercise-solutions.html#cb608-21" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
-<span id="cb608-22"><a href="exercise-solutions.html#cb608-22" tabindex="-1"></a>    <span class="at">p_broadband=</span><span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb608-23"><a href="exercise-solutions.html#cb608-23" tabindex="-1"></a>    <span class="at">p_internet=</span><span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb608-24"><a href="exercise-solutions.html#cb608-24" tabindex="-1"></a>    <span class="at">p_broadband_se=</span><span class="st">&quot;S.E.&quot;</span>,</span>
-<span id="cb608-25"><a href="exercise-solutions.html#cb608-25" tabindex="-1"></a>    <span class="at">p_internet_se=</span><span class="st">&quot;S.E.&quot;</span>,</span>
-<span id="cb608-26"><a href="exercise-solutions.html#cb608-26" tabindex="-1"></a>  )</span></code></pre></div>
-<div id="rwvnyhqfyu" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
-<style>#rwvnyhqfyu table {
+<div class="sourceCode" id="cb606"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb606-1"><a href="exercise-solutions.html#cb606-1" tabindex="-1"></a>int_ests <span class="ot">&lt;-</span></span>
+<span id="cb606-2"><a href="exercise-solutions.html#cb606-2" tabindex="-1"></a>  ambarom_des <span class="sc">%&gt;%</span></span>
+<span id="cb606-3"><a href="exercise-solutions.html#cb606-3" tabindex="-1"></a>  <span class="fu">filter</span>(<span class="sc">!</span><span class="fu">is.na</span>(Internet) <span class="sc">|</span> <span class="sc">!</span><span class="fu">is.na</span>(BroadbandInternet)) <span class="sc">%&gt;%</span></span>
+<span id="cb606-4"><a href="exercise-solutions.html#cb606-4" tabindex="-1"></a>  <span class="fu">group_by</span>(Country) <span class="sc">%&gt;%</span></span>
+<span id="cb606-5"><a href="exercise-solutions.html#cb606-5" tabindex="-1"></a>  <span class="fu">summarize</span>(</span>
+<span id="cb606-6"><a href="exercise-solutions.html#cb606-6" tabindex="-1"></a>    <span class="at">p_broadband =</span> <span class="fu">survey_mean</span>(BroadbandInternet, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span>,</span>
+<span id="cb606-7"><a href="exercise-solutions.html#cb606-7" tabindex="-1"></a>    <span class="at">p_internet =</span> <span class="fu">survey_mean</span>(Internet, <span class="at">na.rm =</span> <span class="cn">TRUE</span>) <span class="sc">*</span> <span class="dv">100</span></span>
+<span id="cb606-8"><a href="exercise-solutions.html#cb606-8" tabindex="-1"></a>  ) </span>
+<span id="cb606-9"><a href="exercise-solutions.html#cb606-9" tabindex="-1"></a></span>
+<span id="cb606-10"><a href="exercise-solutions.html#cb606-10" tabindex="-1"></a>int_ests <span class="sc">%&gt;%</span></span>
+<span id="cb606-11"><a href="exercise-solutions.html#cb606-11" tabindex="-1"></a>  <span class="fu">gt</span>(<span class="at">rowname_col =</span> <span class="st">&quot;Country&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb606-12"><a href="exercise-solutions.html#cb606-12" tabindex="-1"></a>  <span class="fu">fmt_number</span>(<span class="at">decimals=</span><span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb606-13"><a href="exercise-solutions.html#cb606-13" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb606-14"><a href="exercise-solutions.html#cb606-14" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Broadband at home&quot;</span>,</span>
+<span id="cb606-15"><a href="exercise-solutions.html#cb606-15" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">c</span>(p_broadband, p_broadband_se)</span>
+<span id="cb606-16"><a href="exercise-solutions.html#cb606-16" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb606-17"><a href="exercise-solutions.html#cb606-17" tabindex="-1"></a>  <span class="fu">tab_spanner</span>(</span>
+<span id="cb606-18"><a href="exercise-solutions.html#cb606-18" tabindex="-1"></a>    <span class="at">label=</span><span class="st">&quot;Internet at home&quot;</span>,</span>
+<span id="cb606-19"><a href="exercise-solutions.html#cb606-19" tabindex="-1"></a>    <span class="at">columns=</span><span class="fu">c</span>(p_internet, p_internet_se)</span>
+<span id="cb606-20"><a href="exercise-solutions.html#cb606-20" tabindex="-1"></a>  ) <span class="sc">%&gt;%</span></span>
+<span id="cb606-21"><a href="exercise-solutions.html#cb606-21" tabindex="-1"></a>  <span class="fu">cols_label</span>(</span>
+<span id="cb606-22"><a href="exercise-solutions.html#cb606-22" tabindex="-1"></a>    <span class="at">p_broadband=</span><span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb606-23"><a href="exercise-solutions.html#cb606-23" tabindex="-1"></a>    <span class="at">p_internet=</span><span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb606-24"><a href="exercise-solutions.html#cb606-24" tabindex="-1"></a>    <span class="at">p_broadband_se=</span><span class="st">&quot;S.E.&quot;</span>,</span>
+<span id="cb606-25"><a href="exercise-solutions.html#cb606-25" tabindex="-1"></a>    <span class="at">p_internet_se=</span><span class="st">&quot;S.E.&quot;</span>,</span>
+<span id="cb606-26"><a href="exercise-solutions.html#cb606-26" tabindex="-1"></a>  )</span></code></pre></div>
+<div id="fqvyyzfmai" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
+<style>#fqvyyzfmai table {
   font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }
 
-#rwvnyhqfyu thead, #rwvnyhqfyu tbody, #rwvnyhqfyu tfoot, #rwvnyhqfyu tr, #rwvnyhqfyu td, #rwvnyhqfyu th {
+#fqvyyzfmai thead, #fqvyyzfmai tbody, #fqvyyzfmai tfoot, #fqvyyzfmai tr, #fqvyyzfmai td, #fqvyyzfmai th {
   border-style: none;
 }
 
-#rwvnyhqfyu p {
+#fqvyyzfmai p {
   margin: 0;
   padding: 0;
 }
 
-#rwvnyhqfyu .gt_table {
+#fqvyyzfmai .gt_table {
   display: table;
   border-collapse: collapse;
   line-height: normal;
@@ -2797,12 +2798,12 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-left-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_caption {
+#fqvyyzfmai .gt_caption {
   padding-top: 4px;
   padding-bottom: 4px;
 }
 
-#rwvnyhqfyu .gt_title {
+#fqvyyzfmai .gt_title {
   color: #333333;
   font-size: 125%;
   font-weight: initial;
@@ -2814,7 +2815,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-bottom-width: 0;
 }
 
-#rwvnyhqfyu .gt_subtitle {
+#fqvyyzfmai .gt_subtitle {
   color: #333333;
   font-size: 85%;
   font-weight: initial;
@@ -2826,7 +2827,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-top-width: 0;
 }
 
-#rwvnyhqfyu .gt_heading {
+#fqvyyzfmai .gt_heading {
   background-color: #FFFFFF;
   text-align: center;
   border-bottom-color: #FFFFFF;
@@ -2838,13 +2839,13 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-right-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_bottom_border {
+#fqvyyzfmai .gt_bottom_border {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_col_headings {
+#fqvyyzfmai .gt_col_headings {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -2859,7 +2860,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-right-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_col_heading {
+#fqvyyzfmai .gt_col_heading {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2879,7 +2880,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   overflow-x: hidden;
 }
 
-#rwvnyhqfyu .gt_column_spanner_outer {
+#fqvyyzfmai .gt_column_spanner_outer {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2891,15 +2892,15 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 4px;
 }
 
-#rwvnyhqfyu .gt_column_spanner_outer:first-child {
+#fqvyyzfmai .gt_column_spanner_outer:first-child {
   padding-left: 0;
 }
 
-#rwvnyhqfyu .gt_column_spanner_outer:last-child {
+#fqvyyzfmai .gt_column_spanner_outer:last-child {
   padding-right: 0;
 }
 
-#rwvnyhqfyu .gt_column_spanner {
+#fqvyyzfmai .gt_column_spanner {
   border-bottom-style: solid;
   border-bottom-width: 2px;
   border-bottom-color: #D3D3D3;
@@ -2911,11 +2912,11 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   width: 100%;
 }
 
-#rwvnyhqfyu .gt_spanner_row {
+#fqvyyzfmai .gt_spanner_row {
   border-bottom-style: hidden;
 }
 
-#rwvnyhqfyu .gt_group_heading {
+#fqvyyzfmai .gt_group_heading {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2941,7 +2942,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   text-align: left;
 }
 
-#rwvnyhqfyu .gt_empty_group_heading {
+#fqvyyzfmai .gt_empty_group_heading {
   padding: 0.5px;
   color: #333333;
   background-color: #FFFFFF;
@@ -2956,15 +2957,15 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   vertical-align: middle;
 }
 
-#rwvnyhqfyu .gt_from_md > :first-child {
+#fqvyyzfmai .gt_from_md > :first-child {
   margin-top: 0;
 }
 
-#rwvnyhqfyu .gt_from_md > :last-child {
+#fqvyyzfmai .gt_from_md > :last-child {
   margin-bottom: 0;
 }
 
-#rwvnyhqfyu .gt_row {
+#fqvyyzfmai .gt_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -2983,7 +2984,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   overflow-x: hidden;
 }
 
-#rwvnyhqfyu .gt_stub {
+#fqvyyzfmai .gt_stub {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -2996,7 +2997,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 5px;
 }
 
-#rwvnyhqfyu .gt_stub_row_group {
+#fqvyyzfmai .gt_stub_row_group {
   color: #333333;
   background-color: #FFFFFF;
   font-size: 100%;
@@ -3010,15 +3011,15 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   vertical-align: top;
 }
 
-#rwvnyhqfyu .gt_row_group_first td {
+#fqvyyzfmai .gt_row_group_first td {
   border-top-width: 2px;
 }
 
-#rwvnyhqfyu .gt_row_group_first th {
+#fqvyyzfmai .gt_row_group_first th {
   border-top-width: 2px;
 }
 
-#rwvnyhqfyu .gt_summary_row {
+#fqvyyzfmai .gt_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3028,16 +3029,16 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 5px;
 }
 
-#rwvnyhqfyu .gt_first_summary_row {
+#fqvyyzfmai .gt_first_summary_row {
   border-top-style: solid;
   border-top-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_first_summary_row.thick {
+#fqvyyzfmai .gt_first_summary_row.thick {
   border-top-width: 2px;
 }
 
-#rwvnyhqfyu .gt_last_summary_row {
+#fqvyyzfmai .gt_last_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3047,7 +3048,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-bottom-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_grand_summary_row {
+#fqvyyzfmai .gt_grand_summary_row {
   color: #333333;
   background-color: #FFFFFF;
   text-transform: inherit;
@@ -3057,7 +3058,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 5px;
 }
 
-#rwvnyhqfyu .gt_first_grand_summary_row {
+#fqvyyzfmai .gt_first_grand_summary_row {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3067,7 +3068,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-top-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_last_grand_summary_row_top {
+#fqvyyzfmai .gt_last_grand_summary_row_top {
   padding-top: 8px;
   padding-bottom: 8px;
   padding-left: 5px;
@@ -3077,11 +3078,11 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-bottom-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_striped {
+#fqvyyzfmai .gt_striped {
   background-color: rgba(128, 128, 128, 0.05);
 }
 
-#rwvnyhqfyu .gt_table_body {
+#fqvyyzfmai .gt_table_body {
   border-top-style: solid;
   border-top-width: 2px;
   border-top-color: #D3D3D3;
@@ -3090,7 +3091,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-bottom-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_footnotes {
+#fqvyyzfmai .gt_footnotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3104,7 +3105,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-right-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_footnote {
+#fqvyyzfmai .gt_footnote {
   margin: 0px;
   font-size: 90%;
   padding-top: 4px;
@@ -3113,7 +3114,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 5px;
 }
 
-#rwvnyhqfyu .gt_sourcenotes {
+#fqvyyzfmai .gt_sourcenotes {
   color: #333333;
   background-color: #FFFFFF;
   border-bottom-style: none;
@@ -3127,7 +3128,7 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   border-right-color: #D3D3D3;
 }
 
-#rwvnyhqfyu .gt_sourcenote {
+#fqvyyzfmai .gt_sourcenote {
   font-size: 90%;
   padding-top: 4px;
   padding-bottom: 4px;
@@ -3135,63 +3136,63 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
   padding-right: 5px;
 }
 
-#rwvnyhqfyu .gt_left {
+#fqvyyzfmai .gt_left {
   text-align: left;
 }
 
-#rwvnyhqfyu .gt_center {
+#fqvyyzfmai .gt_center {
   text-align: center;
 }
 
-#rwvnyhqfyu .gt_right {
+#fqvyyzfmai .gt_right {
   text-align: right;
   font-variant-numeric: tabular-nums;
 }
 
-#rwvnyhqfyu .gt_font_normal {
+#fqvyyzfmai .gt_font_normal {
   font-weight: normal;
 }
 
-#rwvnyhqfyu .gt_font_bold {
+#fqvyyzfmai .gt_font_bold {
   font-weight: bold;
 }
 
-#rwvnyhqfyu .gt_font_italic {
+#fqvyyzfmai .gt_font_italic {
   font-style: italic;
 }
 
-#rwvnyhqfyu .gt_super {
+#fqvyyzfmai .gt_super {
   font-size: 65%;
 }
 
-#rwvnyhqfyu .gt_footnote_marks {
+#fqvyyzfmai .gt_footnote_marks {
   font-size: 75%;
   vertical-align: 0.4em;
   position: initial;
 }
 
-#rwvnyhqfyu .gt_asterisk {
+#fqvyyzfmai .gt_asterisk {
   font-size: 100%;
   vertical-align: 0;
 }
 
-#rwvnyhqfyu .gt_indent_1 {
+#fqvyyzfmai .gt_indent_1 {
   text-indent: 5px;
 }
 
-#rwvnyhqfyu .gt_indent_2 {
+#fqvyyzfmai .gt_indent_2 {
   text-indent: 10px;
 }
 
-#rwvnyhqfyu .gt_indent_3 {
+#fqvyyzfmai .gt_indent_3 {
   text-indent: 15px;
 }
 
-#rwvnyhqfyu .gt_indent_4 {
+#fqvyyzfmai .gt_indent_4 {
   text-indent: 20px;
 }
 
-#rwvnyhqfyu .gt_indent_5 {
+#fqvyyzfmai .gt_indent_5 {
   text-indent: 25px;
 }
 </style>
@@ -3324,40 +3325,40 @@ <h2>14 - AmericasBarometer Vignette<a href="exercise-solutions.html#americasbaro
 <li>Create a faceted map showing both broadband internet and any internet usage.</li>
 </ol>
 <p>Answer:</p>
-<div class="sourceCode" id="cb609"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb609-1"><a href="exercise-solutions.html#cb609-1" tabindex="-1"></a><span class="fu">library</span>(sf)</span>
-<span id="cb609-2"><a href="exercise-solutions.html#cb609-2" tabindex="-1"></a><span class="fu">library</span>(rnaturalearth)</span>
-<span id="cb609-3"><a href="exercise-solutions.html#cb609-3" tabindex="-1"></a><span class="fu">library</span>(ggpattern)</span>
-<span id="cb609-4"><a href="exercise-solutions.html#cb609-4" tabindex="-1"></a>internet_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
-<span id="cb609-5"><a href="exercise-solutions.html#cb609-5" tabindex="-1"></a>  <span class="fu">full_join</span>(<span class="fu">select</span>(int_ests, <span class="at">p =</span> p_internet, <span class="at">geounit =</span> Country), <span class="at">by =</span> <span class="st">&quot;geounit&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb609-6"><a href="exercise-solutions.html#cb609-6" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="st">&quot;Internet&quot;</span>)</span>
-<span id="cb609-7"><a href="exercise-solutions.html#cb609-7" tabindex="-1"></a>broadband_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
-<span id="cb609-8"><a href="exercise-solutions.html#cb609-8" tabindex="-1"></a>  <span class="fu">full_join</span>(<span class="fu">select</span>(int_ests, <span class="at">p =</span> p_broadband, <span class="at">geounit =</span> Country), <span class="at">by =</span> <span class="st">&quot;geounit&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb609-9"><a href="exercise-solutions.html#cb609-9" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="st">&quot;Broadband&quot;</span>)</span>
-<span id="cb609-10"><a href="exercise-solutions.html#cb609-10" tabindex="-1"></a>b_int_sf <span class="ot">&lt;-</span> internet_sf <span class="sc">%&gt;%</span></span>
-<span id="cb609-11"><a href="exercise-solutions.html#cb609-11" tabindex="-1"></a>  <span class="fu">bind_rows</span>(broadband_sf) <span class="sc">%&gt;%</span></span>
-<span id="cb609-12"><a href="exercise-solutions.html#cb609-12" tabindex="-1"></a>  <span class="fu">filter</span>(region_wb <span class="sc">==</span> <span class="st">&quot;Latin America &amp; Caribbean&quot;</span>)</span>
-<span id="cb609-13"><a href="exercise-solutions.html#cb609-13" tabindex="-1"></a></span>
-<span id="cb609-14"><a href="exercise-solutions.html#cb609-14" tabindex="-1"></a>b_int_sf <span class="sc">%&gt;%</span></span>
-<span id="cb609-15"><a href="exercise-solutions.html#cb609-15" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">fill =</span> p),</span>
-<span id="cb609-16"><a href="exercise-solutions.html#cb609-16" tabindex="-1"></a>         <span class="at">color=</span><span class="st">&quot;darkgray&quot;</span>) <span class="sc">+</span></span>
-<span id="cb609-17"><a href="exercise-solutions.html#cb609-17" tabindex="-1"></a>  <span class="fu">geom_sf</span>() <span class="sc">+</span></span>
-<span id="cb609-18"><a href="exercise-solutions.html#cb609-18" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> Type) <span class="sc">+</span></span>
-<span id="cb609-19"><a href="exercise-solutions.html#cb609-19" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
-<span id="cb609-20"><a href="exercise-solutions.html#cb609-20" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
-<span id="cb609-21"><a href="exercise-solutions.html#cb609-21" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
-<span id="cb609-22"><a href="exercise-solutions.html#cb609-22" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
-<span id="cb609-23"><a href="exercise-solutions.html#cb609-23" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087E8B&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
-<span id="cb609-24"><a href="exercise-solutions.html#cb609-24" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
-<span id="cb609-25"><a href="exercise-solutions.html#cb609-25" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb609-26"><a href="exercise-solutions.html#cb609-26" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
-<span id="cb609-27"><a href="exercise-solutions.html#cb609-27" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(b_int_sf, <span class="fu">is.na</span>(p)),</span>
-<span id="cb609-28"><a href="exercise-solutions.html#cb609-28" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
-<span id="cb609-29"><a href="exercise-solutions.html#cb609-29" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb609-30"><a href="exercise-solutions.html#cb609-30" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
-<span id="cb609-31"><a href="exercise-solutions.html#cb609-31" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
-<span id="cb609-32"><a href="exercise-solutions.html#cb609-32" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
-<span id="cb609-33"><a href="exercise-solutions.html#cb609-33" tabindex="-1"></a>  ) <span class="sc">+</span></span>
-<span id="cb609-34"><a href="exercise-solutions.html#cb609-34" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb607"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb607-1"><a href="exercise-solutions.html#cb607-1" tabindex="-1"></a><span class="fu">library</span>(sf)</span>
+<span id="cb607-2"><a href="exercise-solutions.html#cb607-2" tabindex="-1"></a><span class="fu">library</span>(rnaturalearth)</span>
+<span id="cb607-3"><a href="exercise-solutions.html#cb607-3" tabindex="-1"></a><span class="fu">library</span>(ggpattern)</span>
+<span id="cb607-4"><a href="exercise-solutions.html#cb607-4" tabindex="-1"></a>internet_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
+<span id="cb607-5"><a href="exercise-solutions.html#cb607-5" tabindex="-1"></a>  <span class="fu">full_join</span>(<span class="fu">select</span>(int_ests, <span class="at">p =</span> p_internet, <span class="at">geounit =</span> Country), <span class="at">by =</span> <span class="st">&quot;geounit&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb607-6"><a href="exercise-solutions.html#cb607-6" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="st">&quot;Internet&quot;</span>)</span>
+<span id="cb607-7"><a href="exercise-solutions.html#cb607-7" tabindex="-1"></a>broadband_sf <span class="ot">&lt;-</span> country_shape_upd <span class="sc">%&gt;%</span></span>
+<span id="cb607-8"><a href="exercise-solutions.html#cb607-8" tabindex="-1"></a>  <span class="fu">full_join</span>(<span class="fu">select</span>(int_ests, <span class="at">p =</span> p_broadband, <span class="at">geounit =</span> Country), <span class="at">by =</span> <span class="st">&quot;geounit&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb607-9"><a href="exercise-solutions.html#cb607-9" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">Type =</span> <span class="st">&quot;Broadband&quot;</span>)</span>
+<span id="cb607-10"><a href="exercise-solutions.html#cb607-10" tabindex="-1"></a>b_int_sf <span class="ot">&lt;-</span> internet_sf <span class="sc">%&gt;%</span></span>
+<span id="cb607-11"><a href="exercise-solutions.html#cb607-11" tabindex="-1"></a>  <span class="fu">bind_rows</span>(broadband_sf) <span class="sc">%&gt;%</span></span>
+<span id="cb607-12"><a href="exercise-solutions.html#cb607-12" tabindex="-1"></a>  <span class="fu">filter</span>(region_wb <span class="sc">==</span> <span class="st">&quot;Latin America &amp; Caribbean&quot;</span>)</span>
+<span id="cb607-13"><a href="exercise-solutions.html#cb607-13" tabindex="-1"></a></span>
+<span id="cb607-14"><a href="exercise-solutions.html#cb607-14" tabindex="-1"></a>b_int_sf <span class="sc">%&gt;%</span></span>
+<span id="cb607-15"><a href="exercise-solutions.html#cb607-15" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">fill =</span> p),</span>
+<span id="cb607-16"><a href="exercise-solutions.html#cb607-16" tabindex="-1"></a>         <span class="at">color=</span><span class="st">&quot;darkgray&quot;</span>) <span class="sc">+</span></span>
+<span id="cb607-17"><a href="exercise-solutions.html#cb607-17" tabindex="-1"></a>  <span class="fu">geom_sf</span>() <span class="sc">+</span></span>
+<span id="cb607-18"><a href="exercise-solutions.html#cb607-18" tabindex="-1"></a>  <span class="fu">facet_wrap</span>( <span class="sc">~</span> Type) <span class="sc">+</span></span>
+<span id="cb607-19"><a href="exercise-solutions.html#cb607-19" tabindex="-1"></a>  <span class="fu">scale_fill_gradientn</span>(</span>
+<span id="cb607-20"><a href="exercise-solutions.html#cb607-20" tabindex="-1"></a>    <span class="at">guide =</span> <span class="st">&quot;colorbar&quot;</span>,</span>
+<span id="cb607-21"><a href="exercise-solutions.html#cb607-21" tabindex="-1"></a>    <span class="at">name =</span> <span class="st">&quot;Percent&quot;</span>,</span>
+<span id="cb607-22"><a href="exercise-solutions.html#cb607-22" tabindex="-1"></a>    <span class="at">labels =</span> scales<span class="sc">::</span>comma,</span>
+<span id="cb607-23"><a href="exercise-solutions.html#cb607-23" tabindex="-1"></a>    <span class="at">colors =</span> <span class="fu">c</span>(<span class="st">&quot;#BFD7EA&quot;</span>, <span class="st">&quot;#087E8B&quot;</span>, <span class="st">&quot;#0B3954&quot;</span>),</span>
+<span id="cb607-24"><a href="exercise-solutions.html#cb607-24" tabindex="-1"></a>    <span class="at">na.value =</span> <span class="cn">NA</span></span>
+<span id="cb607-25"><a href="exercise-solutions.html#cb607-25" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb607-26"><a href="exercise-solutions.html#cb607-26" tabindex="-1"></a>  <span class="fu">geom_sf_pattern</span>(</span>
+<span id="cb607-27"><a href="exercise-solutions.html#cb607-27" tabindex="-1"></a>    <span class="at">data =</span> <span class="fu">filter</span>(b_int_sf, <span class="fu">is.na</span>(p)),</span>
+<span id="cb607-28"><a href="exercise-solutions.html#cb607-28" tabindex="-1"></a>    <span class="at">pattern =</span> <span class="st">&quot;crosshatch&quot;</span>,</span>
+<span id="cb607-29"><a href="exercise-solutions.html#cb607-29" tabindex="-1"></a>    <span class="at">pattern_fill =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb607-30"><a href="exercise-solutions.html#cb607-30" tabindex="-1"></a>    <span class="at">pattern_color =</span> <span class="st">&quot;lightgray&quot;</span>,</span>
+<span id="cb607-31"><a href="exercise-solutions.html#cb607-31" tabindex="-1"></a>    <span class="at">fill =</span> <span class="cn">NA</span>,</span>
+<span id="cb607-32"><a href="exercise-solutions.html#cb607-32" tabindex="-1"></a>    <span class="at">color =</span> <span class="st">&quot;darkgray&quot;</span></span>
+<span id="cb607-33"><a href="exercise-solutions.html#cb607-33" tabindex="-1"></a>  ) <span class="sc">+</span></span>
+<span id="cb607-34"><a href="exercise-solutions.html#cb607-34" tabindex="-1"></a>  <span class="fu">theme_minimal</span>()</span></code></pre></div>
 <div class="figure"><span style="display:block;" id="fig:ambarom-ex-solution2"></span>
 <img src="bookdown_files/figure-html/ambarom-ex-solution2-1.png" alt="Percent of broadband internet and any internet usage, Central and South America" width="672" />
 <p class="caption">
@@ -3375,13 +3376,12 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-nhis-svy-des" class="csl-entry">
 National Center for Health Statistics. 2023. <span>“<span class="nocase">National Health Interview Survey, 2022 survey description</span>.”</span> <a href="https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf" class="uri">https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf</a>.
 </div>
+<div id="ref-npr-voting-trend" class="csl-entry">
+Sprunt, Barbara. 2020. <span>“93 Million and Counting: Americans Are Shattering Early Voting Records.”</span> <em>National Public Radio</em>.
+</div>
+<div id="ref-eia-cdd" class="csl-entry">
+———. 2023d. <span>“Units and Calculators Explained: Degree Days.”</span> <a href="https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php" class="uri">https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php</a>.
 </div>
-<div class="footnotes">
-<hr />
-<ol start="29">
-<li id="fn29"><p><a href="https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php" class="uri">https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php</a><a href="exercise-solutions.html#fnref29" class="footnote-back">↩︎</a></p></li>
-<li id="fn30"><p><a href="https://www.npr.org/2020/10/26/927803214/62-million-and-counting-americans-are-breaking-early-voting-records" class="uri">https://www.npr.org/2020/10/26/927803214/62-million-and-counting-americans-are-breaking-early-voting-records</a><a href="exercise-solutions.html#fnref30" class="footnote-back">↩︎</a></p></li>
-</ol>
 </div>
             </section>
 
diff --git a/importing-survey-data-into-r.html b/importing-survey-data-into-r.html
index a0ef5ab5..c1de1d50 100644
--- a/importing-survey-data-into-r.html
+++ b/importing-survey-data-into-r.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -567,14 +567,14 @@ <h2><span class="header-section-number">A.1</span> Importing delimiter-separated
 <p>The arguments are:</p>
 <ul>
 <li><code>file</code>: the path to the CSV file to import</li>
-<li><code>col_names</code>: a value of <code>TRUE</code> will import the first row of the <code>file</code> as column names and not included in the data frame. A value of <code>FALSE</code> will create automated column names. Alternatively, we can provide a vector of column names.</li>
-<li><code>col_types</code>: by default, R will infer the column variable types. We can also provide a column specification using <code>list()</code> or <code>cols()</code>; for example, use <code>col_types = cols(.default = "c")</code> to read all the columns as characters. Alternatively, we can use a string to specify the variable types for each column.</li>
+<li><code>col_names</code>: a value of <code>TRUE</code> imports the first row of the <code>file</code> as column names and excludes it from the data frame. A value of <code>FALSE</code> creates automated column names. Alternatively, we can provide a vector of column names.</li>
+<li><code>col_types</code>: by default, R infers the column variable types. We can also provide a column specification using <code>list()</code> or <code>cols()</code>; for example, use <code>col_types = cols(.default = "c")</code> to read all the columns as characters. Alternatively, we can use a string to specify the variable types for each column.</li>
 <li><code>col_select</code>: the columns to include in the results</li>
 <li><code>id</code>: a column for storing the file path. This is useful for keeping track of the input file when importing multiple CSVs at a time.</li>
 <li><code>locale</code>: the location-specific defaults for the file</li>
 <li><code>na</code>: a character vector of values to interpret as missing</li>
 <li><code>comment</code>: a character vector of values to interpret as comments</li>
-<li><code>trim_ws</code>: a value of <code>TRUE</code> will trim leading and trailing white space</li>
+<li><code>trim_ws</code>: a value of <code>TRUE</code> trims leading and trailing white space</li>
 <li><code>skip</code>: number of lines to skip before importing the data</li>
 <li><code>n_max</code>: maximum number of lines to read</li>
 <li><code>guess_max</code>: maximum number of lines used for guessing column types</li>
@@ -582,20 +582,20 @@ <h2><span class="header-section-number">A.1</span> Importing delimiter-separated
 <li><code>num_threads</code>: the number of processing threads to use for initial parsing and lazy reading of data</li>
 <li><code>progress</code>: a value of <code>TRUE</code> displays a progress bar</li>
 <li><code>show_col_types</code>: a value of <code>TRUE</code> displays the column types</li>
-<li><code>skip_empty_rows</code>: a value of <code>TRUE</code> will ignore blank rows</li>
-<li><code>lazy</code>: a value of <code>TRUE</code> will read values lazily</li>
+<li><code>skip_empty_rows</code>: a value of <code>TRUE</code> ignores blank rows</li>
+<li><code>lazy</code>: a value of <code>TRUE</code> reads values lazily</li>
 </ul>
 <p>The other functions share a similar syntax to <code>read_csv()</code>. To find more details, run <code>??</code> followed by the function name. For example, run <code>??read_delim</code> in the Console for additional information.</p>
 <p>In the example below, we use {readr} to load a CSV file named ‘anes_timeseries_2020_csv_20220210.csv’ into an R object called <code>anes_csv</code>. The <code>read_csv()</code> imports the file and stores the data in the <code>anes_csv</code> object. We can then use this object for further analysis.</p>
-<div class="sourceCode" id="cb509"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb509-1"><a href="importing-survey-data-into-r.html#cb509-1" tabindex="-1"></a><span class="fu">library</span>(readr)</span>
-<span id="cb509-2"><a href="importing-survey-data-into-r.html#cb509-2" tabindex="-1"></a></span>
-<span id="cb509-3"><a href="importing-survey-data-into-r.html#cb509-3" tabindex="-1"></a>anes_csv <span class="ot">&lt;-</span></span>
-<span id="cb509-4"><a href="importing-survey-data-into-r.html#cb509-4" tabindex="-1"></a>  <span class="fu">read_csv</span>(<span class="st">&quot;data/anes_timeseries_2020_csv_20220210.csv&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb503"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb503-1"><a href="importing-survey-data-into-r.html#cb503-1" tabindex="-1"></a><span class="fu">library</span>(readr)</span>
+<span id="cb503-2"><a href="importing-survey-data-into-r.html#cb503-2" tabindex="-1"></a></span>
+<span id="cb503-3"><a href="importing-survey-data-into-r.html#cb503-3" tabindex="-1"></a>anes_csv <span class="ot">&lt;-</span></span>
+<span id="cb503-4"><a href="importing-survey-data-into-r.html#cb503-4" tabindex="-1"></a>  <span class="fu">read_csv</span>(<span class="st">&quot;data/anes_timeseries_2020_csv_20220210.csv&quot;</span>)</span></code></pre></div>
 </div>
 <div id="loading-excel-files-into-r" class="section level2 hasAnchor" number="15.2">
 <h2><span class="header-section-number">A.2</span> Loading Excel files into R<a href="importing-survey-data-into-r.html#loading-excel-files-into-r" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Excel, a widely used spreadsheet software program created by Microsoft, is a common file format in survey research. We can load Excel spreadsheets into the R environment using the {readxl} package. The package supports both the legacy <code>.xls</code> files and the modern <code>.xlsx</code> format.</p>
-<p>To load Excel data into R, we can use the <code>read_excel()</code> function from the {readxl} package. This function offers a range of customizable options for the import process. Let’s explore the syntax:</p>
+<p>To load Excel data into R, we can use the <code>read_excel()</code> function from the {readxl} package. This function offers a range of options for the import process. Let’s explore the syntax:</p>
 <pre><code>read_excel(
   path,
   sheet = NULL,
@@ -625,10 +625,10 @@ <h2><span class="header-section-number">A.2</span> Loading Excel files into R<a
 <li><code>.name_repair</code>: determines how column names are repaired if they are not valid</li>
 </ul>
 <p>In the code example below, we load an Excel spreadsheet named ‘anes_timeseries_2020_csv_20220210.xlsx’ into R. The resulting data is saved as a tibble in the <code>anes_excel</code> object, ready for further analysis.</p>
-<div class="sourceCode" id="cb511"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb511-1"><a href="importing-survey-data-into-r.html#cb511-1" tabindex="-1"></a><span class="fu">library</span>(readxl)</span>
-<span id="cb511-2"><a href="importing-survey-data-into-r.html#cb511-2" tabindex="-1"></a></span>
-<span id="cb511-3"><a href="importing-survey-data-into-r.html#cb511-3" tabindex="-1"></a>anes_excel <span class="ot">&lt;-</span></span>
-<span id="cb511-4"><a href="importing-survey-data-into-r.html#cb511-4" tabindex="-1"></a>  <span class="fu">read_excel</span>(<span class="at">path =</span> <span class="st">&quot;data/anes_timeseries_2020_csv_20220210.xlsx&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb505"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb505-1"><a href="importing-survey-data-into-r.html#cb505-1" tabindex="-1"></a><span class="fu">library</span>(readxl)</span>
+<span id="cb505-2"><a href="importing-survey-data-into-r.html#cb505-2" tabindex="-1"></a></span>
+<span id="cb505-3"><a href="importing-survey-data-into-r.html#cb505-3" tabindex="-1"></a>anes_excel <span class="ot">&lt;-</span></span>
+<span id="cb505-4"><a href="importing-survey-data-into-r.html#cb505-4" tabindex="-1"></a>  <span class="fu">read_excel</span>(<span class="at">path =</span> <span class="st">&quot;data/anes_timeseries_2020_csv_20220210.xlsx&quot;</span>)</span></code></pre></div>
 </div>
 <div id="importing-stata-sas-and-spss-files-into-r" class="section level2 hasAnchor" number="15.3">
 <h2><span class="header-section-number">A.3</span> Importing Stata, SAS, and SPSS files into R<a href="importing-survey-data-into-r.html#importing-stata-sas-and-spss-files-into-r" class="anchor-section" aria-label="Anchor link to header"></a></h2>
@@ -636,14 +636,14 @@ <h2><span class="header-section-number">A.3</span> Importing Stata, SAS, and SPS
 <div id="syntax-9" class="section level3 hasAnchor" number="15.3.1">
 <h3><span class="header-section-number">A.3.1</span> Syntax<a href="importing-survey-data-into-r.html#syntax-9" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Let’s explore the syntax for importing Stata <code>.dta</code> files using <code>haven::read_dta()</code>:</p>
-<div class="sourceCode" id="cb512"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb512-1"><a href="importing-survey-data-into-r.html#cb512-1" tabindex="-1"></a><span class="fu">read_dta</span>(</span>
-<span id="cb512-2"><a href="importing-survey-data-into-r.html#cb512-2" tabindex="-1"></a>  file,</span>
-<span id="cb512-3"><a href="importing-survey-data-into-r.html#cb512-3" tabindex="-1"></a>  <span class="at">encoding =</span> <span class="cn">NULL</span>,</span>
-<span id="cb512-4"><a href="importing-survey-data-into-r.html#cb512-4" tabindex="-1"></a>  <span class="at">col_select =</span> <span class="cn">NULL</span>,</span>
-<span id="cb512-5"><a href="importing-survey-data-into-r.html#cb512-5" tabindex="-1"></a>  <span class="at">skip =</span> <span class="dv">0</span>,</span>
-<span id="cb512-6"><a href="importing-survey-data-into-r.html#cb512-6" tabindex="-1"></a>  <span class="at">n_max =</span> <span class="cn">Inf</span>,</span>
-<span id="cb512-7"><a href="importing-survey-data-into-r.html#cb512-7" tabindex="-1"></a>  <span class="at">.name_repair =</span> <span class="st">&quot;unique&quot;</span></span>
-<span id="cb512-8"><a href="importing-survey-data-into-r.html#cb512-8" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb506"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb506-1"><a href="importing-survey-data-into-r.html#cb506-1" tabindex="-1"></a><span class="fu">read_dta</span>(</span>
+<span id="cb506-2"><a href="importing-survey-data-into-r.html#cb506-2" tabindex="-1"></a>  file,</span>
+<span id="cb506-3"><a href="importing-survey-data-into-r.html#cb506-3" tabindex="-1"></a>  <span class="at">encoding =</span> <span class="cn">NULL</span>,</span>
+<span id="cb506-4"><a href="importing-survey-data-into-r.html#cb506-4" tabindex="-1"></a>  <span class="at">col_select =</span> <span class="cn">NULL</span>,</span>
+<span id="cb506-5"><a href="importing-survey-data-into-r.html#cb506-5" tabindex="-1"></a>  <span class="at">skip =</span> <span class="dv">0</span>,</span>
+<span id="cb506-6"><a href="importing-survey-data-into-r.html#cb506-6" tabindex="-1"></a>  <span class="at">n_max =</span> <span class="cn">Inf</span>,</span>
+<span id="cb506-7"><a href="importing-survey-data-into-r.html#cb506-7" tabindex="-1"></a>  <span class="at">.name_repair =</span> <span class="st">&quot;unique&quot;</span></span>
+<span id="cb506-8"><a href="importing-survey-data-into-r.html#cb506-8" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>file</code>: the path to the proprietary data file to import</li>
@@ -667,21 +667,21 @@ <h3><span class="header-section-number">A.3.1</span> Syntax<a href="importing-su
 <li><code>file</code>: the path to the proprietary data file to import</li>
 <li><code>encoding</code>: specifies the character encoding of the data file</li>
 <li><code>col_select</code>: select specific columns for import</li>
-<li><code>user_na</code>: a value of <code>TRUE</code> will read variables with user defined missing labels will be read into <code>labelled_spss()</code> objects</li>
+<li><code>user_na</code>: a value of <code>TRUE</code> reads variables with user-defined missing labels into <code>labelled_spss()</code> objects</li>
 <li><code>skip</code> and <code>n_max</code>: control the number of rows skipped and the maximum number of rows imported</li>
 <li><code>.name_repair</code>: determines how column names are repaired if they are not valid</li>
 </ul>
 <p>The syntax for importing SAS files with <code>read_sas()</code> is as follows:</p>
-<div class="sourceCode" id="cb514"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb514-1"><a href="importing-survey-data-into-r.html#cb514-1" tabindex="-1"></a><span class="fu">read_sas</span>(</span>
-<span id="cb514-2"><a href="importing-survey-data-into-r.html#cb514-2" tabindex="-1"></a>  data_file,</span>
-<span id="cb514-3"><a href="importing-survey-data-into-r.html#cb514-3" tabindex="-1"></a>  <span class="at">catalog_file =</span> <span class="cn">NULL</span>,</span>
-<span id="cb514-4"><a href="importing-survey-data-into-r.html#cb514-4" tabindex="-1"></a>  <span class="at">encoding =</span> <span class="cn">NULL</span>,</span>
-<span id="cb514-5"><a href="importing-survey-data-into-r.html#cb514-5" tabindex="-1"></a>  <span class="at">catalog_encoding =</span> encoding,</span>
-<span id="cb514-6"><a href="importing-survey-data-into-r.html#cb514-6" tabindex="-1"></a>  <span class="at">col_select =</span> <span class="cn">NULL</span>,</span>
-<span id="cb514-7"><a href="importing-survey-data-into-r.html#cb514-7" tabindex="-1"></a>  <span class="at">skip =</span> <span class="dv">0</span><span class="dt">L</span>,</span>
-<span id="cb514-8"><a href="importing-survey-data-into-r.html#cb514-8" tabindex="-1"></a>  <span class="at">n_max =</span> <span class="cn">Inf</span>,</span>
-<span id="cb514-9"><a href="importing-survey-data-into-r.html#cb514-9" tabindex="-1"></a>  <span class="at">.name_repair =</span> <span class="st">&quot;unique&quot;</span></span>
-<span id="cb514-10"><a href="importing-survey-data-into-r.html#cb514-10" tabindex="-1"></a>)</span></code></pre></div>
+<div class="sourceCode" id="cb508"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb508-1"><a href="importing-survey-data-into-r.html#cb508-1" tabindex="-1"></a><span class="fu">read_sas</span>(</span>
+<span id="cb508-2"><a href="importing-survey-data-into-r.html#cb508-2" tabindex="-1"></a>  data_file,</span>
+<span id="cb508-3"><a href="importing-survey-data-into-r.html#cb508-3" tabindex="-1"></a>  <span class="at">catalog_file =</span> <span class="cn">NULL</span>,</span>
+<span id="cb508-4"><a href="importing-survey-data-into-r.html#cb508-4" tabindex="-1"></a>  <span class="at">encoding =</span> <span class="cn">NULL</span>,</span>
+<span id="cb508-5"><a href="importing-survey-data-into-r.html#cb508-5" tabindex="-1"></a>  <span class="at">catalog_encoding =</span> encoding,</span>
+<span id="cb508-6"><a href="importing-survey-data-into-r.html#cb508-6" tabindex="-1"></a>  <span class="at">col_select =</span> <span class="cn">NULL</span>,</span>
+<span id="cb508-7"><a href="importing-survey-data-into-r.html#cb508-7" tabindex="-1"></a>  <span class="at">skip =</span> <span class="dv">0</span><span class="dt">L</span>,</span>
+<span id="cb508-8"><a href="importing-survey-data-into-r.html#cb508-8" tabindex="-1"></a>  <span class="at">n_max =</span> <span class="cn">Inf</span>,</span>
+<span id="cb508-9"><a href="importing-survey-data-into-r.html#cb508-9" tabindex="-1"></a>  <span class="at">.name_repair =</span> <span class="st">&quot;unique&quot;</span></span>
+<span id="cb508-10"><a href="importing-survey-data-into-r.html#cb508-10" tabindex="-1"></a>)</span></code></pre></div>
 <p>The arguments are:</p>
 <ul>
 <li><code>data_file</code>: the path to the proprietary data file to import</li>
@@ -692,34 +692,36 @@ <h3><span class="header-section-number">A.3.1</span> Syntax<a href="importing-su
 <li><code>skip</code> and <code>n_max</code>: control the number of rows skipped and the maximum number of rows imported</li>
 <li><code>.name_repair</code>: determines how column names are repaired if they are not valid</li>
 </ul>
-<p>In the code examples below, we demonstrate how to load Stata, SPSS, and SAS files into R using the respective {haven} functions. The resulting data is stored in <code>anes_dta</code>, <code>anes_sav</code>, and <code>anes_sas</code> objects as tibbles, ready for use in R.</p>
+<p>In the code examples below, we demonstrate how to load Stata, SPSS, and SAS files into R using the respective {haven} functions. The resulting data are stored in <code>anes_dta</code>, <code>anes_sav</code>, and <code>anes_sas</code> objects as tibbles, ready for use in R. For the Stata example, we load the data from the {srvyrexploR} package and use these data in examples later in this appendix.</p>
 <p>Stata:</p>
-<div class="sourceCode" id="cb515"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb515-1"><a href="importing-survey-data-into-r.html#cb515-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
-<span id="cb515-2"><a href="importing-survey-data-into-r.html#cb515-2" tabindex="-1"></a></span>
-<span id="cb515-3"><a href="importing-survey-data-into-r.html#cb515-3" tabindex="-1"></a>anes_dta <span class="ot">&lt;-</span></span>
-<span id="cb515-4"><a href="importing-survey-data-into-r.html#cb515-4" tabindex="-1"></a>  <span class="fu">read_dta</span>(<span class="fu">system.file</span>(<span class="st">&quot;extdata&quot;</span>, <span class="st">&quot;anes_2020_stata_example.dta&quot;</span>, <span class="at">package=</span><span class="st">&quot;srvyrexploR&quot;</span>))</span></code></pre></div>
+<div class="sourceCode" id="cb509"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb509-1"><a href="importing-survey-data-into-r.html#cb509-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
+<span id="cb509-2"><a href="importing-survey-data-into-r.html#cb509-2" tabindex="-1"></a></span>
+<span id="cb509-3"><a href="importing-survey-data-into-r.html#cb509-3" tabindex="-1"></a>anes_dta <span class="ot">&lt;-</span></span>
+<span id="cb509-4"><a href="importing-survey-data-into-r.html#cb509-4" tabindex="-1"></a>  <span class="fu">read_dta</span>(<span class="fu">system.file</span>(<span class="st">&quot;extdata&quot;</span>, </span>
+<span id="cb509-5"><a href="importing-survey-data-into-r.html#cb509-5" tabindex="-1"></a>                       <span class="st">&quot;anes_2020_stata_example.dta&quot;</span>, </span>
+<span id="cb509-6"><a href="importing-survey-data-into-r.html#cb509-6" tabindex="-1"></a>                       <span class="at">package=</span><span class="st">&quot;srvyrexploR&quot;</span>))</span></code></pre></div>
 <p>SPSS:</p>
-<div class="sourceCode" id="cb516"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb516-1"><a href="importing-survey-data-into-r.html#cb516-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
-<span id="cb516-2"><a href="importing-survey-data-into-r.html#cb516-2" tabindex="-1"></a></span>
-<span id="cb516-3"><a href="importing-survey-data-into-r.html#cb516-3" tabindex="-1"></a>anes_sav <span class="ot">&lt;-</span></span>
-<span id="cb516-4"><a href="importing-survey-data-into-r.html#cb516-4" tabindex="-1"></a>  <span class="fu">read_sav</span>(<span class="at">file =</span> <span class="st">&quot;data/anes_timeseries_2020_spss_20220210.sav&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb510"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb510-1"><a href="importing-survey-data-into-r.html#cb510-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
+<span id="cb510-2"><a href="importing-survey-data-into-r.html#cb510-2" tabindex="-1"></a></span>
+<span id="cb510-3"><a href="importing-survey-data-into-r.html#cb510-3" tabindex="-1"></a>anes_sav <span class="ot">&lt;-</span></span>
+<span id="cb510-4"><a href="importing-survey-data-into-r.html#cb510-4" tabindex="-1"></a>  <span class="fu">read_sav</span>(<span class="at">file =</span> <span class="st">&quot;data/anes_timeseries_2020_spss_20220210.sav&quot;</span>)</span></code></pre></div>
 <p>SAS:</p>
-<div class="sourceCode" id="cb517"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb517-1"><a href="importing-survey-data-into-r.html#cb517-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
-<span id="cb517-2"><a href="importing-survey-data-into-r.html#cb517-2" tabindex="-1"></a></span>
-<span id="cb517-3"><a href="importing-survey-data-into-r.html#cb517-3" tabindex="-1"></a>anes_sas <span class="ot">&lt;-</span></span>
-<span id="cb517-4"><a href="importing-survey-data-into-r.html#cb517-4" tabindex="-1"></a>  <span class="fu">read_sas</span>(<span class="at">file =</span> <span class="st">&quot;data/anes_timeseries_2020_sas_20220210.sas7bdat&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb511"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb511-1"><a href="importing-survey-data-into-r.html#cb511-1" tabindex="-1"></a><span class="fu">library</span>(haven)</span>
+<span id="cb511-2"><a href="importing-survey-data-into-r.html#cb511-2" tabindex="-1"></a></span>
+<span id="cb511-3"><a href="importing-survey-data-into-r.html#cb511-3" tabindex="-1"></a>anes_sas <span class="ot">&lt;-</span></span>
+<span id="cb511-4"><a href="importing-survey-data-into-r.html#cb511-4" tabindex="-1"></a>  <span class="fu">read_sas</span>(<span class="at">data_file =</span> <span class="st">&quot;data/anes_timeseries_2020_sas_20220210.sas7bdat&quot;</span>)</span></code></pre></div>
 </div>
 <div id="working-with-labeled-data" class="section level3 hasAnchor" number="15.3.2">
 <h3><span class="header-section-number">A.3.2</span> Working with labeled data<a href="importing-survey-data-into-r.html#working-with-labeled-data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
 <p>Stata, SPSS, and SAS files often contain labeled variables and values. These labels provide descriptive information about categorical data, making it easier to understand and analyze. When importing data from Stata, SPSS, or SAS, preserving these labels is essential for maintaining data fidelity.</p>
-<p>Consider a variable like ‘Education Level’ with coded values (e.g., 1, 2, 3). Without labels, these codes can be cryptic. However, with labels (‘High School Graduate,’ ‘Bachelor’s Degree,’ ‘Master’s Degree’), the data becomes more informative and easier to work with.</p>
+<p>Consider a variable like ‘Education Level’ with coded values (e.g., 1, 2, 3). Without labels, these codes can be cryptic. However, with labels (‘High School Graduate,’ ‘Bachelor’s Degree,’ ‘Master’s Degree’), the data become more informative and easier to work with.</p>
 <p>With the {haven} package, we have the capability to import and work with labeled data from Stata, SPSS, and SAS files. The package uses a special class of data called <code>haven_labelled</code> to store labeled variables. When a dataset label is defined in Stata, it is stored in the ‘label’ attribute of the tibble when imported, ensuring that the information is not lost.</p>
-<p>We can use functions like <code>select()</code>, <code>glimpse()</code>, and <code>is.labelled()</code> to inspect the imported data and verify if variables are labeled. Take a look at the ANES Stata file. Notice that categorical variables are marked with a type of <code>&lt;dbl+lbl&gt;</code>. This notation indicates that these variables are labeled.</p>
-<div class="sourceCode" id="cb518"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb518-1"><a href="importing-survey-data-into-r.html#cb518-1" tabindex="-1"></a><span class="fu">library</span>(dplyr)</span>
-<span id="cb518-2"><a href="importing-survey-data-into-r.html#cb518-2" tabindex="-1"></a></span>
-<span id="cb518-3"><a href="importing-survey-data-into-r.html#cb518-3" tabindex="-1"></a>anes_dta <span class="sc">%&gt;%</span> </span>
-<span id="cb518-4"><a href="importing-survey-data-into-r.html#cb518-4" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb518-5"><a href="importing-survey-data-into-r.html#cb518-5" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
+<p>We can use functions like <code>select()</code>, <code>glimpse()</code>, and <code>is.labelled()</code> to inspect the imported data and verify if the variables are labeled. Take a look at the ANES Stata file. Notice that categorical variables are marked with a type of <code>&lt;dbl+lbl&gt;</code>. This notation indicates that these variables are labeled.</p>
+<div class="sourceCode" id="cb512"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb512-1"><a href="importing-survey-data-into-r.html#cb512-1" tabindex="-1"></a><span class="fu">library</span>(dplyr)</span>
+<span id="cb512-2"><a href="importing-survey-data-into-r.html#cb512-2" tabindex="-1"></a></span>
+<span id="cb512-3"><a href="importing-survey-data-into-r.html#cb512-3" tabindex="-1"></a>anes_dta <span class="sc">%&gt;%</span> </span>
+<span id="cb512-4"><a href="importing-survey-data-into-r.html#cb512-4" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb512-5"><a href="importing-survey-data-into-r.html#cb512-5" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
 <pre><code>## Rows: 7,453
 ## Columns: 6
 ## $ V200001  &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008…
@@ -729,10 +731,10 @@ <h3><span class="header-section-number">A.3.2</span> Working with labeled data<a
 ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,…
 ## $ V201006  &lt;dbl+lbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1…</code></pre>
 <p>We can confirm this label status using the <code>haven::is.labelled()</code> function.</p>
-<div class="sourceCode" id="cb520"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb520-1"><a href="importing-survey-data-into-r.html#cb520-1" tabindex="-1"></a>haven<span class="sc">::</span><span class="fu">is.labelled</span>(anes_dta<span class="sc">$</span>V200002)</span></code></pre></div>
+<div class="sourceCode" id="cb514"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb514-1"><a href="importing-survey-data-into-r.html#cb514-1" tabindex="-1"></a>haven<span class="sc">::</span><span class="fu">is.labelled</span>(anes_dta<span class="sc">$</span>V200002)</span></code></pre></div>
 <pre><code>## [1] TRUE</code></pre>
-<p>To explore the labels further, we can use the <code>attributes()</code> function. This function provides insights into both the variable labels (<code>$label</code>) and the associated value labels (<code>$labels</code>).</p>
-<div class="sourceCode" id="cb522"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb522-1"><a href="importing-survey-data-into-r.html#cb522-1" tabindex="-1"></a><span class="fu">attributes</span>(anes_dta<span class="sc">$</span>V200002)</span></code></pre></div>
+<p>To explore the labels further, we can use the <code>attributes()</code> function. This function provides insights into both the variable labels (<code>$label</code>) and the associated value labels (<code>$labels</code>).</p>
+<div class="sourceCode" id="cb516"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb516-1"><a href="importing-survey-data-into-r.html#cb516-1" tabindex="-1"></a><span class="fu">attributes</span>(anes_dta<span class="sc">$</span>V200002)</span></code></pre></div>
 <pre><code>## $label
 ## [1] &quot;Mode of interview: pre-election interview&quot;
 ## 
@@ -748,24 +750,24 @@ <h3><span class="header-section-number">A.3.2</span> Working with labeled data<a
 <p>When we import a labeled dataset using {haven}, it results in a tibble containing both the data and label information. However, this is meant to be an intermediary data structure and not intended to be the final data format for analysis. Instead, we should convert it into a regular R data frame before continuing our data workflow. There are two primary methods to achieve this conversion: (1) convert to factors or (2) remove the labels.</p>
 <div id="option-1-convert-the-vector-into-a-factor" class="section level4 unnumbered hasAnchor">
 <h4>Option 1: Convert the vector into a factor<a href="importing-survey-data-into-r.html#option-1-convert-the-vector-into-a-factor" class="anchor-section" aria-label="Anchor link to header"></a></h4>
-<p>Factors are native R data types for working with categorical data. They consist of integer values that correspond to character values, known as levels. Below is a dummy example of factors. Printing <code>factors</code> shows the four different levels in the data: <code>strongly agree</code>, <code>agree</code>, <code>disagree</code>, and <code>strongly disagree</code>.</p>
-<div class="sourceCode" id="cb524"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb524-1"><a href="importing-survey-data-into-r.html#cb524-1" tabindex="-1"></a>response <span class="ot">&lt;-</span> </span>
-<span id="cb524-2"><a href="importing-survey-data-into-r.html#cb524-2" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;strongly agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;disagree&quot;</span>)</span>
-<span id="cb524-3"><a href="importing-survey-data-into-r.html#cb524-3" tabindex="-1"></a></span>
-<span id="cb524-4"><a href="importing-survey-data-into-r.html#cb524-4" tabindex="-1"></a>response_levels <span class="ot">&lt;-</span></span>
-<span id="cb524-5"><a href="importing-survey-data-into-r.html#cb524-5" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;strongly agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;disagree&quot;</span>, <span class="st">&quot;strongly disagree&quot;</span>)</span>
-<span id="cb524-6"><a href="importing-survey-data-into-r.html#cb524-6" tabindex="-1"></a></span>
-<span id="cb524-7"><a href="importing-survey-data-into-r.html#cb524-7" tabindex="-1"></a>factors <span class="ot">&lt;-</span> <span class="fu">factor</span>(response, <span class="at">levels =</span> response_levels)</span>
-<span id="cb524-8"><a href="importing-survey-data-into-r.html#cb524-8" tabindex="-1"></a></span>
-<span id="cb524-9"><a href="importing-survey-data-into-r.html#cb524-9" tabindex="-1"></a>factors</span></code></pre></div>
+<p>Factors are native R data types for working with categorical data. They consist of integer values that correspond to character values, known as levels. Below is a dummy example of factors. Printing the <code>factors</code> object shows the four levels defined for the data: <code>strongly agree</code>, <code>agree</code>, <code>disagree</code>, and <code>strongly disagree</code>.</p>
+<div class="sourceCode" id="cb518"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb518-1"><a href="importing-survey-data-into-r.html#cb518-1" tabindex="-1"></a>response <span class="ot">&lt;-</span> </span>
+<span id="cb518-2"><a href="importing-survey-data-into-r.html#cb518-2" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;strongly agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;disagree&quot;</span>)</span>
+<span id="cb518-3"><a href="importing-survey-data-into-r.html#cb518-3" tabindex="-1"></a></span>
+<span id="cb518-4"><a href="importing-survey-data-into-r.html#cb518-4" tabindex="-1"></a>response_levels <span class="ot">&lt;-</span></span>
+<span id="cb518-5"><a href="importing-survey-data-into-r.html#cb518-5" tabindex="-1"></a>  <span class="fu">c</span>(<span class="st">&quot;strongly agree&quot;</span>, <span class="st">&quot;agree&quot;</span>, <span class="st">&quot;disagree&quot;</span>, <span class="st">&quot;strongly disagree&quot;</span>)</span>
+<span id="cb518-6"><a href="importing-survey-data-into-r.html#cb518-6" tabindex="-1"></a></span>
+<span id="cb518-7"><a href="importing-survey-data-into-r.html#cb518-7" tabindex="-1"></a>factors <span class="ot">&lt;-</span> <span class="fu">factor</span>(response, <span class="at">levels =</span> response_levels)</span>
+<span id="cb518-8"><a href="importing-survey-data-into-r.html#cb518-8" tabindex="-1"></a></span>
+<span id="cb518-9"><a href="importing-survey-data-into-r.html#cb518-9" tabindex="-1"></a>factors</span></code></pre></div>
 <pre><code>## [1] strongly agree agree          agree          disagree      
 ## Levels: strongly agree agree disagree strongly disagree</code></pre>
 <p>Factors are integer vectors, though they may look like character strings. We can confirm by looking at the vector’s structure:</p>
-<div class="sourceCode" id="cb526"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb526-1"><a href="importing-survey-data-into-r.html#cb526-1" tabindex="-1"></a><span class="fu">glimpse</span>(factors)</span></code></pre></div>
+<div class="sourceCode" id="cb520"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb520-1"><a href="importing-survey-data-into-r.html#cb520-1" tabindex="-1"></a><span class="fu">glimpse</span>(factors)</span></code></pre></div>
 <pre><code>##  Factor w/ 4 levels &quot;strongly agree&quot;,..: 1 2 2 3</code></pre>
 <p>R’s factors differ from Stata, SPSS, or SAS’ labeled vectors. However, we can convert labeled variables into factors using the <code>as_factor()</code> function.</p>
-<div class="sourceCode" id="cb528"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb528-1"><a href="importing-survey-data-into-r.html#cb528-1" tabindex="-1"></a>anes_dta <span class="sc">%&gt;%</span> </span>
-<span id="cb528-2"><a href="importing-survey-data-into-r.html#cb528-2" tabindex="-1"></a>  <span class="fu">transmute</span>(<span class="at">V200002 =</span> <span class="fu">as_factor</span>(V200002))</span></code></pre></div>
+<div class="sourceCode" id="cb522"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb522-1"><a href="importing-survey-data-into-r.html#cb522-1" tabindex="-1"></a>anes_dta <span class="sc">%&gt;%</span> </span>
+<span id="cb522-2"><a href="importing-survey-data-into-r.html#cb522-2" tabindex="-1"></a>  <span class="fu">transmute</span>(<span class="at">V200002 =</span> <span class="fu">as_factor</span>(V200002))</span></code></pre></div>
 <pre><code>## # A tibble: 7,453 × 1
 ##    V200002
 ##    &lt;fct&gt;  
@@ -781,13 +783,13 @@ <h4>Option 1: Convert the vector into a factor<a href="importing-survey-data-int
 ## 10 3. Web 
 ## # ℹ 7,443 more rows</code></pre>
 <p>The <code>as_factor()</code> function can be applied to all columns in a data frame or individual ones. Below, we convert all <code>&lt;dbl+lbl&gt;</code> columns into factors.</p>
-<div class="sourceCode" id="cb530"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb530-1"><a href="importing-survey-data-into-r.html#cb530-1" tabindex="-1"></a>anes_dta_factor <span class="ot">&lt;-</span></span>
-<span id="cb530-2"><a href="importing-survey-data-into-r.html#cb530-2" tabindex="-1"></a>  anes_dta <span class="sc">%&gt;%</span> </span>
-<span id="cb530-3"><a href="importing-survey-data-into-r.html#cb530-3" tabindex="-1"></a>  <span class="fu">as_factor</span>()</span>
-<span id="cb530-4"><a href="importing-survey-data-into-r.html#cb530-4" tabindex="-1"></a></span>
-<span id="cb530-5"><a href="importing-survey-data-into-r.html#cb530-5" tabindex="-1"></a>anes_dta_factor <span class="sc">%&gt;%</span> </span>
-<span id="cb530-6"><a href="importing-survey-data-into-r.html#cb530-6" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb530-7"><a href="importing-survey-data-into-r.html#cb530-7" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb524"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb524-1"><a href="importing-survey-data-into-r.html#cb524-1" tabindex="-1"></a>anes_dta_factor <span class="ot">&lt;-</span></span>
+<span id="cb524-2"><a href="importing-survey-data-into-r.html#cb524-2" tabindex="-1"></a>  anes_dta <span class="sc">%&gt;%</span> </span>
+<span id="cb524-3"><a href="importing-survey-data-into-r.html#cb524-3" tabindex="-1"></a>  <span class="fu">as_factor</span>()</span>
+<span id="cb524-4"><a href="importing-survey-data-into-r.html#cb524-4" tabindex="-1"></a></span>
+<span id="cb524-5"><a href="importing-survey-data-into-r.html#cb524-5" tabindex="-1"></a>anes_dta_factor <span class="sc">%&gt;%</span> </span>
+<span id="cb524-6"><a href="importing-survey-data-into-r.html#cb524-6" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb524-7"><a href="importing-survey-data-into-r.html#cb524-7" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
 <pre><code>## Rows: 7,453
 ## Columns: 6
 ## $ V200001  &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008…
@@ -800,18 +802,18 @@ <h4>Option 1: Convert the vector into a factor<a href="importing-survey-data-int
 <div id="option-2-strip-the-labels" class="section level4 unnumbered hasAnchor">
 <h4>Option 2: Strip the labels<a href="importing-survey-data-into-r.html#option-2-strip-the-labels" class="anchor-section" aria-label="Anchor link to header"></a></h4>
 <p>The second option is to remove the labels altogether, converting the labeled data into a regular R data frame. To remove, or ‘zap’ the labels from our tibble, we can use the {haven} package’s <code>zap_label()</code> and <code>zap_labels()</code> functions. This approach removes the labels but retains the data values in their original form.</p>
-<p>The ANES Stata file columns contains variable labels. Using purrr’s <code>map()</code>, we can review the labels using <code>attr</code>. In the example below, we list the first two variables and their labels. For instance, the label for <code>V200002</code> is “Mode of interview: pre-election interview”.</p>
-<div class="sourceCode" id="cb532"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb532-1"><a href="importing-survey-data-into-r.html#cb532-1" tabindex="-1"></a>purrr<span class="sc">::</span><span class="fu">map</span>(anes_dta, <span class="sc">~</span><span class="fu">attr</span>(.x, <span class="st">&quot;label&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb532-2"><a href="importing-survey-data-into-r.html#cb532-2" tabindex="-1"></a>  <span class="fu">head</span>(<span class="dv">2</span>)</span></code></pre></div>
+<p>The ANES Stata file columns contain variable labels. Using the <code>map()</code> function from {purrr}, we can review the labels with <code>attr()</code>. In the example below, we list the first two variables and their labels. For instance, the label for <code>V200002</code> is “Mode of interview: pre-election interview”.</p>
+<div class="sourceCode" id="cb526"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb526-1"><a href="importing-survey-data-into-r.html#cb526-1" tabindex="-1"></a>purrr<span class="sc">::</span><span class="fu">map</span>(anes_dta, <span class="sc">~</span><span class="fu">attr</span>(.x, <span class="st">&quot;label&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb526-2"><a href="importing-survey-data-into-r.html#cb526-2" tabindex="-1"></a>  <span class="fu">head</span>(<span class="dv">2</span>)</span></code></pre></div>
 <pre><code>## $V200001
 ## [1] &quot;2020 Case ID&quot;
 ## 
 ## $V200002
 ## [1] &quot;Mode of interview: pre-election interview&quot;</code></pre>
 <p>Use <code>zap_label()</code> to remove the variable labels but retain the value labels. Notice that the labels return as <code>NULL</code>.</p>
-<div class="sourceCode" id="cb534"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb534-1"><a href="importing-survey-data-into-r.html#cb534-1" tabindex="-1"></a><span class="fu">zap_label</span>(anes_dta) <span class="sc">%&gt;%</span> </span>
-<span id="cb534-2"><a href="importing-survey-data-into-r.html#cb534-2" tabindex="-1"></a>  purrr<span class="sc">::</span><span class="fu">map</span>(<span class="sc">~</span><span class="fu">attr</span>(.x, <span class="st">&quot;label&quot;</span>)) <span class="sc">%&gt;%</span></span>
-<span id="cb534-3"><a href="importing-survey-data-into-r.html#cb534-3" tabindex="-1"></a>  <span class="fu">head</span>(<span class="dv">2</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb528"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb528-1"><a href="importing-survey-data-into-r.html#cb528-1" tabindex="-1"></a><span class="fu">zap_label</span>(anes_dta) <span class="sc">%&gt;%</span> </span>
+<span id="cb528-2"><a href="importing-survey-data-into-r.html#cb528-2" tabindex="-1"></a>  purrr<span class="sc">::</span><span class="fu">map</span>(<span class="sc">~</span><span class="fu">attr</span>(.x, <span class="st">&quot;label&quot;</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb528-3"><a href="importing-survey-data-into-r.html#cb528-3" tabindex="-1"></a>  <span class="fu">head</span>(<span class="dv">2</span>)</span></code></pre></div>
 <pre><code>## $V200001
 ## NULL
 ## 
@@ -819,9 +821,9 @@ <h4>Option 2: Strip the labels<a href="importing-survey-data-into-r.html#option-
 ##     1. Video 2. Telephone       3. Web 
 ##            1            2            3</code></pre>
 <p>To remove the value labels, use <code>zap_labels()</code>. Notice the previous <code>&lt;dbl+lbl&gt;</code> columns are now <code>&lt;dbl&gt;</code>.</p>
-<div class="sourceCode" id="cb536"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb536-1"><a href="importing-survey-data-into-r.html#cb536-1" tabindex="-1"></a><span class="fu">zap_labels</span>(anes_dta) <span class="sc">%&gt;%</span> </span>
-<span id="cb536-2"><a href="importing-survey-data-into-r.html#cb536-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
-<span id="cb536-3"><a href="importing-survey-data-into-r.html#cb536-3" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
+<div class="sourceCode" id="cb530"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb530-1"><a href="importing-survey-data-into-r.html#cb530-1" tabindex="-1"></a><span class="fu">zap_labels</span>(anes_dta) <span class="sc">%&gt;%</span> </span>
+<span id="cb530-2"><a href="importing-survey-data-into-r.html#cb530-2" tabindex="-1"></a>  <span class="fu">select</span>(<span class="dv">1</span><span class="sc">:</span><span class="dv">6</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb530-3"><a href="importing-survey-data-into-r.html#cb530-3" tabindex="-1"></a>  <span class="fu">glimpse</span>()</span></code></pre></div>
 <pre><code>## Rows: 7,453
 ## Columns: 6
 ## $ V200001  &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008…
@@ -830,15 +832,15 @@ <h4>Option 2: Strip the labels<a href="importing-survey-data-into-r.html#option-
 ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, …
 ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,…
 ## $ V201006  &lt;dbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1,…</code></pre>
-<p>While it is important to convert labeled datasets into regular R data frames for working in R, the labels themselves often contain valuable information that provide context and meaning to the survey variables. To aid with interpretability and documention, consider creating a data dictionary from the labeled dataset. A data dictionary is a reference document that provides detailed information about the variables and values of a survey.</p>
+<p>While it is important to convert labeled datasets into regular R data frames for working in R, the labels themselves often contain valuable information that provides context and meaning to the survey variables. To aid with interpretability and documentation, consider creating a data dictionary from the labeled dataset. A data dictionary is a reference document that provides detailed information about the variables and values of a survey.</p>
 <p>The {labelled} package offers a convenient function, <code>generate_dictionary()</code>, that creates data dictionaries directly from a labeled dataset <span class="citation">(<a href="#ref-R-labelled">Larmarange 2023</a>)</span>. This function extracts variable labels, value labels, and other metadata and organizes them into a structured document that we can browse and reference throughout our analysis.</p>
 <p>Let’s create a data dictionary from the ANES Stata dataset as an example:</p>
-<div class="sourceCode" id="cb538"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb538-1"><a href="importing-survey-data-into-r.html#cb538-1" tabindex="-1"></a><span class="fu">library</span>(labelled)</span>
-<span id="cb538-2"><a href="importing-survey-data-into-r.html#cb538-2" tabindex="-1"></a></span>
-<span id="cb538-3"><a href="importing-survey-data-into-r.html#cb538-3" tabindex="-1"></a>dictionary <span class="ot">&lt;-</span> <span class="fu">generate_dictionary</span>(anes_dta)</span></code></pre></div>
+<div class="sourceCode" id="cb532"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb532-1"><a href="importing-survey-data-into-r.html#cb532-1" tabindex="-1"></a><span class="fu">library</span>(labelled)</span>
+<span id="cb532-2"><a href="importing-survey-data-into-r.html#cb532-2" tabindex="-1"></a></span>
+<span id="cb532-3"><a href="importing-survey-data-into-r.html#cb532-3" tabindex="-1"></a>dictionary <span class="ot">&lt;-</span> <span class="fu">generate_dictionary</span>(anes_dta)</span></code></pre></div>
 <p>Once we’ve generated the data dictionary, we can take a look at the <code>V200002</code> variable and see the label, column type, number of missing entries, and associated values.</p>
-<div class="sourceCode" id="cb539"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb539-1"><a href="importing-survey-data-into-r.html#cb539-1" tabindex="-1"></a>dictionary <span class="sc">%&gt;%</span> </span>
-<span id="cb539-2"><a href="importing-survey-data-into-r.html#cb539-2" tabindex="-1"></a>  <span class="fu">filter</span>(variable <span class="sc">==</span> <span class="st">&quot;V200002&quot;</span>)</span></code></pre></div>
+<div class="sourceCode" id="cb533"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb533-1"><a href="importing-survey-data-into-r.html#cb533-1" tabindex="-1"></a>dictionary <span class="sc">%&gt;%</span> </span>
+<span id="cb533-2"><a href="importing-survey-data-into-r.html#cb533-2" tabindex="-1"></a>  <span class="fu">filter</span>(variable <span class="sc">==</span> <span class="st">&quot;V200002&quot;</span>)</span></code></pre></div>
 <pre><code>##  pos variable label                          col_type missing
 ##  2   V200002  Mode of interview: pre-electi~ dbl+lbl  0      
 ##                                                              
@@ -859,20 +861,20 @@ <h3><span class="header-section-number">A.3.3</span> Labeled missing data values
 </ul>
 <p>SAS and Stata use a concept known as ‘tagged’ missing values, which extend R’s regular <code>NA</code>. A ‘tagged’ missing value is essentially an <code>NA</code> with an additional single-character label. These values behave identically to regular <code>NA</code> in standard R operations while preserving the informative tag associated with the missing value.</p>
 <p>Here is an example from the NORC at the University of Chicago’s 2018 General Society Survey.</p>
-<div class="sourceCode" id="cb541"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb541-1"><a href="importing-survey-data-into-r.html#cb541-1" tabindex="-1"></a><span class="fu">head</span>(gss_dta<span class="sc">$</span>HEALTH)</span>
-<span id="cb541-2"><a href="importing-survey-data-into-r.html#cb541-2" tabindex="-1"></a><span class="co">#&gt; &lt;labelled&lt;double&gt;[6]&gt;: condition of health</span></span>
-<span id="cb541-3"><a href="importing-survey-data-into-r.html#cb541-3" tabindex="-1"></a><span class="co">#&gt; [1]     2     1 NA(i) NA(i)     1     2</span></span>
-<span id="cb541-4"><a href="importing-survey-data-into-r.html#cb541-4" tabindex="-1"></a><span class="co">#&gt; </span></span>
-<span id="cb541-5"><a href="importing-survey-data-into-r.html#cb541-5" tabindex="-1"></a><span class="co">#&gt; Labels:</span></span>
-<span id="cb541-6"><a href="importing-survey-data-into-r.html#cb541-6" tabindex="-1"></a><span class="co">#&gt;  value     label</span></span>
-<span id="cb541-7"><a href="importing-survey-data-into-r.html#cb541-7" tabindex="-1"></a><span class="co">#&gt;      1 excellent</span></span>
-<span id="cb541-8"><a href="importing-survey-data-into-r.html#cb541-8" tabindex="-1"></a><span class="co">#&gt;      2      good</span></span>
-<span id="cb541-9"><a href="importing-survey-data-into-r.html#cb541-9" tabindex="-1"></a><span class="co">#&gt;      3      fair</span></span>
-<span id="cb541-10"><a href="importing-survey-data-into-r.html#cb541-10" tabindex="-1"></a><span class="co">#&gt;      4      poor</span></span>
-<span id="cb541-11"><a href="importing-survey-data-into-r.html#cb541-11" tabindex="-1"></a><span class="co">#&gt;  NA(d)        DK</span></span>
-<span id="cb541-12"><a href="importing-survey-data-into-r.html#cb541-12" tabindex="-1"></a><span class="co">#&gt;  NA(i)       IAP</span></span>
-<span id="cb541-13"><a href="importing-survey-data-into-r.html#cb541-13" tabindex="-1"></a><span class="co">#&gt;  NA(n)        NA</span></span></code></pre></div>
-<p>In contrast, SPSS uses a different approach called ‘user-defined values’ to denote missing values. Each column in an SPSS dataset can have up to three distinct values designated as missing or a specified range of missing values. To model these additional user-defined missing values, {haven} provides the <code>labeled_spss()</code> subclass of <code>labeled()</code>. When you import SPSS data using {haven}, it ensures that user-defined missing values are correctly handled. You can work with this data in R while preserving the unique missing value conventions from SPSS.</p>
+<div class="sourceCode" id="cb535"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb535-1"><a href="importing-survey-data-into-r.html#cb535-1" tabindex="-1"></a><span class="fu">head</span>(gss_dta<span class="sc">$</span>HEALTH)</span>
+<span id="cb535-2"><a href="importing-survey-data-into-r.html#cb535-2" tabindex="-1"></a><span class="co">#&gt; &lt;labelled&lt;double&gt;[6]&gt;: condition of health</span></span>
+<span id="cb535-3"><a href="importing-survey-data-into-r.html#cb535-3" tabindex="-1"></a><span class="co">#&gt; [1]     2     1 NA(i) NA(i)     1     2</span></span>
+<span id="cb535-4"><a href="importing-survey-data-into-r.html#cb535-4" tabindex="-1"></a><span class="co">#&gt; </span></span>
+<span id="cb535-5"><a href="importing-survey-data-into-r.html#cb535-5" tabindex="-1"></a><span class="co">#&gt; Labels:</span></span>
+<span id="cb535-6"><a href="importing-survey-data-into-r.html#cb535-6" tabindex="-1"></a><span class="co">#&gt;  value     label</span></span>
+<span id="cb535-7"><a href="importing-survey-data-into-r.html#cb535-7" tabindex="-1"></a><span class="co">#&gt;      1 excellent</span></span>
+<span id="cb535-8"><a href="importing-survey-data-into-r.html#cb535-8" tabindex="-1"></a><span class="co">#&gt;      2      good</span></span>
+<span id="cb535-9"><a href="importing-survey-data-into-r.html#cb535-9" tabindex="-1"></a><span class="co">#&gt;      3      fair</span></span>
+<span id="cb535-10"><a href="importing-survey-data-into-r.html#cb535-10" tabindex="-1"></a><span class="co">#&gt;      4      poor</span></span>
+<span id="cb535-11"><a href="importing-survey-data-into-r.html#cb535-11" tabindex="-1"></a><span class="co">#&gt;  NA(d)        DK</span></span>
+<span id="cb535-12"><a href="importing-survey-data-into-r.html#cb535-12" tabindex="-1"></a><span class="co">#&gt;  NA(i)       IAP</span></span>
+<span id="cb535-13"><a href="importing-survey-data-into-r.html#cb535-13" tabindex="-1"></a><span class="co">#&gt;  NA(n)        NA</span></span></code></pre></div>
+<p>In contrast, SPSS uses a different approach called ‘user-defined values’ to denote missing values. Each column in an SPSS dataset can have up to three distinct values designated as missing or a specified range of missing values. To model these additional user-defined missing values, {haven} provides the <code>labelled_spss()</code> subclass of <code>labelled()</code>. When we import SPSS data using {haven}, the package ensures that user-defined missing values are correctly handled. We can work with these data in R while preserving the unique missing value conventions from SPSS.</p>
 <p>Here is what the GSS SPSS data looks like when loaded with {haven}.</p>
 <pre><code>head(gss_sps$HEALTH)
 #&gt; &lt;labelled_spss&lt;double&gt;[6]&gt;: Condition of health
@@ -892,54 +894,85 @@ <h3><span class="header-section-number">A.3.3</span> Labeled missing data values
 </div>
 <div id="importing-data-from-apis-into-r" class="section level2 hasAnchor" number="15.4">
 <h2><span class="header-section-number">A.4</span> Importing data from APIs into R<a href="importing-survey-data-into-r.html#importing-data-from-apis-into-r" class="anchor-section" aria-label="Anchor link to header"></a></h2>
-<p>In addition to working with data saved as files, we may also need to retrieve data through Application Programming Interfaces (APIs). APIs provide a structured way to access data hosted on external servers and import it directly into R for analysis.</p>
-<p>To access this data, you need to understand how to construct API requests. Each API has unique endpoints, parameters, and authentication requirements. Pay attention to:</p>
+<p>In addition to working with data saved as files, we may also need to retrieve data through Application Programming Interfaces (APIs). APIs provide a structured way to access data hosted on external servers and import it directly into R for analysis.</p>
+<p>To access these data, we need to understand how to construct API requests. Each API has unique endpoints, parameters, and authentication requirements. Pay attention to:</p>
 <ul>
-<li>Endpoints: These are URLs that point to specific data or services.</li>
-<li>Parameters: Information you pass to the API to customize your request (e.g., date ranges, filters).</li>
-<li>Authentication: APIs may require API keys or tokens for access.</li>
-<li>Rate Limits: APIs may have usage limits, so be aware of any rate limits or quotas.</li>
+<li>Endpoints: These are URLs that point to specific data or services</li>
+<li>Parameters: Information passed to the API to customize the request (e.g., date ranges, filters)</li>
+<li>Authentication: APIs may require API keys or tokens for access</li>
+<li>Rate Limits: APIs may have usage limits, so be aware of any rate limits or quotas</li>
 </ul>
 <p>Typically, we begin by making a GET request to an API endpoint. The {httr2} package allows us to generate and process HTTP requests <span class="citation">(<a href="#ref-R-httr2">Wickham 2023b</a>)</span>. We can make the GET request by pointing to the URL that contains the data we would like.</p>
-<div class="sourceCode" id="cb543"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb543-1"><a href="importing-survey-data-into-r.html#cb543-1" tabindex="-1"></a><span class="fu">library</span>(httr2)</span>
-<span id="cb543-2"><a href="importing-survey-data-into-r.html#cb543-2" tabindex="-1"></a></span>
-<span id="cb543-3"><a href="importing-survey-data-into-r.html#cb543-3" tabindex="-1"></a>api_url <span class="ot">&lt;-</span> <span class="st">&quot;https://api.example.com/survey-data&quot;</span></span>
-<span id="cb543-4"><a href="importing-survey-data-into-r.html#cb543-4" tabindex="-1"></a>response <span class="ot">&lt;-</span> <span class="fu">GET</span>(api_url)</span></code></pre></div>
-<p>Once we make the request, we will obtain the data as the <code>response</code>. The data often comes in JSON format. We can extract and parse the data using the {jsonlite} package, allowing us to work with it in R <span class="citation">(<a href="#ref-jsonlite2014">Ooms 2014</a>)</span>. The <code>fromJSON()</code> function, shown below, coverts JSON data to an R object.</p>
-<div class="sourceCode" id="cb544"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb544-1"><a href="importing-survey-data-into-r.html#cb544-1" tabindex="-1"></a>survey_data <span class="ot">&lt;-</span> <span class="fu">fromJSON</span>(<span class="fu">content</span>(response, <span class="st">&quot;text&quot;</span>))</span></code></pre></div>
-<p>Note that these are dummy examples. Please review the documentation to understand how to make requests from your specific API.</p>
-<p>R offers several packages that simplify API access by providing ready-to-use functions for popular APIs. These packages are called “wrappers”, as they “wrap” the API to make it easier to use. For example, the {tidycensus} package used in this book simplifies access to U.S. Census data, allowing us to retrieve data with R commands instead of writing complex API requests <span class="citation">(<a href="#ref-R-tidycensus">Walker and Herman 2024</a>)</span>. For example, if we are interested in the population (<code>B01003_001</code>) of each census tract in North Carolina from the 2020 ACS, we would use the <code>get_acs()</code> function and the code below. Behind the scenes, <code>get_acs()</code> is making a GET request from the Census API and the {tidycensus} functions are converting the response into an R-friendly format.</p>
-<div class="sourceCode" id="cb545"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb545-1"><a href="importing-survey-data-into-r.html#cb545-1" tabindex="-1"></a><span class="fu">library</span>(tidycensus)</span>
-<span id="cb545-2"><a href="importing-survey-data-into-r.html#cb545-2" tabindex="-1"></a></span>
-<span id="cb545-3"><a href="importing-survey-data-into-r.html#cb545-3" tabindex="-1"></a>census_data <span class="ot">&lt;-</span></span>
-<span id="cb545-4"><a href="importing-survey-data-into-r.html#cb545-4" tabindex="-1"></a>  <span class="fu">get_acs</span>(</span>
-<span id="cb545-5"><a href="importing-survey-data-into-r.html#cb545-5" tabindex="-1"></a>    <span class="at">geography =</span> <span class="st">&quot;tract&quot;</span>,</span>
-<span id="cb545-6"><a href="importing-survey-data-into-r.html#cb545-6" tabindex="-1"></a>    <span class="at">variables =</span> <span class="st">&quot;B01003_001&quot;</span>,</span>
-<span id="cb545-7"><a href="importing-survey-data-into-r.html#cb545-7" tabindex="-1"></a>    <span class="at">year =</span> <span class="dv">2020</span>,</span>
-<span id="cb545-8"><a href="importing-survey-data-into-r.html#cb545-8" tabindex="-1"></a>    <span class="at">state =</span> <span class="st">&quot;NC&quot;</span></span>
-<span id="cb545-9"><a href="importing-survey-data-into-r.html#cb545-9" tabindex="-1"></a>  )</span></code></pre></div>
+<div class="sourceCode" id="cb537"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb537-1"><a href="importing-survey-data-into-r.html#cb537-1" tabindex="-1"></a><span class="fu">library</span>(httr2)</span>
+<span id="cb537-2"><a href="importing-survey-data-into-r.html#cb537-2" tabindex="-1"></a></span>
+<span id="cb537-3"><a href="importing-survey-data-into-r.html#cb537-3" tabindex="-1"></a>api_url <span class="ot">&lt;-</span> <span class="st">&quot;https://api.example.com/survey-data&quot;</span></span>
+<span id="cb537-4"><a href="importing-survey-data-into-r.html#cb537-4" tabindex="-1"></a>response <span class="ot">&lt;-</span> <span class="fu">req_perform</span>(<span class="fu">request</span>(api_url))</span></code></pre></div>
+<p>Once we make the request, we obtain the data as the <code>response</code>. The data often come in JSON format. We can extract and parse the data using the {jsonlite} package, allowing us to work with them in R <span class="citation">(<a href="#ref-jsonliteooms">Ooms 2014</a>)</span>. The <code>fromJSON()</code> function, shown below, converts JSON data to an R object.</p>
+<div class="sourceCode" id="cb538"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb538-1"><a href="importing-survey-data-into-r.html#cb538-1" tabindex="-1"></a>survey_data <span class="ot">&lt;-</span> jsonlite<span class="sc">::</span><span class="fu">fromJSON</span>(<span class="fu">resp_body_string</span>(response))</span></code></pre></div>
+<p>Note that these are dummy examples. Please review the documentation to understand how to make requests from a specific API.</p>
+<p>R offers several packages that simplify API access by providing ready-to-use functions for popular APIs. These packages are called “wrappers”, as they “wrap” the API to make it easier to use. For example, the {tidycensus} package used in this book simplifies access to U.S. Census data, allowing us to retrieve data with R commands instead of writing API requests from scratch <span class="citation">(<a href="#ref-R-tidycensus">Walker and Herman 2024</a>)</span>. If we are interested in the age, sex, race, and Hispanicity of those in the American Community Survey sample in Durham County, North Carolina<a href="#fn30" class="footnote-ref" id="fnref30"><sup>30</sup></a>, we can use the <code>get_pums()</code> function to extract these microdata as shown in the code below. Behind the scenes, <code>get_pums()</code> makes a GET request to the Census API, and the {tidycensus} functions convert the response into an R-friendly format. We can then use the replicate weights to create a survey object and calculate estimates for Durham County.</p>
+<div class="sourceCode" id="cb539"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb539-1"><a href="importing-survey-data-into-r.html#cb539-1" tabindex="-1"></a><span class="fu">library</span>(tidycensus)</span>
+<span id="cb539-2"><a href="importing-survey-data-into-r.html#cb539-2" tabindex="-1"></a></span>
+<span id="cb539-3"><a href="importing-survey-data-into-r.html#cb539-3" tabindex="-1"></a>durh_pums <span class="ot">&lt;-</span> <span class="fu">get_pums</span>(</span>
+<span id="cb539-4"><a href="importing-survey-data-into-r.html#cb539-4" tabindex="-1"></a>  <span class="at">variables =</span> <span class="fu">c</span>(<span class="st">&quot;PUMA&quot;</span>, <span class="st">&quot;SEX&quot;</span>, <span class="st">&quot;AGEP&quot;</span>, <span class="st">&quot;RAC1P&quot;</span>, <span class="st">&quot;HISP&quot;</span>),</span>
+<span id="cb539-5"><a href="importing-survey-data-into-r.html#cb539-5" tabindex="-1"></a>  <span class="at">state =</span> <span class="st">&quot;NC&quot;</span>,</span>
+<span id="cb539-6"><a href="importing-survey-data-into-r.html#cb539-6" tabindex="-1"></a>  <span class="at">puma =</span> <span class="fu">c</span>(<span class="st">&quot;01301&quot;</span>, <span class="st">&quot;01302&quot;</span>),</span>
+<span id="cb539-7"><a href="importing-survey-data-into-r.html#cb539-7" tabindex="-1"></a>  <span class="at">survey =</span> <span class="st">&quot;acs1&quot;</span>,</span>
+<span id="cb539-8"><a href="importing-survey-data-into-r.html#cb539-8" tabindex="-1"></a>  <span class="at">year =</span> <span class="dv">2022</span>,</span>
+<span id="cb539-9"><a href="importing-survey-data-into-r.html#cb539-9" tabindex="-1"></a>  <span class="at">rep_weights =</span> <span class="st">&quot;person&quot;</span></span>
+<span id="cb539-10"><a href="importing-survey-data-into-r.html#cb539-10" tabindex="-1"></a>)</span></code></pre></div>
+<pre><code>## Getting data from the 2022 1-year ACS Public Use Microdata Sample</code></pre>
+<pre><code>## Warning: • You have not set a Census API key. Users without a key are limited to 500
+## queries per day and may experience performance limitations.
+## ℹ For best results, get a Census API key at
+## http://api.census.gov/data/key_signup.html and then supply the key to the
+## `census_api_key()` function to use it throughout your tidycensus session.
+## This warning is displayed once per session.</code></pre>
+<div class="sourceCode" id="cb542"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb542-1"><a href="importing-survey-data-into-r.html#cb542-1" tabindex="-1"></a>durh_pums</span></code></pre></div>
+<pre><code>## # A tibble: 2,724 × 90
+##    SERIALNO      SPORDER  AGEP PUMA  ST    SEX   HISP  RAC1P  WGTP PWGTP
+##    &lt;chr&gt;           &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt;
+##  1 2022HU0937941       1    60 01302 37    2     01    1       132   132
+##  2 2022HU0937941       2    61 01302 37    1     01    1       132   107
+##  3 2022HU0938759       1    44 01301 37    1     01    1        60    61
+##  4 2022HU0938759       2    48 01301 37    2     01    1        60    63
+##  5 2022HU0938759       3    19 01301 37    1     01    1        60   107
+##  6 2022HU0938759       4    16 01301 37    2     01    1        60    50
+##  7 2022HU0938759       5    12 01301 37    2     01    1        60    84
+##  8 2022HU0939904       1    53 01302 37    1     01    1       104   104
+##  9 2022HU0939904       2    53 01302 37    1     01    1       104   101
+## 10 2022HU0941348       1    70 01301 37    1     01    1        77    77
+## # ℹ 2,714 more rows
+## # ℹ 80 more variables: PWGTP1 &lt;dbl&gt;, PWGTP2 &lt;dbl&gt;, PWGTP3 &lt;dbl&gt;,
+## #   PWGTP4 &lt;dbl&gt;, PWGTP5 &lt;dbl&gt;, PWGTP6 &lt;dbl&gt;, PWGTP7 &lt;dbl&gt;,
+## #   PWGTP8 &lt;dbl&gt;, PWGTP9 &lt;dbl&gt;, PWGTP10 &lt;dbl&gt;, PWGTP11 &lt;dbl&gt;,
+## #   PWGTP12 &lt;dbl&gt;, PWGTP13 &lt;dbl&gt;, PWGTP14 &lt;dbl&gt;, PWGTP15 &lt;dbl&gt;,
+## #   PWGTP16 &lt;dbl&gt;, PWGTP17 &lt;dbl&gt;, PWGTP18 &lt;dbl&gt;, PWGTP19 &lt;dbl&gt;,
+## #   PWGTP20 &lt;dbl&gt;, PWGTP21 &lt;dbl&gt;, PWGTP22 &lt;dbl&gt;, PWGTP23 &lt;dbl&gt;, …</code></pre>
 <p>In Chapter <a href="c04-getting-started.html#c04-getting-started">4</a>, we used the {censusapi} package to get data from the Census data API for the Current Population Survey. To discover if there’s an R package that directly interfaces with a specific survey or data source, search for “[survey] R wrapper” or “[data source] R package” online.</p>
 </div>
 <div id="accessing-databases-in-r" class="section level2 hasAnchor" number="15.5">
 <h2><span class="header-section-number">A.5</span> Accessing databases in R<a href="importing-survey-data-into-r.html#accessing-databases-in-r" class="anchor-section" aria-label="Anchor link to header"></a></h2>
 <p>Databases provide a secure and organized solution as the volume and complexity of data grow. We can access, manage, and update data stored in databases in a systematic way. Because of how the data are organized, teams can draw from the same source and obtain any metadata that would be helpful for analysis.</p>
 <p>There are various ways of working with databases in RStudio. We can connect to different databases through the Connections Pane in the top right of the IDE. We can also use packages like {DBI} and {odbc} to access database tables in R files. Here is an example script connecting to a database:</p>
-<div class="sourceCode" id="cb546"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb546-1"><a href="importing-survey-data-into-r.html#cb546-1" tabindex="-1"></a>con <span class="ot">&lt;-</span> DBI<span class="sc">::</span><span class="fu">dbConnect</span>(odbc<span class="sc">::</span><span class="fu">odbc</span>(),</span>
-<span id="cb546-2"><a href="importing-survey-data-into-r.html#cb546-2" tabindex="-1"></a>                      <span class="at">Driver       =</span> <span class="st">&quot;[your driver&#39;s name]&quot;</span>,</span>
-<span id="cb546-3"><a href="importing-survey-data-into-r.html#cb546-3" tabindex="-1"></a>                      <span class="at">Server       =</span> <span class="st">&quot;[your server&#39;s path]&quot;</span>,</span>
-<span id="cb546-4"><a href="importing-survey-data-into-r.html#cb546-4" tabindex="-1"></a>                      <span class="at">UID          =</span> rstudioapi<span class="sc">::</span><span class="fu">askForPassword</span>(<span class="st">&quot;Database user&quot;</span>),</span>
-<span id="cb546-5"><a href="importing-survey-data-into-r.html#cb546-5" tabindex="-1"></a>                      <span class="at">PWD          =</span> rstudioapi<span class="sc">::</span><span class="fu">askForPassword</span>(<span class="st">&quot;Database password&quot;</span>),</span>
-<span id="cb546-6"><a href="importing-survey-data-into-r.html#cb546-6" tabindex="-1"></a>                      <span class="at">Database     =</span> <span class="st">&quot;[your database&#39;s name]&quot;</span>,</span>
-<span id="cb546-7"><a href="importing-survey-data-into-r.html#cb546-7" tabindex="-1"></a>                      <span class="at">Warehouse    =</span> <span class="st">&quot;[your warehouse&#39;s name]&quot;</span>,</span>
-<span id="cb546-8"><a href="importing-survey-data-into-r.html#cb546-8" tabindex="-1"></a>                      <span class="at">Schema       =</span> <span class="st">&quot;[your schema&#39;s name]&quot;</span></span>
-<span id="cb546-9"><a href="importing-survey-data-into-r.html#cb546-9" tabindex="-1"></a>                      )</span></code></pre></div>
-<p>The {dbplyr} and {dplyr} packages allow us to make queries and run data analysis entirely using {dplyr} syntax. All of the code can be written in R so we do not have to switch between R and SQL to explore the data. Here is some sample code:</p>
-<div class="sourceCode" id="cb547"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb547-1"><a href="importing-survey-data-into-r.html#cb547-1" tabindex="-1"></a>q1 <span class="ot">&lt;-</span> <span class="fu">tbl</span>(con, <span class="st">&quot;bank&quot;</span>) <span class="sc">%&gt;%</span></span>
-<span id="cb547-2"><a href="importing-survey-data-into-r.html#cb547-2" tabindex="-1"></a>  <span class="fu">group_by</span>(month_idx, year, month) <span class="sc">%&gt;%</span></span>
-<span id="cb547-3"><a href="importing-survey-data-into-r.html#cb547-3" tabindex="-1"></a>  <span class="fu">summarise</span>(</span>
-<span id="cb547-4"><a href="importing-survey-data-into-r.html#cb547-4" tabindex="-1"></a>    <span class="at">subscribe =</span> <span class="fu">sum</span>(<span class="fu">ifelse</span>(term_deposit <span class="sc">==</span> <span class="st">&quot;yes&quot;</span>, <span class="dv">1</span>, <span class="dv">0</span>)),</span>
-<span id="cb547-5"><a href="importing-survey-data-into-r.html#cb547-5" tabindex="-1"></a>    <span class="at">total =</span> <span class="fu">n</span>())</span>
-<span id="cb547-6"><a href="importing-survey-data-into-r.html#cb547-6" tabindex="-1"></a><span class="fu">show_query</span>(q1)</span></code></pre></div>
+<div class="sourceCode" id="cb544"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb544-1"><a href="importing-survey-data-into-r.html#cb544-1" tabindex="-1"></a>con <span class="ot">&lt;-</span> </span>
+<span id="cb544-2"><a href="importing-survey-data-into-r.html#cb544-2" tabindex="-1"></a>  DBI<span class="sc">::</span><span class="fu">dbConnect</span>(</span>
+<span id="cb544-3"><a href="importing-survey-data-into-r.html#cb544-3" tabindex="-1"></a>    odbc<span class="sc">::</span><span class="fu">odbc</span>(),</span>
+<span id="cb544-4"><a href="importing-survey-data-into-r.html#cb544-4" tabindex="-1"></a>    <span class="at">Driver    =</span> <span class="st">&quot;[driver name]&quot;</span>,</span>
+<span id="cb544-5"><a href="importing-survey-data-into-r.html#cb544-5" tabindex="-1"></a>    <span class="at">Server    =</span> <span class="st">&quot;[server path]&quot;</span>,</span>
+<span id="cb544-6"><a href="importing-survey-data-into-r.html#cb544-6" tabindex="-1"></a>    <span class="at">UID       =</span> rstudioapi<span class="sc">::</span><span class="fu">askForPassword</span>(<span class="st">&quot;Database user&quot;</span>),</span>
+<span id="cb544-7"><a href="importing-survey-data-into-r.html#cb544-7" tabindex="-1"></a>    <span class="at">PWD       =</span> rstudioapi<span class="sc">::</span><span class="fu">askForPassword</span>(<span class="st">&quot;Database password&quot;</span>),</span>
+<span id="cb544-8"><a href="importing-survey-data-into-r.html#cb544-8" tabindex="-1"></a>    <span class="at">Database  =</span> <span class="st">&quot;[database name]&quot;</span>,</span>
+<span id="cb544-9"><a href="importing-survey-data-into-r.html#cb544-9" tabindex="-1"></a>    <span class="at">Warehouse =</span> <span class="st">&quot;[warehouse name]&quot;</span>,</span>
+<span id="cb544-10"><a href="importing-survey-data-into-r.html#cb544-10" tabindex="-1"></a>    <span class="at">Schema    =</span> <span class="st">&quot;[schema name]&quot;</span></span>
+<span id="cb544-11"><a href="importing-survey-data-into-r.html#cb544-11" tabindex="-1"></a>  )</span></code></pre></div>
+<p>The {dbplyr} and {dplyr} packages allow us to make queries and run data analysis entirely using {dplyr} syntax. All of the code can be written in R, so we do not have to switch between R and SQL to explore the data. Here is some sample code:</p>
+<div class="sourceCode" id="cb545"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb545-1"><a href="importing-survey-data-into-r.html#cb545-1" tabindex="-1"></a>q1 <span class="ot">&lt;-</span> <span class="fu">tbl</span>(con, <span class="st">&quot;bank&quot;</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb545-2"><a href="importing-survey-data-into-r.html#cb545-2" tabindex="-1"></a>  <span class="fu">group_by</span>(month_idx, year, month) <span class="sc">%&gt;%</span></span>
+<span id="cb545-3"><a href="importing-survey-data-into-r.html#cb545-3" tabindex="-1"></a>  <span class="fu">summarise</span>(</span>
+<span id="cb545-4"><a href="importing-survey-data-into-r.html#cb545-4" tabindex="-1"></a>    <span class="at">subscribe =</span> <span class="fu">sum</span>(<span class="fu">ifelse</span>(term_deposit <span class="sc">==</span> <span class="st">&quot;yes&quot;</span>, <span class="dv">1</span>, <span class="dv">0</span>)),</span>
+<span id="cb545-5"><a href="importing-survey-data-into-r.html#cb545-5" tabindex="-1"></a>    <span class="at">total =</span> <span class="fu">n</span>())</span>
+<span id="cb545-6"><a href="importing-survey-data-into-r.html#cb545-6" tabindex="-1"></a><span class="fu">show_query</span>(q1)</span></code></pre></div>
 <p>Be sure to check the documentation to configure a database connection.</p>
 </div>
 <div id="importing-data-from-other-formats" class="section level2 hasAnchor" number="15.6">
@@ -953,11 +986,11 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 <div id="ref-R-labelled" class="csl-entry">
 Larmarange, Joseph. 2023. <em><span class="nocase">labelled</span>: Manipulating Labelled Data</em>. <a href="https://larmarange.github.io/labelled/">https://larmarange.github.io/labelled/</a>.
 </div>
-<div id="ref-jsonlite2014" class="csl-entry">
-Ooms, Jeroen. 2014. <span>“The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and r Objects.”</span> <em>arXiv:1403.2805 [Stat.CO]</em>. <a href="https://arxiv.org/abs/1403.2805">https://arxiv.org/abs/1403.2805</a>.
+<div id="ref-jsonliteooms" class="csl-entry">
+Ooms, Jeroen. 2014. <span>“The <span class="nocase">jsonlite</span> Package: A Practical and Consistent Mapping Between JSON Data and <span>R</span> Objects.”</span> <em>arXiv:1403.2805 [Stat.CO]</em>. <a href="https://arxiv.org/abs/1403.2805">https://arxiv.org/abs/1403.2805</a>.
 </div>
 <div id="ref-R-tidycensus" class="csl-entry">
-Walker, Kyle, and Matt Herman. 2024. <em><span class="nocase">tidycensus</span>: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames</em>. <a href="https://walker-data.com/tidycensus/">https://walker-data.com/tidycensus/</a>.
+Walker, Kyle, and Matt Herman. 2024. <em><span class="nocase">tidycensus</span>: Load US Census Boundary and Attribute Data as ’<span class="nocase">tidyverse</span>’ and ’<span class="nocase">sf</span>’-Ready Data Frames</em>. <a href="https://walker-data.com/tidycensus/">https://walker-data.com/tidycensus/</a>.
 </div>
 <div id="ref-R-httr2" class="csl-entry">
 ———. 2023b. <em><span class="nocase">httr2</span>: Perform HTTP Requests and Process the Responses</em>.
@@ -966,8 +999,14 @@ <h3>References<a href="references.html#references" class="anchor-section" aria-l
 Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. <em><span class="nocase">readr</span>: Read Rectangular Text Data</em>.
 </div>
 <div id="ref-R-haven" class="csl-entry">
-Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export SPSS, Stata and SAS Files</em>.
+Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files</em>.
+</div>
 </div>
+<div class="footnotes">
+<hr />
+<ol start="30">
+<li id="fn30"><p>The public use microdata areas (PUMA) for Durham County were identified using the 2020 PUMA Names File: <a href="https://www2.census.gov/geo/pdfs/reference/puma2020/2020_PUMA_Names.pdf" class="uri">https://www2.census.gov/geo/pdfs/reference/puma2020/2020_PUMA_Names.pdf</a><a href="importing-survey-data-into-r.html#fnref30" class="footnote-back">↩︎</a></p></li>
+</ol>
 </div>
             </section>
 
diff --git a/index.html b/index.html
index bd2bef50..ae6b96c7 100644
--- a/index.html
+++ b/index.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -521,7 +521,7 @@ <h1>
 <h1 class="title">Exploring Complex Survey Data Analysis in R</h1>
 <h2 class="subtitle"><em>A Tidy Introduction with srvyr</em></h2>
 <p class="author"><em>Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez</em></p>
-<p class="date"><em>2024-04-16</em></p>
+<p class="date"><em>2024-04-24</em></p>
 </div>
 <div id="dedication" class="section level1 unnumbered hasAnchor">
 <h1>Dedication<a href="index.html#dedication" class="anchor-section" aria-label="Anchor link to header"></a></h1>
diff --git a/recs-cb.html b/recs-cb.html
index ff6f1981..cdc70832 100644
--- a/recs-cb.html
+++ b/recs-cb.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
diff --git a/reference-keys.txt b/reference-keys.txt
index 2afddf3d..fa06c317 100644
--- a/reference-keys.txt
+++ b/reference-keys.txt
@@ -48,12 +48,16 @@ fig:missing-anes-vismiss
 fig:missing-anes-ggmissfct
 fig:missing-recs-hist
 tab:missing-anes-shadow-tab
+fig:recommendations-anscombe-plot
 tab:cb-incident
 tab:cb-crimetype
 tab:cb-hh
 tab:cb-pers
+tab:ncvs-vign-vt1
+tab:ncvs-vign-vt2a
+tab:ncvs-vign-vt2b
 tab:ncvs-vign-rates-demo-tab
-tab:ncvs-vgn-prop-stat-test-gt-tab
+tab:ncvs-vign-prop-stat-test-gt-tab
 tab:ambarom-worry-tab
 tab:ambarom-covid-ed-der-tab
 fig:ambarom-americas-map
@@ -102,7 +106,7 @@ example-american-national-election-studies-anes-2020-survey-documentation
 c04-getting-started
 introduction-3
 setup
-packages
+setup-load-pkgs
 data
 setup-des-obj
 survey-analysis-process
@@ -182,7 +186,7 @@ parameterization
 other-tips-for-reproducibility
 random-number-seeds
 descriptive-names-and-labels
-summary
+additional-resources-1
 c10-sample-designs-replicate-weights
 introduction-8
 common-sampling-designs
@@ -194,7 +198,7 @@ samp-combo
 replicate-weights
 balanced-repeated-replication-brr-method
 fays-brr-method
-jackknife-method
+samp-jackknife
 bootstrap-method
 exercises-2
 c11-missing-data
@@ -205,7 +209,7 @@ summarize-data
 visualization-of-missing-data
 analysis-with-missing-data
 recoding-missing-data
-accounting-for-skip-patterns
+missing-skip-patt
 c12-recommendations
 introduction-10
 recs-survey-process
diff --git a/references.html b/references.html
index ace91a3f..6a314360 100644
--- a/references.html
+++ b/references.html
@@ -23,7 +23,7 @@
 <meta name="author" content="Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez" />
 
 
-<meta name="date" content="2024-04-16" />
+<meta name="date" content="2024-04-24" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1" />
   <meta name="apple-mobile-web-app-capable" content="yes" />
@@ -229,7 +229,7 @@
 <li class="chapter" data-level="4.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#introduction-3"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
 <li class="chapter" data-level="4.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup"><i class="fa fa-check"></i><b>4.2</b> Setup</a>
 <ul>
-<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#packages"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
+<li class="chapter" data-level="4.2.1" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-load-pkgs"><i class="fa fa-check"></i><b>4.2.1</b> Packages</a></li>
 <li class="chapter" data-level="4.2.2" data-path="c04-getting-started.html"><a href="c04-getting-started.html#data"><i class="fa fa-check"></i><b>4.2.2</b> Data</a></li>
 <li class="chapter" data-level="4.2.3" data-path="c04-getting-started.html"><a href="c04-getting-started.html#setup-des-obj"><i class="fa fa-check"></i><b>4.2.3</b> Design objects</a></li>
 </ul></li>
@@ -358,7 +358,7 @@
 <li class="chapter" data-level="9.9.1" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#random-number-seeds"><i class="fa fa-check"></i><b>9.9.1</b> Random number seeds</a></li>
 <li class="chapter" data-level="9.9.2" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#descriptive-names-and-labels"><i class="fa fa-check"></i><b>9.9.2</b> Descriptive names and labels</a></li>
 </ul></li>
-<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#summary"><i class="fa fa-check"></i><b>9.10</b> Summary</a></li>
+<li class="chapter" data-level="9.10" data-path="c09-reprex-data.html"><a href="c09-reprex-data.html#additional-resources-1"><i class="fa fa-check"></i><b>9.10</b> Additional resources</a></li>
 </ul></li>
 <li class="part"><span><b>IV Real life data</b></span></li>
 <li class="chapter" data-level="10" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html"><i class="fa fa-check"></i><b>10</b> Sample designs and replicate weights</a>
@@ -380,7 +380,7 @@
 <ul>
 <li class="chapter" data-level="10.4.1" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#balanced-repeated-replication-brr-method"><i class="fa fa-check"></i><b>10.4.1</b> Balanced Repeated Replication (BRR) method</a></li>
 <li class="chapter" data-level="10.4.2" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#fays-brr-method"><i class="fa fa-check"></i><b>10.4.2</b> Fay’s BRR method</a></li>
-<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#jackknife-method"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
+<li class="chapter" data-level="10.4.3" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#samp-jackknife"><i class="fa fa-check"></i><b>10.4.3</b> Jackknife method</a></li>
 <li class="chapter" data-level="10.4.4" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#bootstrap-method"><i class="fa fa-check"></i><b>10.4.4</b> Bootstrap method</a></li>
 </ul></li>
 <li class="chapter" data-level="10.5" data-path="c10-sample-designs-replicate-weights.html"><a href="c10-sample-designs-replicate-weights.html#exercises-2"><i class="fa fa-check"></i><b>10.5</b> Exercises</a></li>
@@ -398,14 +398,14 @@
 <li class="chapter" data-level="11.4" data-path="c11-missing-data.html"><a href="c11-missing-data.html#analysis-with-missing-data"><i class="fa fa-check"></i><b>11.4</b> Analysis with missing data</a>
 <ul>
 <li class="chapter" data-level="11.4.1" data-path="c11-missing-data.html"><a href="c11-missing-data.html#recoding-missing-data"><i class="fa fa-check"></i><b>11.4.1</b> Recoding missing data</a></li>
-<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#accounting-for-skip-patterns"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
+<li class="chapter" data-level="11.4.2" data-path="c11-missing-data.html"><a href="c11-missing-data.html#missing-skip-patt"><i class="fa fa-check"></i><b>11.4.2</b> Accounting for skip patterns</a></li>
 </ul></li>
 </ul></li>
 <li class="chapter" data-level="12" data-path="c12-recommendations.html"><a href="c12-recommendations.html"><i class="fa fa-check"></i><b>12</b> Successful survey analysis recommendations</a>
 <ul>
 <li class="chapter" data-level="" data-path="c12-recommendations.html"><a href="c12-recommendations.html#prereq12"><i class="fa fa-check"></i>Prerequisites</a></li>
 <li class="chapter" data-level="12.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#introduction-10"><i class="fa fa-check"></i><b>12.1</b> Introduction</a></li>
-<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow survey analysis process</a></li>
+<li class="chapter" data-level="12.2" data-path="c12-recommendations.html"><a href="c12-recommendations.html#recs-survey-process"><i class="fa fa-check"></i><b>12.2</b> Follow the survey analysis process</a></li>
 <li class="chapter" data-level="12.3" data-path="c12-recommendations.html"><a href="c12-recommendations.html#begin-with-descriptive-analysis"><i class="fa fa-check"></i><b>12.3</b> Begin with descriptive analysis</a>
 <ul>
 <li class="chapter" data-level="12.3.1" data-path="c12-recommendations.html"><a href="c12-recommendations.html#table-review"><i class="fa fa-check"></i><b>12.3.1</b> Table review</a></li>
@@ -570,7 +570,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 DeBell, Matthew. 2010. <span>“How to Analyze ANES Survey Data.”</span> ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; <a href="https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf</a>.
 </div>
 <div class="csl-entry">
-DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah. 2022. <span>“<span class="nocase">Methodology Report for the ANES 2020 Time Series Study</span>.”</span> <a href="https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf</a>.
+DeBell, Matthew, Michelle Amsbary, Ted Brader, Shelley Brock, Cindy Good, Justin Kamens, Natalya Maisel, and Sarah Pinto. 2022. <span>“<span class="nocase">Methodology Report for the ANES 2020 Time Series Study</span>.”</span> <a href="https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf" class="uri">https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf</a>.
 </div>
 <div class="csl-entry">
 DeLeeuw, Edith D. 2005. <span>“To Mix or Not to Mix Data Collection Modes in Surveys.”</span> <em>Journal of Official Statistics</em> 21: 233–55.
@@ -585,13 +585,13 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Dillman, Don A, Jolene D Smyth, and Leah Melani Christian. 2014. <em>Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method</em>. John Wiley &amp; Sons.
 </div>
 <div class="csl-entry">
-FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. <em><span class="nocase">ggpattern</span>: Ggplot2 Pattern Geoms</em>.
+FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. <em><span class="nocase">ggpattern</span>: ’<span class="nocase">ggplot2</span>’ Pattern Geoms</em>.
 </div>
 <div class="csl-entry">
 Fowler, Floyd J, and Thomas W. Mangione. 1989. <em>Standardized Survey Interviewing</em>. SAGE.
 </div>
 <div class="csl-entry">
-Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: Dplyr-Like Syntax for Summary Statistics of Survey Data</em>.
+Freedman Ellis, Greg, and Ben Schneider. 2023. <em><span class="nocase">srvyr</span>: ’<span class="nocase">dplyr</span>’-Like Syntax for Summary Statistics of Survey Data</em>.
 </div>
 <div class="csl-entry">
 Fuller, Wayne A. 2011. <em>Sampling Statistics</em>. John Wiley &amp; Sons.
@@ -618,7 +618,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Kim, Jae Kwang, and Jun Shao. 2021. <em>Statistical Methods for Handling Incomplete Data</em>. Chapman &amp; Hall/CRC Press.
 </div>
 <div class="csl-entry">
-Landau, William Michael. 2021. <span>“The Targets r Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.”</span> <em>Journal of Open Source Software</em> 6 (57): 2959. <a href="https://doi.org/10.21105/joss.02959">https://doi.org/10.21105/joss.02959</a>.
+Landau, William Michael. 2021. <span>“The <span class="nocase">targets</span> <span>R</span> Package: A Dynamic <span>Make</span>-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.”</span> <em>Journal of Open Source Software</em> 6 (57): 2959. <a href="https://doi.org/10.21105/joss.02959">https://doi.org/10.21105/joss.02959</a>.
 </div>
 <div class="csl-entry">
 LAPOP. 2021a. <span>“AmericasBarometer 2021 - Canada: Technical Information.”</span> Vanderbilt University; <a href="http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf" class="uri">http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf</a>.
@@ -665,7 +665,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 National Center for Health Statistics. 2023. <span>“<span class="nocase">National Health Interview Survey, 2022 survey description</span>.”</span> <a href="https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf" class="uri">https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf</a>.
 </div>
 <div class="csl-entry">
-Ooms, Jeroen. 2014. <span>“The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and r Objects.”</span> <em>arXiv:1403.2805 [Stat.CO]</em>. <a href="https://arxiv.org/abs/1403.2805">https://arxiv.org/abs/1403.2805</a>.
+Ooms, Jeroen. 2014. <span>“The <span class="nocase">jsonlite</span> Package: A Practical and Consistent Mapping Between JSON Data and <span>R</span> Objects.”</span> <em>arXiv:1403.2805 [Stat.CO]</em>. <a href="https://arxiv.org/abs/1403.2805">https://arxiv.org/abs/1403.2805</a>.
 </div>
 <div class="csl-entry">
 Pebesma, Edzer, and Roger Bivand. 2023. <em><span class="nocase">Spatial Data Science: With applications in R</span></em>. <span>Chapman and Hall/CRC</span>. <a href="https://doi.org/10.1201/9780429459016">https://doi.org/10.1201/9780429459016</a>.
@@ -701,7 +701,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus. 2015. <span>“Users’ Guide to the <span>National</span> <span>Crime</span> <span>Victimization</span> <span>Survey</span> (<span>NCVS</span>) Direct Variance Estimation.”</span> <a href="https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf" class="uri">https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf</a>; Bureau of Justice Statistics.
 </div>
 <div class="csl-entry">
-Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. <span>“Reproducible Summary Tables with the Gtsummary Package.”</span> <em><span>The R Journal</span></em> 13: 570–80. <a href="https://doi.org/10.32614/RJ-2021-053">https://doi.org/10.32614/RJ-2021-053</a>.
+Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. <span>“Reproducible Summary Tables with the <span class="nocase">gtsummary</span> Package.”</span> <em><span>The R Journal</span></em> 13: 570–80. <a href="https://doi.org/10.32614/RJ-2021-053">https://doi.org/10.32614/RJ-2021-053</a>.
 </div>
 <div class="csl-entry">
 Skinner, Chris. 2009. <span>“Chapter 15: Statistical Disclosure Control for Survey Data.”</span> In <em>Handbook of Statistics: Sample Surveys: Design, Methods and Applications</em>, edited by C. R. Rao, 381–96. Elsevier B.V.
@@ -710,10 +710,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Sprunt, Barbara. 2020. <span>“93 Million and Counting: Americans Are Shattering Early Voting Records.”</span> <em>National Public Radio</em>.
 </div>
 <div class="csl-entry">
-Stephanie, Zimmer, Powell Rebecca, and Velásquez Isabella. 2024. <em><span class="nocase">srvyrexploR</span>: Data Supplement for Exploring Complex Survey Data Analysis in <span>R</span></em>.
-</div>
-<div class="csl-entry">
-Tierney, Nicholas. 2017. <span>“Visdat: Visualising Whole Data Frames.”</span> <em>JOSS</em> 2 (16): 355. <a href="https://doi.org/10.21105/joss.00355">https://doi.org/10.21105/joss.00355</a>.
+Tierney, Nicholas. 2017. <span>“<span class="nocase">visdat</span>: Visualising Whole Data Frames.”</span> <em>Journal of Open Source Software</em> 2 (16): 355. <a href="https://doi.org/10.21105/joss.00355">https://doi.org/10.21105/joss.00355</a>.
 </div>
 <div class="csl-entry">
 Tierney, Nicholas, and Dianne Cook. 2023. <span>“Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.”</span> <em>Journal of Statistical Software</em> 105 (7): 1–31. <a href="https://doi.org/10.18637/jss.v105.i07">https://doi.org/10.18637/jss.v105.i07</a>.
@@ -755,10 +752,10 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. <em>Practical Tools for Designing and Weighting Survey Samples</em>. Vol. 1. Springer.
 </div>
 <div class="csl-entry">
-Walker, Kyle, and Matt Herman. 2024. <em><span class="nocase">tidycensus</span>: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames</em>. <a href="https://walker-data.com/tidycensus/">https://walker-data.com/tidycensus/</a>.
+Walker, Kyle, and Matt Herman. 2024. <em><span class="nocase">tidycensus</span>: Load US Census Boundary and Attribute Data as ’<span class="nocase">tidyverse</span>’ and ’<span class="nocase">sf</span>’-Ready Data Frames</em>. <a href="https://walker-data.com/tidycensus/">https://walker-data.com/tidycensus/</a>.
 </div>
 <div class="csl-entry">
-Wickham, Hadley. 2016. <em>Ggplot2: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
+Wickham, Hadley. 2016. <em><span class="nocase">ggplot2</span>: Elegant Graphics for Data Analysis</em>. Springer-Verlag New York. <a href="https://ggplot2.tidyverse.org">https://ggplot2.tidyverse.org</a>.
 </div>
 <div class="csl-entry">
 ———. 2019. <em>Advanced <span>R</span></em>. <a href="https://adv-r.hadley.nz/" class="uri">https://adv-r.hadley.nz/</a>; CRC press.
@@ -785,7 +782,7 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. <em><span class="nocase">readr</span>: Read Rectangular Text Data</em>.
 </div>
 <div class="csl-entry">
-Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export SPSS, Stata and SAS Files</em>.
+Wickham, Hadley, Evan Miller, and Danny Smith. 2023. <em><span class="nocase">haven</span>: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files</em>.
 </div>
 <div class="csl-entry">
 Wolter, Kirk M. 2007. <em>Introduction to Variance Estimation</em>. Vol. 53. Springer.
@@ -793,6 +790,9 @@ <h1>References<a href="references.html#references" class="anchor-section" aria-l
 <div class="csl-entry">
 Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. <em>R Markdown Cookbook</em>. Boca Raton, Florida: Chapman; Hall/CRC. <a href="https://bookdown.org/yihui/rmarkdown-cookbook">https://bookdown.org/yihui/rmarkdown-cookbook</a>.
 </div>
+<div class="csl-entry">
+Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. <em><span class="nocase">srvyrexploR</span>: Data Supplement for Exploring Complex Survey Data Analysis in <span>R</span></em>.
+</div>
 </div>
 </div>
 
diff --git a/search_index.json b/search_index.json
index 1e16fc9c..aedf5f2c 100644
--- a/search_index.json
+++ b/search_index.json
@@ -1 +1 @@
-[["index.html", "Exploring Complex Survey Data Analysis in R A Tidy Introduction with srvyr Dedication", " Exploring Complex Survey Data Analysis in R A Tidy Introduction with srvyr Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez 2024-04-16 Dedication To Will, Tom, and Drew, thanks for all the help with additional chores and plenty of Git consulting! "],["c01-intro.html", "Chapter 1 Introduction 1.1 Survey analysis in R 1.2 What to expect 1.3 Prerequisites 1.4 Datasets used in this book 1.5 Conventions 1.6 Getting help 1.7 Acknowledgements 1.8 Colophon", " Chapter 1 Introduction Surveys are valuable tools for gathering information about a population, and are used by researchers, governments, and businesses alike to better understand public opinion and behaviors. For example, a non-profit group may analyze societal trends to measure their impact, government agencies may study behaviors to inform policy, or companies may seek to learn customer product preferences to refine business strategy. With survey data, we can explore the world around us. Surveys are often conducted with a sample of the population. Therefore, in order to use the survey data to understand the population, we use weights to adjust the survey results for unequal probabilities of selection, non-response, and post-stratification. These adjustments ensure the sample accurately represents the population of interest (Gard et al. 2023). To account for the intricate nature of the survey design, analysts rely on statistical software such as SAS, Stata, SUDAAN, and R. In this book, we focus on R to introduce survey analysis. Our goal is to provide a comprehensive guide for individuals new to survey analysis but with some familiarity with statistics and R programming. 
We use a combination of the {survey} and {srvyr} packages and present the code following best practices from the tidyverse and assume weights have already been calculated and are available (Freedman Ellis and Schneider 2023; Lumley 2010; Wickham et al. 2019). 1.1 Survey analysis in R The {survey} package was released on the Comprehensive R Archive Network (CRAN) in 2003 and has been continuously developed over time. This package, primarily authored by Thomas Lumley, offers an extensive array of features, including: Calculation of point estimates and their associated variances, including means, totals, ratios, quantiles, and proportions Estimation of regression models, including generalized linear models, log-linear models, and survival curves Variances by Taylor linearization or by replicate weights, including balanced repeated replication, jackknife, bootstrap, multistage bootstrap, or user-supplied methods Hypothesis testing for means, proportions, and other parameters The {srvyr} package builds on the {survey} package by providing wrappers for functions that align with the tidyverse philosophy. This is our motivation for using and recommending this package. We find that the {srvyr} package is user-friendly for those familiar with the tidyverse packages in R. For example, while many functions in the {survey} package use variables as formulas, the {srvyr} package uses tidy selection to pass variable names, a common feature in the tidyverse (Henry and Wickham 2022). Users of the tidyverse are likely familiar with the magrittr pipe operator (%&gt;%), which seamlessly works with functions from the {srvyr} package. Moreover, several common functions from {dplyr}, such as filter(), mutate(), and summarize(), can be applied to survey objects (Wickham et al. 2023). This enables users to streamline their analysis workflow and leverage the benefits of both the {srvyr} and {tidyverse} packages. 
While the {srvyr} package offers many advantages, there is one notable limitation: it doesn’t fully incorporate the modeling capabilities of the {survey} package into tidy wrappers. When discussing modeling and hypothesis testing, we primarily rely on the {survey} package. However, we guide you on how to apply the pipe operator to these functions to maintain clarity and consistency in your analyses. 1.2 What to expect This book covers many aspects of survey design and analysis, from understanding how to create design objects to conducting descriptive analysis, statistical tests, and models. We emphasize coding best practices and effective presentation techniques while using real-world data and practical examples to help you gain proficiency in survey analysis. Below is a summary of each chapter: Chapter 2 - Overview of Surveys: Overview of survey design processes References for more in-depth knowledge Chapter 3 - Survey data documentation: Guide to survey documentation Chapter 4 - Getting started: Installation of packages Introduction to the {srvyrexploR} package and its analytic datasets Outline of the survey analysis process Comparison between the {dplyr} and {srvyr} packages Chapter 5 - Descriptive analyses: Calculation of point estimates, standard errors, confidence intervals, and design effects Chapter 6 - Statistical testing: Statistical testing methods Comparison of means and proportions Goodness of fit tests, tests of independence, and tests of homogeneity Chapter 7 - Modeling: Linear regression, ANOVA, and logistic regression modeling Chapter 8 - Communication of results: Strategies for communicating survey results Tools and guidance for creating publishable tables and graphs Chapter 9 - Reproducible research: Various tools and methods for achieving reproducibility Chapter 10 - Sample designs and replicate weights: Description of common sampling designs and how to specify in R Description of replicate weight methods and how to specify in R Chapter 11 - 
Missing data: Overview of missing data in surveys Approaches to dealing with missing data Chapter 12 - Successful survey analysis recommendations: Tips for successful analysis Debugging skills Chapter 13 - National Crime Victimization Survey Vignette: Vignette on analyzing National Crime Victimization Survey (NCVS) data Illustrates analysis requiring multiple files for victimization rates Chapter 14 - AmericasBarometer Vignette: Vignette on analyzing AmericasBarometer survey data Includes making choropleth maps with survey estimates The majority of chapters contain code that you can follow. Each of these chapters starts with a “set-up” section, which includes the code needed to load the packages and datasets. We then provide the main idea of the chapter and examples of how to use the functions. Most chapters conclude with exercises to work through. We provide the solutions to the exercises in the online version of the book, available at tidy-survey-r.github.io. While we provide a brief overview of survey methodology and statistical theory, this book is not intended to be the sole resource for these topics. We reference other materials throughout the book and encourage readers to seek those out for more information. 1.3 Prerequisites To get the most out of this book, we assume that you have already conducted a survey and have the data or obtained a microdata file. Microdata, also known as respondent-level or row-level data, differs from summarized data typically found in tables. It contains individual survey responses, along with analysis weights and design variables such as strata or clusters. Additionally, the survey data should already include weights and design variables. These are required to accurately calculate unbiased estimates. The concepts and techniques discussed in this book will help you to extract meaningful insights from your survey data, but will not cover how to create weights in the first place as this is a separate complex topic. 
If you do not already have weights created for the survey data you are using, we recommend reviewing other resources focused on weight creation such as Valliant and Dever (2018). This book is tailored for analysts already familiar with R and the tidyverse but who may be new to complex survey analysis in R. We anticipate that readers of this book can: Install R and their Integrated Development Environment (IDE) of choice, such as RStudio Install and load packages from CRAN and GitHub repositories Run R code Read data from a folder or their working directory Understand fundamental tidyverse concepts such as tidy/long/wide data, tibbles, the magrittr pipe (%&gt;%), and tidy selection Use the tidyverse packages to wrangle, tidy, and visualize data If these concepts or skills are new to you, we recommend starting with introductory resources to cover these topics before reading this book. R for Data Science (Wickham, Çetinkaya-Rundel, and Grolemund 2023) is a beginner-friendly guide for getting started in data science using R. It offers guidance on preliminary installation steps and basic R syntax, and it introduces tidyverse concepts and packages. 1.4 Datasets used in this book We work with two key datasets throughout the book: the Residential Energy Consumption Survey (RECS – U.S. Energy Information Administration 2023b) and the American National Election Studies (ANES – DeBell 2010). We introduce and demonstrate the loading and preparation of these datasets in Chapter 4. 1.5 Conventions Throughout the book, we use the following typographical conventions: Package names are surrounded by curly brackets: {srvyr} Function names are in constant width text format and include parentheses: survey_mean() Object and variable names are in constant width text format: anes_des 1.6 Getting help We recommend first trying to resolve errors and issues independently using the tips provided in Chapter 12. 
If you have questions or face issues while working through the book, please report them to its GitHub repository. There are several community forums for asking questions, including: Posit Community: https://community.rstudio.com/ R for Data Science Slack Community: https://rfordatasci.com/ Stack Overflow: https://stackoverflow.com/ 1.7 Acknowledgements We would like to thank Holly Cast, Greg Freedman Ellis, Joe Murphy, and Sheila Saia for their reviews of the initial draft. Their detailed and honest feedback helped to make this book considerably better, and we are grateful for their input. Additionally, this book started from two short courses: the first at the Annual Conference for the American Association for Public Opinion Research (AAPOR) and the second as a series of webinars for the Midwest Association of Public Opinion Research (MAPOR). We would also like to thank those who assisted us by moderating breakout rooms and answering questions from attendees: Greg Freedman Ellis, Raphael Nishimura, and Benjamin Schneider. 1.8 Colophon This book was written in bookdown using RStudio. The complete source is available on GitHub: https://github.com/tidy-survey-r/tidy-survey-book. This version of the book was built with R version 4.3.1 (2023-06-16) and with the packages listed in Table 1.1. 
TABLE 1.1: Package versions and source used in building this book (Package, Version, Source): DiagrammeR, 1.0.10, CRAN; Matrix, 1.6-1, CRAN; bookdown, 0.34, CRAN; broom, 1.0.5, CRAN; censusapi, 0.8.0, GitHub (hrecht/censusapi@15b2b02); dplyr, 1.1.4, CRAN; forcats, 1.0.0, CRAN; ggpattern, 1.0.1, CRAN; ggplot2, 3.4.2, CRAN; gt, 0.9.0, CRAN; gtsummary, 1.7.1, CRAN; haven, 2.5.2, CRAN; janitor, 2.2.0, CRAN; kableExtra, 1.3.4, CRAN; knitr, 1.43, CRAN; labelled, 2.12.0, CRAN; lubridate, 1.9.2, CRAN; naniar, 1.0.0, CRAN; osfr, 0.2.9, CRAN; prettyunits, 1.2.0, CRAN; purrr, 1.0.2, CRAN; readr, 2.1.4, CRAN; renv, 1.0.0, CRAN; rmarkdown, 2.23, CRAN; rnaturalearth, 0.3.3, CRAN; rnaturalearthdata, 0.1.0, CRAN; sf, 1.0-14, CRAN; srvyr, 1.2.0, GitHub (gergness/srvyr@1917f75); srvyrexploR, 0.0.0.9000, GitHub (tidy-survey-r/srvyrexploR@914fc0f); stringr, 1.5.1, CRAN; survey, 4.2-1, CRAN; survival, 3.5-7, CRAN; tibble, 3.2.1, CRAN; tidyr, 1.3.0, CRAN; tidyselect, 1.2.0, CRAN; tidyverse, 2.0.0, CRAN. References DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Gard, Arianna M., Luke W. Hyde, Steven G. Heeringa, Brady T. West, and Colter Mitchell. 2023. “Why Weight? Analytic Approaches for Large-Scale Population Neuroscience Data.” Dev Cogn Neurosci. https://doi.org/10.1016/j.dcn.2023.101196. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. 
Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley and Sons. U.S. Energy Information Administration. 2023b. “2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. https://r4ds.hadley.nz/; O’Reilly Media. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. "],["c02-overview-surveys.html", "Chapter 2 Overview of Surveys 2.1 Introduction 2.2 Searching for public-use survey data 2.3 Pre-survey planning 2.4 Study design 2.5 Data collection 2.6 Post-survey processing 2.7 Post-survey data analysis and reporting", " Chapter 2 Overview of Surveys 2.1 Introduction Developing surveys to gather accurate information about populations often involves an intricate and time-intensive process. Researchers can spend months, or even years, developing the study design, questions, and other methods for a single survey to ensure high-quality data is collected. Prior to analyzing survey data, we recommend understanding the entire survey life cycle. This understanding can provide better insight into what types of analyses should be conducted on the data. The survey life cycle consists of the necessary stages to execute a survey project successfully. 
Each stage influences the survey’s timing, costs, and feasibility, consequently impacting the data collected and how we should analyze it. Figure 2.1 shows a high-level view of the survey process, and this chapter gives an overview of each step. FIGURE 2.1: Overview of the survey process The survey life cycle starts with a research topic or question of interest (e.g., what impact does childhood trauma have on health outcomes later in life). Researchers typically review existing data sources to determine if data are already available that can address this question, as drawing from available resources can result in a reduced burden on respondents, cheaper research costs, and faster research outcomes. However, if existing data cannot answer the nuances of the research question, a survey can be used to capture the exact data that the researcher needs through a questionnaire, or a set of questions. To gain a deeper understanding of survey design and implementation, we recommend reviewing several pieces of existing literature in detail (e.g., Biemer and Lyberg 2003; Bradburn, Sudman, and Wansink 2004; Dillman, Smyth, and Christian 2014; Groves et al. 2009; Tourangeau, Rips, and Rasinski 2000; Valliant, Dever, and Kreuter 2013). 2.2 Searching for public-use survey data Throughout this book, we use public-use datasets from different surveys, including the American National Election Studies (ANES), the Residential Energy Consumption Survey (RECS), the National Crime Victimization Survey (NCVS), and the AmericasBarometer surveys. As mentioned above, researchers should look for existing data that can provide insights into their research questions before embarking on a new survey. One of the greatest sources of data is the government. For example, in the U.S., we can get data directly from the various statistical agencies, as with RECS and NCVS. 
Other countries often have data available through official statistics offices, such as the Office for National Statistics in the United Kingdom. In addition to government data, many researchers will make their data publicly available through repositories such as the Inter-university Consortium for Political and Social Research (ICPSR) variable search or the Odum Institute Data Archive. Searching these repositories or other compiled lists (e.g., Analyze Survey Data for Free) can be an efficient way to identify surveys with questions related to the researcher’s topic of interest. 2.3 Pre-survey planning There are multiple things to consider when starting a survey. Errors are the differences between the true values of the variables being studied and the values obtained through the survey. Each step and decision made before the launch of the survey impacts the types of errors that are introduced into the data, which in turn impacts how to interpret the results. Generally, survey researchers consider there to be seven main sources of error that fall under either Representation or Measurement (Groves et al. 2009): Representation Coverage Error: A mismatch between the population of interest (also known as the target population or study population) and the sampling frame, the list from which the sample is drawn. Sampling Error: Error produced when selecting a sample, the subset of the population, from the sampling frame. This error is due to randomization, and we discuss how to quantify this error in Chapter 10. There is no sampling error in a census as there is no randomization. The sampling error measures the difference between all potential samples under the same sampling method. Nonresponse Error: Differences between those who responded and did not respond to the survey (unit nonresponse) or a given question (item nonresponse). Adjustment Error: Error introduced during post-survey statistical adjustments. 
Measurement Validity: A mismatch between the topic of interest and the question(s) used to collect that information. Measurement Error: A mismatch between what the researcher asked and how the respondent answered. Processing Error: Edits by the researcher to responses provided by the respondent (e.g., adjustments to data based on illogical responses). Almost every survey has errors. Researchers attempt to conduct a survey that reduces the total survey error, or the accumulation of all errors that may arise throughout the survey life cycle. By assessing these different types of errors together, researchers can seek strategies to maximize the overall survey quality and improve the reliability and validity of results (Biemer 2010). However, attempts to reduce individual sources of error (and therefore total survey error) come at the price of time and money. For example: Coverage Error Tradeoff: Researchers can search for or create more accurate and updated sampling frames, but they can be difficult to construct or obtain. Sampling Error Tradeoff: Researchers can increase the sample size to reduce sampling error; however, larger samples can be expensive and time-consuming to field. Nonresponse Error Tradeoff: Researchers can increase or diversify efforts to improve survey participation, but this may be resource-intensive while not entirely removing nonresponse bias. Adjustment Error Tradeoff: Weighting is a statistical technique used to adjust the contribution of individual survey responses to the final survey estimates. It is typically done to make the sample more representative of the target population. However, if researchers do not carefully execute the adjustments or base them on inaccurate information, they can introduce new biases, leading to less accurate estimates. 
Validity Error Tradeoff: Researchers can increase validity in a variety of ways, such as using established scales or collaborating with a psychometrician during survey design to pilot and evaluate questions. However, doing so lengthens the amount of time and resources needed to complete survey design. Measurement Error Tradeoff: Researchers can use techniques such as questionnaire testing and cognitive interviewing to ensure respondents are answering questions as expected. However, these activities also require time and resources to complete. Processing Error Tradeoff: Researchers can impose rigorous data cleaning and validation processes. However, this requires supervision, training, and time. The challenge for survey researchers is to find the optimal tradeoffs among these errors. They must carefully consider ways to reduce each error source and total survey error while balancing their study’s objectives and resources. For survey analysts, understanding the decisions that researchers made to minimize these error sources can impact how results are interpreted. The remainder of this chapter dives into critical considerations for survey development. We explore how to consider each of these sources of error and how these error sources can inform the interpretations of the data. 2.4 Study design From formulating methodologies to choosing an appropriate sampling frame, the study design phase is where the blueprint for a successful survey takes shape. Study design encompasses multiple parts of the survey life cycle, including decisions on the population of interest, survey mode (the format through which a survey is administered to respondents), timeline, and questionnaire design. Knowing whom to survey and how to survey them depends on the study’s goals and the feasibility of implementation. This section explores the strategic planning that lays the foundation for a survey. 
2.4.1 Sampling design The set or group we want to survey is known as the population of interest or the target population. The population of interest could be broad, such as “all adults age 18+ living in the U.S.” or a specific population based on a particular characteristic or location. For example, we may want to know about “adults aged 18-24 who live in North Carolina” or “eligible voters living in Illinois.” However, a sampling frame with contact information is needed to survey individuals in these populations of interest. If researchers are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If researchers are looking at broader target populations like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, researchers may choose to use a sampling frame of mailing addresses and send the survey to households, or they may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working). These imperfect sampling frames can result in coverage error where there is a mismatch between the target population and the list of individuals researchers can select. For example, if a researcher is looking to obtain estimates for “all adults aged 18+ living in the U.S.”, a sampling frame of mailing addresses will miss specific types of individuals, such as the homeless, transient populations, and incarcerated individuals. Additionally, many households have more than one adult resident, so researchers would need to consider how to get a specific individual to fill out the survey (called within household selection) or adjust the target population to report on “U.S. households” instead of “individuals.” Once the researchers have selected the sampling frame, the next step is determining how to select individuals for the survey. 
In rare cases, researchers may conduct a census and survey everyone on the sampling frame. However, the ability to implement a questionnaire at that scale is something only some can do (e.g., government censuses). Instead, researchers typically choose to sample individuals and use weights to estimate numbers in the target population. They can use a variety of different sampling methods, and more information on these can be found in Chapter 10. This decision of which sampling method to use impacts sampling error and can be accounted for in weighting. Example: Number of pets in a household Let’s use a simple example where a researcher is interested in the average number of pets in a household. Our researcher needs to consider the target population for this study. Specifically, are they interested in all households in a given country or households in a more local area (e.g., city or state)? Let’s assume our researcher is interested in the number of pets in a U.S. household with at least one adult (18 years old or older). In this case, a sampling frame of mailing addresses would introduce only a small amount of coverage error as the frame would closely match our target population. Specifically, our researcher would likely want to use the Computerized Delivery Sequence File (CDSF), which is a file of mailing addresses that the United States Postal Service (USPS) creates and covers nearly 100% of U.S. households (Harter et al. 2016). To sample these households, for simplicity, we use a stratified simple random sample design (see Chapter 10 for more information on sample designs), where we randomly sample households within each state (i.e., we stratify by state). Throughout this chapter, we build on this example research question to plan a survey. 2.4.2 Data collection planning With the sampling design decided, researchers can then decide how to survey these individuals. 
Specifically, the modes used for contacting and surveying the sample, how frequently to send reminders and follow-ups, and the overall timeline of the study are four of the major data collection determinations. Traditionally, researchers have considered four main modes: Computer Assisted Personal Interview (CAPI; also known as face-to-face or in-person interviewing) Computer Assisted Telephone Interview (CATI; also known as phone or telephone interviewing) Computer Assisted Web Interview (CAWI; also known as web or online interviewing) Paper and Pencil Interview (PAPI) Researchers can use a single mode to collect data or multiple modes (also called mixed-modes). Using mixed-modes can allow for broader reach and increase response rates depending on the target population (Biemer et al. 2017; DeLeeuw 2005, 2018). For example, researchers could both call households to conduct a CATI survey and send mail with a PAPI survey to the household. Using both modes, researchers could gain participation through the mail from individuals who do not pick up the phone for unknown numbers or through the phone from individuals who do not open all of their mail. However, mode effects (where responses differ based on the mode of response) can be present in the data and may need to be considered during analysis. When selecting which mode, or modes, to use, understanding the unique aspects of the chosen target population and sampling frame provides insight into how they can best be reached and engaged. For example, if we plan to survey adults aged 18-24 who live in North Carolina, asking them to complete a survey using CATI (i.e., over the phone) would likely not be as successful as other modes like the web. This age group does not talk on the phone as much as other generations and often does not answer their phones for unknown numbers. Additionally, the mode for contacting respondents relies on what information is available in the sampling frame. 
For example, if our sampling frame includes an email address, we could email our selected sample members to convince them to complete a survey. Alternatively, if the sampling frame is a list of mailing addresses, we could contact sample members with a letter. It is important to note that there can be a difference between the contact and survey modes. For example, if we have a sampling frame with addresses, we can send a letter to our sample members and provide information on completing a web survey. Another option is using mixed-mode surveys by mailing sample members a paper and pencil survey but also including instructions to complete the survey online. Combining different contact modes and different survey modes can be helpful in reducing unit nonresponse error (where the entire unit, e.g., a household, does not respond to the survey at all), as different sample members may respond better to different contact and survey modes. However, when considering which modes to use, it is important to make access to the survey as easy as possible for sample members to reduce burden and unit nonresponse. Another way to reduce unit nonresponse error is by varying the language of the contact materials (Dillman, Smyth, and Christian 2014). People are motivated by different things, so constantly repeating the same message may not be helpful. Instead, mixing up the messaging and the type of contact material the sample member receives can increase response rates and reduce the unit nonresponse error. For example, instead of only sending standard letters, researchers could consider sending mailings that invoke “urgent” or “important” thoughts by sending priority letters or using other delivery services like FedEx, UPS, or DHL. A study timeline may also determine the number and types of contacts. If the timeline is long, there is ample time for follow-ups and diversified messages in contact materials. If the timeline is short, then fewer follow-ups can be implemented. 
Many studies start with the tailored design method put forth by Dillman, Smyth, and Christian (2014) and implement five contacts: (1) Prenotification (Prenotice) letting sample members know the survey is coming; (2) Invitation to complete the survey; (3) Reminder that also thanks the respondents who may have already completed the survey; (4) Reminder (with a replacement paper survey if needed); (5) Final reminder. This method is easily adaptable based on the study timeline and needs but provides a starting point for most studies. Example: Number of pets in a household Let’s return to our example of a researcher who wants to know the average number of pets in a household. We are using a sampling frame of mailing addresses, so we recommend starting our data collection with letters mailed to households, but later in data collection, we want to send interviewers to the house to conduct an in-person (or CAPI) interview to decrease unit nonresponse error. This means we have two contact modes (paper and in-person). As mentioned above, the survey mode does not have to be the same as the contact mode, so we recommend a mixed-mode study with both Web and CAPI modes. Let’s assume we have six months for data collection, so we may want to recommend the following protocol: Protocol Example for 6-month Web and CAPI Data Collection (Week, Contact Mode, Contact Message, Survey Mode Offered): 1, Mail: Letter, Prenotice, —; 2, Mail: Letter, Invitation, Web; 3, Mail: Postcard, Thank You/Reminder, Web; 6, Mail: Letter in large envelope, Animal Welfare Discussion, Web; 10, Mail: Postcard, Inform Upcoming In-Person Visit, Web; 14, In-Person Visit, —, CAPI; 16, Mail: Letter, Reminder of In-Person Visit, Web but includes a number to call to schedule CAPI; 20, In-Person Visit, —, CAPI; 25, Mail: Letter in large envelope, Survey Closing Notice, Web but includes a number to call to schedule CAPI. This is just one possible protocol that we can use that starts respondents with the web (typically done to reduce costs). 
However, researchers may want to begin in-person data collection earlier during the data collection period or ask their interviewers to attempt more than two visits with a household. 2.4.3 Questionnaire design When developing the questionnaire, it can be helpful to first outline the topics to be asked and note why each question or topic is important to the research question(s). This can help researchers better tailor the questionnaire and reduce the number of questions (and thus the burden on the respondent) if topics are deemed irrelevant to the research question. When making these decisions, researchers should also consider questions needed for weighting. While we would love to have everyone in our population of interest answer our survey, this rarely happens. Thus, including questions about demographics in the survey can assist with weighting for nonresponse errors (both unit and item nonresponse). Knowing the details of the sampling plan and what may impact coverage error and sampling error can help researchers determine what types of demographics to include. Thus, questionnaire design is done in conjunction with sampling design. Researchers can benefit from the work of others by using questions from other surveys. Demographic sections such as race, ethnicity, or education borrow questions from a government census or other official surveys. Question banks such as the Inter-university Consortium for Political and Social Research (ICPSR) variable search can provide additional potential questions. If a question does not exist in a question bank, researchers can craft their own. When developing survey questions, researchers should start with the research topic and attempt to write questions that match the concept. The closer the question asked is to the overall concept, the better validity there is. For example, if the researcher wants to know how people consume T.V. 
series and movies but only asks a question about how many T.V.s are in the house, then they would be missing other ways that people watch T.V. series and movies, such as on other devices or at places outside of the home. As mentioned above, researchers can employ techniques to increase the validity of their questionnaires. For example, questionnaire testing involves piloting the survey instrument to identify and fix potential issues before conducting the main survey. Additionally, researchers could conduct cognitive interviews – a technique where researchers walk through the survey with participants, encouraging them to speak their thoughts out loud to uncover how they interpret and understand survey questions. Additionally, when designing questions, researchers should consider the mode for the survey and adjust the language appropriately. In self-administered surveys (e.g., web or mail), respondents can see all the questions and response options, but that is not the case in interviewer-administered surveys (e.g., CATI or CAPI). With interviewer-administered surveys, the response options must be read aloud to the respondents, so the question may need to be adjusted to create a better flow to the interview. Additionally, with self-administered surveys, because the respondents are viewing the questionnaire, the formatting of the questions is even more critical to ensure accurate measurement. Incorrect formatting or wording can result in measurement error, so following best practices or using existing validated questions can reduce error. There are multiple resources to help researchers draft questions for different modes (e.g., Bradburn, Sudman, and Wansink 2004; Dillman, Smyth, and Christian 2014; Fowler and Mangione 1989; Tourangeau, Couper, and Conrad 2004). Example: Number of pets in a household As part of our survey on the average number of pets in a household, researchers may want to know what animal most people prefer to have as a pet. 
Let’s say we have the following question in our survey: FIGURE 2.2: Example Question Asking Pet Preference Type This question may have validity issues as it only provides the options of “dogs” and “cats” to respondents, and the interpretation of the data could be incorrect. For example, if we had 100 respondents who answered the question and 50 selected dogs, then the results of this question cannot be “50% of the population prefers to have a dog as a pet,” as only two response options were provided. If a respondent taking our survey prefers turtles, they could either be forced to choose a response between these two (i.e., interpret the question as “between dogs and cats, which do you prefer?” and result in measurement error), or they may not answer the question (which results in item nonresponse error). Based on this, the interpretation of this question should be, “When given a choice between dogs and cats, 50% of respondents preferred to have a dog as a pet.” To avoid this issue, researchers should consider these possibilities and adjust the question accordingly. One simple way could be to add an “other” response option to give respondents a chance to provide a different response. The “other” response option could then include a way for respondents to write their other preference. For example, we could rewrite this question as: FIGURE 2.3: Example Question Asking Pet Preference Type with Other Specify Option Researchers can then code the responses from the open-ended box and get a better understanding of the respondent’s choice of preferred pet. Interpreting this question becomes easier as researchers no longer need to qualify the results with the choices provided. This is a simple example of how the presentation of the question and options can impact the findings. For more complex topics and questions, researchers must thoroughly consider how to mitigate any impacts from the presentation, formatting, wording, and other aspects. 
As survey analysts, we should review not only the data but also the wording of the questions to ensure the results are presented in a manner consistent with the question asked. Chapter 3 provides further details on how to review existing survey documentation to inform our analyses. 2.5 Data collection Once the data collection starts, researchers try to stick to the data collection protocol designed during pre-survey planning. However, effective researchers also prepare to adjust their plans and adapt as needed to the current progress of data collection (Schouten, Peytchev, and Wagner 2018). Extreme examples include natural disasters that prevent mailings or interviewers from reaching the sample members. Such an event could force an in-person survey to pivot quickly to a self-administered survey, or it could delay the field period. Other adjustments are smaller: if something newsworthy occurs that is connected to the survey, researchers could choose to highlight it in communication materials. In addition to these external factors, there could be factors unique to the survey, such as lower response rates for a specific sub-group, so researchers may need to adjust the data collection protocol to improve response rates for that group. 2.6 Post-survey processing After data collection, various activities need to be completed before we can analyze the survey. Multiple decisions made during this post-survey phase can assist researchers in reducing different error sources, such as weighting to account for the sample selection. Knowing the decisions researchers made in creating the final analytic data can impact how analysts use the data and interpret the results. 2.6.1 Data cleaning and imputation Post-survey cleaning is one of the first steps researchers take to get the survey responses into a dataset for use by analysts. 
Data cleaning can consist of correcting inconsistent data (e.g., fixing skip pattern errors or making sure related questions throughout the survey are consistent with each other), editing numeric entries or open-ended responses for grammar and consistency, or recoding open-ended questions into categories for analysis. There is no universal set of fixed rules that every project must adhere to. Instead, each project or research study should establish its own guidelines and procedures for handling various cleaning scenarios based on its specific objectives. Researchers should use their best judgment to ensure data integrity, and all decisions should be documented and available to those using the data in the analysis. Each decision a researcher makes impacts processing error, so researchers often have multiple people review these rules or recode open-ended data and then adjudicate any differences to reduce this error. Another crucial step in post-survey processing is imputation. Often, there is item nonresponse where respondents do not answer specific questions. If the questions are crucial to analysis efforts or the research question, researchers may implement imputation to reduce item nonresponse error. Imputation is a technique for replacing missing or incomplete data values with estimated values. However, as imputation is a way of assigning a value to missing data based on an algorithm or model, it can also introduce processing error, so researchers should consider the overall implications of imputing data compared to having item nonresponse. There are multiple ways to impute data. We recommend reviewing other resources like Kim and Shao (2021) for more information. Example: Number of pets in a household Let’s return to the question we created to ask about animal preference. The “other specify” invites respondents to specify the type of animal they prefer to have as a pet. 
If respondents entered answers such as “puppy,” “turtle,” “rabit,” “rabbit,” “bunny,” “ant farm,” “snake,” “Mr. Purr,” then researchers may wish to categorize these write-in responses to help with analysis. In this example, “puppy” could be assumed to be a reference to a “Dog”, and could be recoded there. The misspelling of “rabit” could be coded along with “rabbit” and “bunny” into a single category of “Bunny or Rabbit”. These are relatively standard decisions that a researcher could make. The remaining write-in responses could be categorized in a few different ways. “Mr. Purr,” which may be someone’s reference to their own cat, could be recoded as “Cat”, or it could remain as “Other” or some category that is “Unknown”. Depending on the number of responses related to each of the others, they could all be combined into a single “Other” category, or maybe categories such as “Reptiles” or “Insects” could be created. Each of these decisions may impact the interpretation of the data, so our researchers should document the types of responses that fall into each of the new categories and any decisions made. 2.6.2 Weighting We can address some of the error sources identified in the previous sections using weighting. During the weighting process, weights are created for each respondent record. These weights allow the survey responses to generalize to the population. A weight, generally, reflects how many units in the population each respondent represents, and, often the weight is constructed such that the sum of the weights is the size of the population. Weights can address coverage, sampling, and nonresponse errors. Many published surveys include an “analysis weight” variable that combines these adjustments. However, weighting itself can also introduce adjustment error, so researchers need to balance which types of errors should be corrected with weighting. 
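To make the idea of a weight concrete, here is a minimal base R sketch. The population size, sample size, and pet counts are made-up numbers for illustration, not values from any survey in this book, and the sketch assumes a simple random sample with full response, so every respondent receives the same weight N / n.

```r
# Hypothetical example: a simple random sample of n = 5 households
# drawn from a population of N = 1000 households (made-up numbers)
N <- 1000
pets <- c(0, 2, 1, 3, 1)  # reported number of pets in each sampled household
n <- length(pets)

# Under simple random sampling with full response, each respondent
# represents N / n households in the population
wt <- rep(N / n, n)

# The weights sum to the population size
pop_size <- sum(wt)

# Weighted estimates of the population total and mean number of pets
est_total <- sum(wt * pets)
est_mean <- est_total / sum(wt)

print(c(pop_size = pop_size, est_total = est_total, est_mean = est_mean))
```

In real surveys the weights usually differ across respondents because of unequal selection probabilities and nonresponse adjustments, which is why the analysis weight supplied with the data should be used rather than a constant like this one.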
The construction of weights is outside the scope of this book, and researchers should reference other materials if interested in constructing their own (Valliant and Dever 2018). Instead, this book assumes the survey has been completed, weights are constructed, and data is available to users. Example: Number of pets in a household In the simple example of our survey, we decided to obtain a random sample from each state to select our sample members. Knowing this sampling design, our researcher can include selection weights for analysis that account for how the sample members were selected for the survey. Additionally, the sampling frame may have the type of building associated with each address, so we could include the building type as a potential nonresponse weighting variable, along with some interviewer observations that may be related to our research topic of the average number of pets in a household. Combining these weights, we can create an analytic weight that researchers need to use when analyzing the data. 2.6.3 Disclosure Before data is released publicly, researchers need to ensure that, when confidentiality is required, individual respondents cannot be identified from the data. There are a variety of different methods that can be used. Here we describe a few of the most commonly used: Data swapping: Researchers may swap specific data values across different respondents so that it does not impact insights from the data but ensures that specific individuals cannot be identified. Top/bottom coding: Researchers may choose top or bottom coding to mask extreme values. For example, researchers may top-code income values such that households with income greater than $500,000 are coded as “$500,000 or more” with other incomes being presented as integers between $0 and $499,999. This can impact analyses at the tails of the distribution. Coarsening: Researchers may use coarsening to mask unique values. 
For example, a survey question may ask for a precise income but the public data may include data as a categorical variable. Another example commonly used in survey practice is to coarsen geographic variables. Data collectors likely know the precise address of sample members but the public data may only include the state or even region of respondents. Perturbation: Researchers may add random noise to outcomes. As with swapping, this is done so that it does not impact insights from the data but ensures that specific individuals cannot be identified. There is as much art as there is science to the methods used for disclosure. In the survey documentation, researchers will only provide high-level comments about the disclosure and not specific details. This ensures nobody can reverse the disclosure and thus identify individuals. For more information on different disclosure methods, please see Skinner (2009) and the AAPOR Standards. 2.6.4 Documentation Documentation is a critical step of the survey life cycle. Researchers systematically record all the details, decisions, procedures, and methodologies to ensure transparency, reproducibility, and the overall quality of survey research. Proper documentation allows analysts to understand, reproduce, and evaluate the study’s methods and findings. Chapter 3 dives into how analysts should use survey data documentation. 2.7 Post-survey data analysis and reporting After completing the survey life cycle, the data is ready for analysts to use. The rest of this book continues from this point. For more information on the survey life cycle, please explore the references cited throughout this chapter. References Biemer, Paul P. 2010. “Total Survey Error: Design, Implementation, and Evaluation.” Public Opinion Quarterly 74 (5): 817–48. https://doi.org/10.1093/poq/nfq058. Biemer, Paul P., and Lars E. Lyberg. 2003. Introduction to Survey Quality. John Wiley &amp; Sons. 
Biemer, Paul P., Joe Murphy, Stephanie Zimmer, Chip Berry, Grace Deng, and Katie Lewis. 2017. “Using Bonus Monetary Incentives to Encourage Web Response in Mixed-Mode Household Surveys.” Journal of Survey Statistics and Methodology 6 (2): 240–61. https://doi.org/10.1093/jssam/smx015. Bradburn, Norman M., Seymour Sudman, and Brian Wansink. 2004. Asking Questions: The Definitive Guide to Questionnaire Design. 2nd Edition. Jossey-Bass. DeLeeuw, Edith D. 2005. “To Mix or Not to Mix Data Collection Modes in Surveys.” Journal of Official Statistics 21: 233–55. ———. 2018. “Mixed-Mode: Past, Present, and Future.” Survey Research Methods 12 (2): 75–89. https://doi.org/10.18148/srm/2018.v12i2.7402. Dillman, Don A, Jolene D Smyth, and Leah Melani Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. John Wiley &amp; Sons. Fowler, Floyd J, and Thomas W. Mangione. 1989. Standardized Survey Interviewing. SAGE. Groves, Robert M, Floyd J Fowler Jr, Mick P Couper, James M Lepkowski, Eleanor Singer, and Roger Tourangeau. 2009. Survey Methodology. John Wiley &amp; Sons. Harter, Rachel, Michael P Battaglia, Trent D Buskirk, Don A Dillman, Ned English, Mansour Fahimi, Martin R Frankel, et al. 2016. “Address-Based Sampling.” Task force report. American Association for Public Opinion Research; https://aapor.org/wp-content/uploads/2022/11/AAPOR_Report_1_7_16_CLEAN-COPY-FINAL-2.pdf. Kim, Jae Kwang, and Jun Shao. 2021. Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Schouten, Barry, Andy Peytchev, and James Wagner. 2018. Adaptive Survey Design. Chapman &amp; Hall/CRC Press. Skinner, Chris. 2009. “Chapter 15: Statistical Disclosure Control for Survey Data.” In Handbook of Statistics: Sample Surveys: Design, Methods and Applications, edited by C. R. Rao, 381–96. Elsevier B.V. Tourangeau, Roger, Mick P. Couper, and Frederick Conrad. 2004. 
“Spacing, Position, and Order: Interpretive Heuristics for Visual Features of Survey Questions.” Public Opinion Quarterly 68: 368–93. Tourangeau, Roger, Lance J. Rips, and Kenneth Rasinski. 2000. Psychology of Survey Response. Cambridge University Press. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Other modes such as using mobile apps or text messaging can also be considered, but at the time of publication, have smaller reach or are better for longitudinal studies (i.e., surveying the same individuals over many time periods of a single study).↩︎ "],["c03-survey-data-documentation.html", "Chapter 3 Survey data documentation 3.1 Introduction 3.2 Types of survey documentation 3.3 Missing data coding 3.4 Example: American National Election Studies (ANES) 2020 survey documentation", " Chapter 3 Survey data documentation 3.1 Introduction Survey documentation helps us prepare before we look at the actual survey data. The documentation includes technical guides, questionnaires, codebooks, errata, and other useful resources. By taking the time to review these materials, we can gain a comprehensive understanding of the survey data (including research and design decisions discussed in Chapters 2 and 10) and conduct our analysis more effectively. Survey documentation can vary in organization, type, and ease of use. The information may be stored in any format - PDFs, Excel spreadsheets, Word documents, and so on. Some surveys bundle documentation together, such as providing the codebook and questionnaire in a single document. Others keep them in separate files. Despite these variations, we can gain a general understanding of the documentation types and what aspects to focus on in each. 
3.2 Types of survey documentation 3.2.1 Technical documentation The technical documentation, also known as user guides or methodology/analysis guides, highlights the variables necessary to specify the survey design. We recommend concentrating on these key sections: Introduction: The introduction orients us to the survey. This section provides the project’s background, the study’s purpose, and the main research questions. Study design: The study design section describes how researchers prepared and administered the survey. Sample: The sample section describes the sample frame, any known sampling errors, and the limitations of the sample. This section can contain recommendations on how to use sampling weights. Look for weight information, whether the survey design contains strata, clusters/PSUs, or replicate weights. Also look for population sizes, finite population correction, or replicate weight scaling information. Additional detail on sample designs is available in Chapter 10. Notes on fielding: Any additional notes on fielding, such as response rates, may be found in the technical documentation. The technical documentation may include other helpful resources. Some technical documentation includes syntax for SAS, SUDAAN, Stata, and/or R, so we do not have to create this code from scratch. 3.2.2 Questionnaires A questionnaire is a series of questions used to collect information from people in a survey. It can ask about opinions, behaviors, demographics, or even just numbers like the count of lightbulbs, square footage, or farm size. Questionnaires can employ different types of questions, such as closed-ended (e.g., select one or check all that apply), open-ended (e.g., numeric or text), Likert scales (e.g., a 5- or 7-point scale specifying a respondent’s level of agreement to a statement), or ranking questions (e.g., a list of options that a respondent ranks by preference). 
It may randomize the display order of responses or include instructions that help respondents understand the questions. A survey may have one questionnaire or multiple, depending on its scale and scope. The questionnaire is another important resource for understanding and interpreting the survey data (see Section 2.4.3), and we should use it alongside any analysis. It provides details about each of the questions asked in the survey, such as question name, question wording, response options, skip logic, randomizations, display specification, mode differences, and the universe (the subset of respondents that were asked a question). Below, in Figure 3.1, we show an example from the ANES 2020 questionnaire (American National Election Studies 2021). The figure shows a question’s question name (POSTVOTE_RVOTE), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (this question was only asked if vote_pre = 0), and other specifications. The section also includes the variable name, which we can link to the codebook. FIGURE 3.1: ANES 2020 Questionnaire Example The content and structure of questionnaires vary depending on the specific survey. For instance, question names may be informative (like the ANES example above), sequential, or denoted by a code. In some cases, surveys may not use separate names for questions and variables. Figure 3.2 shows an example from the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire that shows a sequential question number and a coded variable name (as opposed to a question name) (Centers for Disease Control and Prevention (CDC) 2021). FIGURE 3.2: BRFSS 2021 Questionnaire Example We should factor in the details of a survey when conducting our analyses. For example, surveys that use various modes (e.g., web and mail) may have differences in question wording or skip logic, as web surveys can include fills or automate skip logic. 
These variations could warrant separate analyses for each mode. 3.2.3 Codebooks While a questionnaire provides information about the questions posed to respondents, the codebook explains how the survey data was coded and recorded. It lists details such as variable names, variable labels, variable meanings, codes for missing data, value labels, and value types (whether categorical or continuous, etc.). The codebook helps us understand and use the variables appropriately in our analysis. In particular, the codebook (as opposed to the questionnaire) often includes information on missing data. Note that the term data dictionary is sometimes used interchangeably with codebook, but a data dictionary may include more details on the structure and elements of the data. Figure 3.3 is a question from the ANES 2020 codebook (American National Election Studies 2022). This section indicates a particular variable’s name (V202066), question wording, value labels, universe, and associated survey question (POSTVOTE_RVOTE). FIGURE 3.3: ANES 2020 Codebook Example Reviewing the questionnaires and codebooks in parallel can clarify how to interpret the variables (Figures 3.1 and 3.3), as questions and variables do not always correspond directly to each other in a one-to-one mapping. A single question may have multiple associated variables, or a single variable may summarize multiple questions. 3.2.4 Errata An erratum (singular) or errata (plural) is a document that lists errors found in a publication or dataset. The purpose of an erratum is to correct or update inaccuracies in the original document. 
Examples of errata include: Issuing a corrected data table after realizing a typo or mistake in a table cell Reporting incorrectly programmed skips in an electronic survey where questions are skipped by the respondent when they should not have been The 2004 ANES dataset released an erratum, notifying analysts to remove a specific row from the data file due to the inclusion of a respondent who should not have been part of the sample. Adhering to an issued erratum helps us increase the accuracy and reliability of analysis. 3.2.5 Additional resources Survey documentation may include additional material, such as interviewer instructions or “show cards” provided to respondents during interviewer-administered surveys to help respondents answer questions. Explore the survey website to find out what resources were used and in what contexts. 3.3 Missing data coding For some observations in a dataset, there may be missing data. This can be by design or from nonresponse, and these concepts are detailed in Chapter 11. In that chapter, we also discuss how to analyze data with missing data. In this section, we discuss how to understand documentation related to missing data. The survey documentation, often the codebook, represents the missing data with a code. The codebook may list different codes depending on why certain data is missing. In the example of variable V202066 from the ANES (Figure 3.3), -9 represents “Refused,” -7 means that the response was deleted due to an incomplete interview, -6 means that there is no response because there was no follow-up interview, and -1 means “Inapplicable” (due to the designed skip pattern). As another example, there may be a summary variable that describes the missingness of a set of variables - particularly with “select all that apply” or “multiple response” questions. 
In the National Crime Victimization Survey (NCVS), respondents who were victims of a crime and saw the offender are asked whether the offender had a weapon and then what type of weapon it was. This part of the questionnaire from 2021 is shown in Figure 3.4. FIGURE 3.4: Excerpt from the NCVS 2020-2021 Crime Incident Report - Weapon Type The NCVS codebook includes, for every multiple-response question, a “lead-in” variable that summarizes the individual options. For question 23a on the weapon type, the lead-in variable is V4050, which is shown in Figure 3.5. This variable is then followed by a set of variables for each weapon type. An example of one of the individual variables from the codebook, the handgun, is shown in Figure 3.6. We return to this example in Chapter 11 to show how to analyze this variable. FIGURE 3.5: Excerpt from the NCVS 2021 Codebook for V4050 - LI WHAT WAS WEAPON FIGURE 3.6: Excerpt from the NCVS 2021 Codebook for V4051 - C WEAPON: HAND GUN When data is read into R, some values may be system missing; that is, they are coded as NA even if that is not evident in the codebook. We will discuss in Chapter 11 how to analyze data with NA values and review how R handles missing data in calculations. 3.4 Example: American National Election Studies (ANES) 2020 survey documentation Let’s look at the survey documentation for the American National Election Studies (ANES) 2020. The survey website is located at https://electionstudies.org/data-center/2020-time-series-study/. Navigating to “User Guide and Codebook” (American National Election Studies 2022), we can download the PDF that contains the survey documentation, titled “ANES 2020 Time Series Study Full Release: User Guide and Codebook”. Do not be daunted by the 796-page PDF. We will focus on the most critical information. Introduction The first section in the User Guide explains that the ANES 2020 Time Series Study continues a series of election surveys conducted since 1948. 
These surveys contain data on public opinion and voting behavior in U.S. presidential elections. The introduction also includes information about the modes used for data collection (web, live video interviewing, or CATI). Additionally, there is a summary of the number of pre-election interviews (8,280) and post-election re-interviews (7,449). Sample design and respondent recruitment The section “Sample Design and Respondent Recruitment” provides more detail about the survey’s sequential mixed-mode design. All three modes were conducted one after another and not at the same time. Additionally, it indicates that for the 2020 survey, the researchers resampled all respondents who participated in the 2016 ANES, along with a newly-drawn cross-section: The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia. The document continues with more details on the sample groups. Data analysis, weights, and variance estimation The section “Data Analysis, Weights, and Variance Estimation” includes information on weights and strata/cluster variables. Reading through, we can find the full sample weight variables: For analysis of the complete set of cases using pre-election data only, including all cases and representative of the 2020 electorate, use the full sample pre-election weight, V200010a. For analysis including post-election data for the complete set of participants (i.e., analysis of post-election data only or a combination of pre- and post-election data), use the full sample post-election weight, V200010b. Additional weights are provided for analysis of subsets of the data… The document provides more information about the variables, summarized in Table 3.1. 
TABLE 3.1: Weight and variance information for ANES For weight Use variance unit/PSU/cluster and use variance stratum V200010a V200010c V200010d V200010b V200010c V200010d Methodology The user guide mentions a supplemental document called “How to Analyze ANES Survey Data” (DeBell 2010) as a ‘how-to guide’ for analyzing the data. From this document, we learn more about the weights, including that they sum to the sample size and not the population. If our goal is to calculate estimates for the entire U.S. population instead of just the sample, we must adjust the weights to the U.S. population. To create accurate weights for the population, we need to determine the total population size at the time of the survey. Let’s review the “Sample Design and Respondent Recruitment” section for more details: The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia. The documentation suggests that the population should equal around 231 million, but this count is imprecise. Upon further investigation in the available resources, we can find the methodology file titled “Methodology Report for the ANES 2020 Time Series Study” (DeBell et al. 2022). This file states that we can use the population total from the Current Population Survey (CPS), a monthly survey sponsored by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics. The CPS provides a more accurate population estimate for a specific month. Therefore, we can use the CPS to get the total population number for March 2020, the time in which the ANES was conducted. Chapter 4 goes into detailed instructions on how to calculate and adjust this value in the data. References American National Election Studies. 2021. 
“ANES 2020 Time Series Study: Pre-Election and Post-Election Survey Questionnaires.” https://electionstudies.org/wp-content/uploads/2021/07/anes_timeseries_2020_questionnaire_20210719.pdf. ———. 2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. Centers for Disease Control and Prevention (CDC). 2021. “Behavioral Risk Factor Surveillance System Survey Questionnaire.” U.S. Department of Health; Human Services, Centers for Disease Control; Prevention; https://www.cdc.gov/brfss/questionnaires/pdf-ques/2021-BRFSS-Questionnaire-1-19-2022-508.pdf. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah. 2022. “Methodology Report for the ANES 2020 Time Series Study.” https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf. "],["c04-getting-started.html", "Chapter 4 Getting started 4.1 Introduction 4.2 Setup 4.3 Survey analysis process 4.4 Similarities between {dplyr} and {srvyr} functions", " Chapter 4 Getting started 4.1 Introduction This chapter provides an overview of the packages, data, and design objects we use frequently throughout this book. As mentioned in Chapter 2, understanding how a survey was conducted helps us make sense of the results and interpret findings. Therefore, we provide background on the datasets used in examples and exercises. Next, we walk through how to create the survey design objects necessary to begin analysis. Finally, we provide an overview of the {srvyr} package and the steps needed for analysis. 
If you have questions or face issues while going through the book, please report them in the book’s GitHub repository. 4.2 Setup The Setup section provides details on the required packages and data, as well as the steps for preparing survey design objects. For a streamlined learning experience, we recommend taking the time to walk through the code provided and making sure everything is properly set up. 4.2.1 Packages We use several packages throughout the book, but let’s install and load specific ones for this chapter. Many functions in the examples and exercises are from three packages: {tidyverse}, {survey}, and {srvyr}. If they are not already installed, use the code below. The {tidyverse} and {survey} packages can both be installed from the Comprehensive R Archive Network (CRAN) (Lumley 2010; Wickham et al. 2019). We use the GitHub development version of {srvyr} because of its additional functionality compared to the one on CRAN (Freedman Ellis and Schneider 2023). Install the package directly from GitHub using the {remotes} package: install.packages(c(&quot;tidyverse&quot;, &quot;survey&quot;, &quot;remotes&quot;)) remotes::install_github(&quot;gergness/srvyr&quot;) We bundled the datasets used in the book in an R package, {srvyrexploR}. Install it directly from GitHub using the {remotes} package: remotes::install_github(&quot;tidy-survey-r/srvyrexploR&quot;) After installing these packages, load them using the library() function: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) The packages {broom}, {gt}, and {gtsummary} play a role in displaying output and creating formatted tables (Robinson, Hayes, and Couch 2023). 
Install {gt} and {gtsummary} with the code below ({broom} is installed along with {tidyverse}): install.packages(c(&quot;gt&quot;, &quot;gtsummary&quot;)) After installing these packages, load them using the library() function: library(broom) library(gt) library(gtsummary) Install and load the {censusapi} package to access the Current Population Survey (CPS), which we use to ensure accurate weighting of a key dataset in the book (Recht 2024). Run the code below to install {censusapi}: install.packages(&quot;censusapi&quot;) After installing this package, load it using the library() function: library(censusapi) Note that the {censusapi} package requires a Census API key, available for free from the U.S. Census Bureau website (refer to the package documentation for more information). We recommend storing the Census API key in our R environment instead of directly in the code. After obtaining the API key, save it in the R environment by running Sys.setenv(): Sys.setenv(CENSUS_KEY=&quot;YOUR_API_KEY_HERE&quot;) Sys.setenv() sets the key for the current R session only; to keep it across sessions, add the line CENSUS_KEY=YOUR_API_KEY_HERE to the .Renviron file and restart the R session. Once the Census API key is stored, we can retrieve it in our R code with Sys.getenv(\"CENSUS_KEY\"). A few other packages are used less frequently in the book. We list them in the Prerequisite boxes at the beginning of each chapter. As we work through the book, make sure to check the Prerequisite box and install any missing packages before proceeding. 4.2.2 Data As mentioned above, the {srvyrexploR} package contains the datasets used in the book. Once installed and loaded, explore the documentation using the help() function. Read the descriptions of the datasets to understand what they contain: help(package = &quot;srvyrexploR&quot;) This book uses two main datasets: the American National Election Studies (ANES – DeBell 2010) and the Residential Energy Consumption Survey (RECS – U.S. Energy Information Administration 2023b), which are included as anes_2020 and recs_2020, respectively, in the {srvyrexploR} package. 
American National Election Studies (ANES) Data The ANES is a study that collects data from election surveys dating back to 1948. These surveys contain information on public opinion and voting behavior in U.S. presidential elections and some midterm elections. They cover topics such as party affiliation, voting choice, and level of trust in the government. The 2020 survey, the data we use in the book, was fielded online, through live video interviews, or via computer-assisted telephone interviews (CATI). When working with new survey data, analysts should review the survey documentation (see Chapter 3) to understand the data collection methods. The original ANES data contains variables starting with V20 (DeBell 2010), so to assist with our analysis throughout the book, we created descriptive variable names. For example, the respondent’s age is now in a variable called Age, and gender is in a variable called Gender. These descriptive variables are included in the {srvyrexploR} package, and Table 4.1 displays the list of these renamed variables. A complete overview of all variables can be found in Appendix B. 
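Because the original ANES variables begin with a V followed by digits, a regular expression can separate them from the descriptive names. The sketch below uses base R and a small, illustrative set of column names (not the full list from anes_2020) to show that a pattern such as ^V[0-9] matches only the original-style variables and leaves descriptive names like VarUnit and VotedPres2020 alone, even though they also start with V.

```r
# Illustrative column names: a mix of original-style and descriptive names.
# This is a small made-up subset, not the full set of columns in anes_2020.
var_names <- c('V200010b', 'V200010c', 'CaseID', 'VarUnit', 'VotedPres2020', 'Age')

# TRUE where the name starts with 'V' immediately followed by a digit
is_original <- grepl('^V[0-9]', var_names)

# Keep only the descriptive names
descriptive_names <- var_names[!is_original]
print(descriptive_names)
```

The pattern anchors at the start of the name and requires a digit in the second position, which is why names that merely begin with the letter V are kept.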
TABLE 4.1: List of created variables in the ANES Data. Variable Name: CaseID, InterviewMode, Weight, VarUnit, Stratum, CampaignInterest, EarlyVote2020, VotedPres2016, VotedPres2016_selection, PartyID, TrustGovernment, TrustPeople, Age, AgeGroup, Education, RaceEth, Gender, Income, Income7, VotedPres2020, VotedPres2020_selection. Before beginning an analysis, it is useful to view the data to understand the available variables. The dplyr::glimpse() function produces a list of all variables, their types (e.g., factor, double), and a few example values. Below, we remove variables containing a “V” followed by numbers with select(-matches(\"^V\\\\d\")) before using glimpse() to get a quick overview of the data with descriptive variable names: anes_2020 %&gt;% select(-matches(&quot;^V\\d&quot;)) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 21 ## $ CaseID &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053… ## $ InterviewMode &lt;fct&gt; Web, Web, Web, Web, Web, Web, Web, Web… ## $ Weight &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658… ## $ VarUnit &lt;fct&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2,… ## $ Stratum &lt;fct&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, … ## $ CampaignInterest &lt;fct&gt; Somewhat interested, Not much interest… ## $ EarlyVote2020 &lt;fct&gt; NA, NA, NA, NA, NA, NA, NA, NA, Yes, N… ## $ VotedPres2016 &lt;fct&gt; Yes, Yes, Yes, Yes, Yes, No, Yes, No, … ## $ VotedPres2016_selection &lt;fct&gt; Trump, Other, Clinton, Clinton, Trump,… ## $ PartyID &lt;fct&gt; Strong republican, Independent, Indepe… ## $ TrustGovernment &lt;fct&gt; Never, Never, Some of the time, About … ## 
$ TrustPeople &lt;fct&gt; About half the time, Some of the time,… ## $ Age &lt;dbl&gt; 46, 37, 40, 41, 72, 71, 37, 45, 70, 43… ## $ AgeGroup &lt;fct&gt; 40-49, 30-39, 40-49, 40-49, 70 or olde… ## $ Education &lt;fct&gt; Bachelor&#39;s, Post HS, High school, Post… ## $ RaceEth &lt;fct&gt; &quot;Hispanic&quot;, &quot;Asian, NH/PI&quot;, &quot;White&quot;, &quot;… ## $ Gender &lt;fct&gt; Male, Female, Female, Male, Male, Fema… ## $ Income &lt;fct&gt; &quot;$175,000-249,999&quot;, &quot;$70,000-74,999&quot;, … ## $ Income7 &lt;fct&gt; $125k or more, $60k to &lt; 80k, $100k to… ## $ VotedPres2020 &lt;fct&gt; NA, Yes, Yes, Yes, Yes, Yes, Yes, NA, … ## $ VotedPres2020_selection &lt;fct&gt; NA, Other, Biden, Biden, Trump, Biden,… From the output, we can see there are 7,453 rows and 21 variables in the ANES data. This output also indicates that most of the variables are factors (e.g., InterviewMode), while a few variables are in double (numeric) format (e.g., Age). Residential Energy Consumption Survey (RECS) Data RECS is a study that measures energy consumption and expenditure in American households. Funded by the Energy Information Administration, the RECS data are collected through interviews with household members and energy suppliers. These interviews take place in person, over the phone, via mail, and on the web with modes changing over time. The survey has been fielded 14 times between 1950 and 2020. It includes questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, energy bills, respondent demographics, and energy assistance. As mentioned above, analysts should read the survey documentation (see Chapter 3) to understand how the data was collected and implemented. Table 4.2 displays the list of variables in the RECS data (not including the weights, which start with NWEIGHT and will be described in more detail in Chapter 10). An overview of all variables can be found in Appendix C. 
TABLE 4.2: List of Variables in the RECS Data. Variable Name: DOEID, ClimateRegion_BA, Urbanicity, Region, REGIONC, Division, STATE_FIPS, state_postal, state_name, HDD65, CDD65, HDD30YR, CDD30YR, HousingUnitType, YearMade, TOTSQFT_EN, TOTHSQFT, TOTCSQFT, ZTOTSQFT_EN, ZYearMade, ZHousingUnitType, SpaceHeatingUsed, ZSpaceHeatingUsed, ACUsed, ZACUsed, ZACBehavior, HeatingBehavior, WinterTempDay, WinterTempAway, WinterTempNight, ACBehavior, SummerTempDay, SummerTempAway, SummerTempNight, ZHeatingBehavior, ZWinterTempAway, ZSummerTempAway, ZWinterTempDay, ZSummerTempDay, ZWinterTempNight, ZSummerTempNight, BTUEL, DOLLAREL, ZBTUEL, BTUNG, DOLLARNG, ZBTUNG, BTULP, DOLLARLP, ZBTULP, BTUFO, DOLLARFO, ZBTUFO, BTUWOOD, ZBTUWOOD, TOTALBTU, TOTALDOL. Before starting an analysis, we recommend viewing the data to understand the types of data and variables that are included. The dplyr::glimpse() function produces a list of all variables, the type of the variable (e.g., factor, double), and a few example values. 
Below, we remove the weight variables with select(-matches(\"^NWEIGHT\")) before using glimpse() to get a quick overview of the data: recs_2020 %&gt;% select(-matches(&quot;^NWEIGHT&quot;)) %&gt;% glimpse() ## Rows: 18,496 ## Columns: 57 ## $ DOEID &lt;dbl&gt; 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e… ## $ ClimateRegion_BA &lt;fct&gt; Mixed-Dry, Mixed-Humid, Mixed-Dry, Mixed-Hum… ## $ Urbanicity &lt;fct&gt; Urban Area, Urban Area, Urban Area, Urban Ar… ## $ Region &lt;fct&gt; West, South, West, South, Northeast, South, … ## $ REGIONC &lt;chr&gt; &quot;WEST&quot;, &quot;SOUTH&quot;, &quot;WEST&quot;, &quot;SOUTH&quot;, &quot;NORTHEAST… ## $ Division &lt;fct&gt; Mountain South, West South Central, Mountain… ## $ STATE_FIPS &lt;chr&gt; &quot;35&quot;, &quot;05&quot;, &quot;35&quot;, &quot;45&quot;, &quot;34&quot;, &quot;48&quot;, &quot;40&quot;, &quot;2… ## $ state_postal &lt;fct&gt; NM, AR, NM, SC, NJ, TX, OK, MS, DC, AZ, CA, … ## $ state_name &lt;fct&gt; New Mexico, Arkansas, New Mexico, South Caro… ## $ HDD65 &lt;dbl&gt; 3844, 3766, 3819, 2614, 4219, 901, 3148, 182… ## $ CDD65 &lt;dbl&gt; 1679, 1458, 1696, 1718, 1363, 3558, 2128, 23… ## $ HDD30YR &lt;dbl&gt; 4451, 4429, 4500, 3229, 4896, 1150, 3564, 26… ## $ CDD30YR &lt;dbl&gt; 1027, 1305, 1010, 1653, 1059, 3588, 2043, 21… ## $ HousingUnitType &lt;fct&gt; Single-family detached, Apartment: 5 or more… ## $ YearMade &lt;ord&gt; 1970-1979, 1980-1989, 1960-1969, 1980-1989, … ## $ TOTSQFT_EN &lt;dbl&gt; 2100, 590, 900, 2100, 800, 4520, 2100, 900, … ## $ TOTHSQFT &lt;dbl&gt; 2100, 590, 900, 2100, 800, 3010, 1200, 900, … ## $ TOTCSQFT &lt;dbl&gt; 2100, 590, 900, 2100, 800, 3010, 1200, 0, 50… ## $ ZTOTSQFT_EN &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZYearMade &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZHousingUnitType &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ SpaceHeatingUsed &lt;lgl&gt; TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR… ## $ ZSpaceHeatingUsed 
&lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ACUsed &lt;lgl&gt; TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FA… ## $ ZACUsed &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZACBehavior &lt;fct&gt; Not imputed, Imputed, Not imputed, Not imput… ## $ HeatingBehavior &lt;fct&gt; Set one temp and leave it, Turn on or off as… ## $ WinterTempDay &lt;dbl&gt; 70, 70, 69, 68, 68, 76, 74, 70, 68, 70, 72, … ## $ WinterTempAway &lt;dbl&gt; 70, 65, 68, 68, 68, 76, 65, 70, 60, 70, 70, … ## $ WinterTempNight &lt;dbl&gt; 68, 65, 67, 68, 68, 68, 74, 68, 62, 68, 72, … ## $ ACBehavior &lt;fct&gt; Set one temp and leave it, Turn on or off as… ## $ SummerTempDay &lt;dbl&gt; 71, 68, 70, 72, 72, 69, 68, NA, 72, 74, 77, … ## $ SummerTempAway &lt;dbl&gt; 71, 68, 68, 72, 72, 74, 70, NA, 76, 74, 77, … ## $ SummerTempNight &lt;dbl&gt; 71, 68, 68, 72, 72, 68, 70, NA, 68, 72, 77, … ## $ ZHeatingBehavior &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempAway &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempAway &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempDay &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempDay &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempNight &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempNight &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ BTUEL &lt;dbl&gt; 42723, 17889, 8147, 31647, 20027, 48968, 494… ## $ DOLLAREL &lt;dbl&gt; 1955.06, 713.27, 334.51, 1424.86, 1087.00, 1… ## $ ZBTUEL &lt;fct&gt; Not imputed, Not imputed, Imputed amount and… ## $ BTUNG &lt;dbl&gt; 101924.4, 10145.3, 22603.1, 55118.7, 39099.5… ## $ DOLLARNG &lt;dbl&gt; 701.83, 261.73, 188.14, 636.91, 376.04, 439.… ## $ ZBTUNG &lt;fct&gt; Not imputed, Not imputed, Imputed, Not imput… ## $ BTULP &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 17… ## $ DOLLARLP &lt;dbl&gt; 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 
0.0, 0.0, 0.0,… ## $ ZBTULP &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ BTUFO &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 68… ## $ DOLLARFO &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18… ## $ ZBTUFO &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ BTUWOOD &lt;dbl&gt; 0, 0, 0, 0, 0, 3000, 0, 0, 0, 0, 0, 0, 0, 0,… ## $ ZBTUWOOD &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ TOTALBTU &lt;dbl&gt; 144648, 28035, 30750, 86765, 59127, 85401, 1… ## $ TOTALDOL &lt;dbl&gt; 2656.9, 975.0, 522.6, 2061.8, 1463.0, 2335.1… From the output, we can see that there are 18,496 rows and 57 non-weight variables in the RECS data. This output also indicates that most of the variables are in double (numeric) format (e.g., TOTSQFT_EN), with some factor (e.g., Region), Boolean (e.g., ACUsed), character (e.g., REGIONC), and ordinal (e.g., YearMade) variables. 4.2.3 Design objects The design object is the backbone for survey analysis. It is where we specify the sampling design, weights, and other necessary information to ensure we account for errors in the data. Before creating the design object, analysts should carefully review the survey documentation to understand how to create the design object for accurate analysis. In this chapter, we provide details on how to code the design object for the ANES and RECS data used in the book. However, we only provide a high-level overview to get readers started. For a deeper understanding of creating these design objects for a variety of sampling designs, see Chapter 10. While we recommend conducting exploratory data analysis on the original data before diving into complex survey analysis (see Chapter 12), the actual analysis and inference should be performed with the survey design objects instead of the original survey data. For example, the ANES data is called anes_2020. If we create a survey design object called anes_des, our analyses should begin with anes_des and not anes_2020. 
Using the survey design object ensures that our calculations appropriately account for the details of the survey design. American National Election Studies (ANES) Design Object The ANES documentation (DeBell 2010) details the sampling and weighting implications for analyzing the survey data. From this documentation and as noted in Chapter 3, the 2020 ANES data is weighted to the sample, not the population. To make generalizations about the population, we need to weight the data against the full population count. The ANES methodology recommends using the Current Population Survey (CPS) to determine the number of non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or D.C. in March of 2020. We can use the {censusapi} package to obtain the information needed for the survey design object. The getCensus() function allows us to retrieve the CPS data for March (cps/basic/mar) in 2020 (vintage = 2020). Additionally, we extract several variables from the CPS: month (HRMONTH) and year (HRYEAR4) of the interview: to confirm the correct time period age (PRTAGE) of the respondent: to narrow the population to 18 and older (eligible age to vote) citizenship status (PRCITSHP) of the respondent: to narrow the population to only those eligible to vote final person-level weight (PWSSWGT) Detailed information for these variables can be found in the CPS data dictionary. cps_state_in &lt;- getCensus(name = &quot;cps/basic/mar&quot;, vintage = 2020, region = &quot;state&quot;, vars = c(&quot;HRMONTH&quot;, &quot;HRYEAR4&quot;, &quot;PRTAGE&quot;, &quot;PRCITSHP&quot;, &quot;PWSSWGT&quot;), key = Sys.getenv(&quot;CENSUS_KEY&quot;)) cps_state &lt;- cps_state_in %&gt;% as_tibble() %&gt;% mutate(across(.cols = everything(), .fns = as.numeric)) In the code above, we include region = \"state\". The default region type for the CPS data is at the state level. 
While not required, including the region can be helpful for understanding the geographical context of the data. In getCensus(), we requested the March 2020 data through the name (cps/basic/mar) and vintage (2020) arguments. Therefore, we expect that all interviews within our output were conducted during that particular month and year (HRMONTH == 3 and HRYEAR4 == 2020). We can confirm that the data is from March 2020 by running the code below: cps_state %&gt;% distinct(HRMONTH, HRYEAR4) ## # A tibble: 1 × 2 ## HRMONTH HRYEAR4 ## &lt;dbl&gt; &lt;dbl&gt; ## 1 3 2020 We can narrow down the dataset using the age and citizenship variables to include only individuals who are 18 years or older (PRTAGE &gt;= 18) and have U.S. citizenship (PRCITSHP %in% c(1:4)): cps_narrow_resp &lt;- cps_state %&gt;% filter(PRTAGE &gt;= 18, PRCITSHP %in% c(1:4)) To calculate the U.S. population from the filtered data, we sum the person weights (PWSSWGT): targetpop &lt;- cps_narrow_resp %&gt;% pull(PWSSWGT) %&gt;% sum() scales::comma(targetpop) ## [1] &quot;231,034,125&quot; The target population in 2020 is 231,034,125. This result gives us what we need to create the survey design object for estimating population statistics. Using the anes_2020 data, we adjust the weighting variable (V200010b) using the target population we just calculated (targetpop). We determine the proportion of the total weight for each individual weight (V200010b / sum(V200010b)) and then multiply that proportion by the calculated target population. anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = V200010b / sum(V200010b) * targetpop) Once we have the adjusted weights, we can refer to the rest of the documentation to create the survey design. The documentation indicates that the study uses a stratified cluster sampling design. Therefore, we need to specify variables for strata and ids (cluster) and fill in the nest argument. 
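The weight adjustment described above (each weight divided by the sum of the weights, then multiplied by the target population) can be illustrated with a toy example in base R. This is only a sketch: the vector w and the value targetpop_toy are hypothetical stand-ins and are not ANES or CPS values.

```r
# Hypothetical sample weights (not ANES data)
w <- c(1, 2, 3, 4)

# Hypothetical target population size (not the CPS estimate)
targetpop_toy <- 1000

# Each weight's share of the total weight, scaled to the target population
w_adj <- w / sum(w) * targetpop_toy

w_adj       # adjusted weights: 100 200 300 400
sum(w_adj)  # the adjusted weights sum to the target population (1000)

# The relative sizes of the weights are unchanged by the rescaling
all.equal(w_adj / w_adj[1], w / w[1])
```

Rescaling in this way changes only what the weights sum to; each case keeps the same weight relative to every other case.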
The documentation provides guidance on which strata and cluster variables to use depending on whether we are analyzing pre- or post-election data. In this book, we analyze post-election data, so we need to use the post-election weight V200010b, strata variable V200010d, and PSU/cluster variable V200010c. Additionally, we set nest=TRUE to ensure the clusters are nested within the strata. anes_des &lt;- anes_adjwgt %&gt;% as_survey_design(weights = Weight, strata = V200010d, ids = V200010c, nest = TRUE) anes_des ## Stratified 1 - level Cluster Sampling design (with replacement) ## With (101) clusters. ## Called via srvyr ## Sampling variables: ## - ids: V200010c ## - strata: V200010d ## - weights: Weight ## Data variables: ## - V200001 (dbl), CaseID (dbl), V200002 (dbl+lbl), InterviewMode ## (fct), V200010b (dbl), Weight (dbl), V200010c (dbl), VarUnit (fct), ## V200010d (dbl), Stratum (fct), V201006 (dbl+lbl), CampaignInterest ## (fct), V201023 (dbl+lbl), EarlyVote2020 (fct), V201024 (dbl+lbl), ## V201025x (dbl+lbl), V201028 (dbl+lbl), V201029 (dbl+lbl), V201101 ## (dbl+lbl), V201102 (dbl+lbl), VotedPres2016 (fct), V201103 ## (dbl+lbl), VotedPres2016_selection (fct), V201228 (dbl+lbl), ## V201229 (dbl+lbl), V201230 (dbl+lbl), V201231x (dbl+lbl), PartyID ## (fct), V201233 (dbl+lbl), TrustGovernment (fct), V201237 (dbl+lbl), ## TrustPeople (fct), V201507x (dbl+lbl), Age (dbl), AgeGroup (fct), ## V201510 (dbl+lbl), Education (fct), V201546 (dbl+lbl), V201547a ## (dbl+lbl), V201547b (dbl+lbl), V201547c (dbl+lbl), V201547d ## (dbl+lbl), V201547e (dbl+lbl), V201547z (dbl+lbl), V201549x ## (dbl+lbl), RaceEth (fct), V201600 (dbl+lbl), Gender (fct), V201607 ## (dbl+lbl), V201610 (dbl+lbl), V201611 (dbl+lbl), V201613 (dbl+lbl), ## V201615 (dbl+lbl), V201616 (dbl+lbl), V201617x (dbl+lbl), Income ## (fct), Income7 (fct), V202051 (dbl+lbl), V202066 (dbl+lbl), V202072 ## (dbl+lbl), VotedPres2020 (fct), V202073 (dbl+lbl), V202109x ## (dbl+lbl), V202110x (dbl+lbl), 
VotedPres2020_selection (fct) We can examine this new object to learn more about the survey design, such that the ANES is a “Stratified 1 - level Cluster Sampling design (with replacement) With (101) clusters”. Additionally, the output displays the sampling variables and then lists the remaining variables in the dataset. This design object will be used throughout this book to conduct survey analysis. Residential Energy Consumption Survey (RECS) Design Object The RECS documentation (U.S. Energy Information Administration 2023b) provides information on the survey’s sampling and weighting implications for analysis. The documentation shows the 2020 RECS uses Jackknife weights, where the main analytic weight is NWEIGHT, and the Jackknife weights are NWEIGHT1-NWEIGHT60. We can specify these in the weights and repweights arguments in the survey design object code, respectively. With Jackknife weights, additional information is required: type, scale, and mse. Chapter 10 goes into depth about each of these arguments, but to quickly get started, the documentation lets us know that type=JK1, scale=59/60, and mse = TRUE. We can use the following code to create the survey design object: recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59 / 60, mse = TRUE ) recs_des ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. 
## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), ClimateRegion_BA (fct), Urbanicity (fct), Region ## (fct), REGIONC (chr), Division (fct), STATE_FIPS (chr), ## state_postal (fct), state_name (fct), HDD65 (dbl), CDD65 (dbl), ## HDD30YR (dbl), CDD30YR (dbl), HousingUnitType (fct), YearMade ## (ord), TOTSQFT_EN (dbl), TOTHSQFT (dbl), TOTCSQFT (dbl), ## ZTOTSQFT_EN (fct), ZYearMade (fct), ZHousingUnitType (fct), ## SpaceHeatingUsed (lgl), ZSpaceHeatingUsed (fct), ACUsed (lgl), ## ZACUsed (fct), ZACBehavior (fct), HeatingBehavior (fct), ## WinterTempDay (dbl), WinterTempAway (dbl), WinterTempNight (dbl), ## ACBehavior (fct), SummerTempDay (dbl), SummerTempAway (dbl), ## SummerTempNight (dbl), ZHeatingBehavior (fct), ZWinterTempAway ## (fct), ZSummerTempAway (fct), ZWinterTempDay (fct), ZSummerTempDay ## (fct), ZWinterTempNight (fct), ZSummerTempNight (fct), NWEIGHT ## (dbl), NWEIGHT1 (dbl), NWEIGHT2 (dbl), NWEIGHT3 (dbl), NWEIGHT4 ## (dbl), NWEIGHT5 (dbl), NWEIGHT6 (dbl), NWEIGHT7 (dbl), NWEIGHT8 ## (dbl), NWEIGHT9 (dbl), NWEIGHT10 (dbl), NWEIGHT11 (dbl), NWEIGHT12 ## (dbl), NWEIGHT13 (dbl), NWEIGHT14 (dbl), NWEIGHT15 (dbl), NWEIGHT16 ## (dbl), NWEIGHT17 (dbl), NWEIGHT18 (dbl), NWEIGHT19 
(dbl), NWEIGHT20 ## (dbl), NWEIGHT21 (dbl), NWEIGHT22 (dbl), NWEIGHT23 (dbl), NWEIGHT24 ## (dbl), NWEIGHT25 (dbl), NWEIGHT26 (dbl), NWEIGHT27 (dbl), NWEIGHT28 ## (dbl), NWEIGHT29 (dbl), NWEIGHT30 (dbl), NWEIGHT31 (dbl), NWEIGHT32 ## (dbl), NWEIGHT33 (dbl), NWEIGHT34 (dbl), NWEIGHT35 (dbl), NWEIGHT36 ## (dbl), NWEIGHT37 (dbl), NWEIGHT38 (dbl), NWEIGHT39 (dbl), NWEIGHT40 ## (dbl), NWEIGHT41 (dbl), NWEIGHT42 (dbl), NWEIGHT43 (dbl), NWEIGHT44 ## (dbl), NWEIGHT45 (dbl), NWEIGHT46 (dbl), NWEIGHT47 (dbl), NWEIGHT48 ## (dbl), NWEIGHT49 (dbl), NWEIGHT50 (dbl), NWEIGHT51 (dbl), NWEIGHT52 ## (dbl), NWEIGHT53 (dbl), NWEIGHT54 (dbl), NWEIGHT55 (dbl), NWEIGHT56 ## (dbl), NWEIGHT57 (dbl), NWEIGHT58 (dbl), NWEIGHT59 (dbl), NWEIGHT60 ## (dbl), BTUEL (dbl), DOLLAREL (dbl), ZBTUEL (fct), BTUNG (dbl), ## DOLLARNG (dbl), ZBTUNG (fct), BTULP (dbl), DOLLARLP (dbl), ZBTULP ## (fct), BTUFO (dbl), DOLLARFO (dbl), ZBTUFO (fct), BTUWOOD (dbl), ## ZBTUWOOD (fct), TOTALBTU (dbl), TOTALDOL (dbl) Viewing this new object provides information about the survey design, such that the RECS is an “unstratified cluster jacknife (JK1) with 60 replicates and MSE variances”. Additionally, the output shows the sampling variables (NWEIGHT1-NWEIGHT60) and then lists the remaining variables in the dataset. This design object will be used throughout this book to conduct survey analysis. 4.3 Survey analysis process The section above walked through the installation and loading of several packages, introduced the survey data available in the {srvyrexploR} package, and provided context on preparing survey design objects for the ANES and RECS data. 
Once the survey design objects are created, there is a general process for analyzing data to create estimates with the {srvyr} package: Create a tbl_svy object (a survey object) using as_survey_design() or as_survey_rep() Subset data (if needed) using filter() (to create subpopulations) Specify domains of analysis using group_by() Within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more In Section 4.2.3, we followed Step 1 to create the survey design objects for the ANES and RECS data featured in this book. Additional details on how to create design objects can be found in Chapter 10. Once we have the design object, we can filter the data to any subpopulation of interest (if needed). It is important to filter the data after creating the design object. This ensures that we are accurately accounting for the survey design in our calculations. Finally, we can use group_by(), summarize(), and other functions from the {survey} and {srvyr} packages to analyze the survey data by estimating means, totals, and so on. 4.4 Similarities between {dplyr} and {srvyr} functions The {dplyr} package from the tidyverse offers flexible and intuitive functions for data wrangling (Wickham et al. 2023). One of the major advantages of using {srvyr} is that it applies {dplyr}-like syntax to the {survey} package (Freedman Ellis and Schneider 2023). We can use pipes, such as %&gt;% from the {magrittr} package, to specify a survey design object, apply a function, and then feed that output into the next function’s first argument (Bache and Wickham 2022). Functions follow the ‘tidy’ convention of snake_case function names. To help explain the similarities between {dplyr} functions and {srvyr} functions, we use the towny dataset from the {gt} package and the apistrat data that comes in the {survey} package. The towny dataset provides population data for municipalities in Ontario, Canada in Census years between 1996 and 2021. 
Taking a look at towny with dplyr::glimpse(), we can see the dataset has 25 columns with a mix of character and numeric data. towny %&gt;% glimpse() ## Rows: 414 ## Columns: 25 ## $ name &lt;chr&gt; &quot;Addington Highlands&quot;, &quot;Adelaide Metc… ## $ website &lt;chr&gt; &quot;https://addingtonhighlands.ca&quot;, &quot;htt… ## $ status &lt;chr&gt; &quot;lower-tier&quot;, &quot;lower-tier&quot;, &quot;lower-ti… ## $ csd_type &lt;chr&gt; &quot;township&quot;, &quot;township&quot;, &quot;township&quot;, &quot;… ## $ census_div &lt;chr&gt; &quot;Lennox and Addington&quot;, &quot;Middlesex&quot;, … ## $ latitude &lt;dbl&gt; 45.00, 42.95, 44.13, 45.53, 43.86, 48… ## $ longitude &lt;dbl&gt; -77.25, -81.70, -79.93, -76.90, -79.0… ## $ land_area_km2 &lt;dbl&gt; 1293.99, 331.11, 371.53, 519.59, 66.6… ## $ population_1996 &lt;int&gt; 2429, 3128, 9359, 2837, 64430, 1027, … ## $ population_2001 &lt;int&gt; 2402, 3149, 10082, 2824, 73753, 956, … ## $ population_2006 &lt;int&gt; 2512, 3135, 10695, 2716, 90167, 958, … ## $ population_2011 &lt;int&gt; 2517, 3028, 10603, 2844, 109600, 864,… ## $ population_2016 &lt;int&gt; 2318, 2990, 10975, 2935, 119677, 969,… ## $ population_2021 &lt;int&gt; 2534, 3011, 10989, 2995, 126666, 954,… ## $ density_1996 &lt;dbl&gt; 1.88, 9.45, 25.19, 5.46, 966.84, 8.81… ## $ density_2001 &lt;dbl&gt; 1.86, 9.51, 27.14, 5.44, 1106.74, 8.2… ## $ density_2006 &lt;dbl&gt; 1.94, 9.47, 28.79, 5.23, 1353.05, 8.2… ## $ density_2011 &lt;dbl&gt; 1.95, 9.14, 28.54, 5.47, 1644.66, 7.4… ## $ density_2016 &lt;dbl&gt; 1.79, 9.03, 29.54, 5.65, 1795.87, 8.3… ## $ density_2021 &lt;dbl&gt; 1.96, 9.09, 29.58, 5.76, 1900.75, 8.1… ## $ pop_change_1996_2001_pct &lt;dbl&gt; -0.0111, 0.0067, 0.0773, -0.0046, 0.1… ## $ pop_change_2001_2006_pct &lt;dbl&gt; 0.0458, -0.0044, 0.0608, -0.0382, 0.2… ## $ pop_change_2006_2011_pct &lt;dbl&gt; 0.0020, -0.0341, -0.0086, 0.0471, 0.2… ## $ pop_change_2011_2016_pct &lt;dbl&gt; -0.0791, -0.0125, 0.0351, 0.0320, 0.0… ## $ 
pop_change_2016_2021_pct &lt;dbl&gt; 0.0932, 0.0070, 0.0013, 0.0204, 0.058… Let’s examine the towny object’s class. We verify that it is a tibble, as indicated by \"tbl_df\", by running the code below: class(towny) ## [1] &quot;tbl_df&quot; &quot;tbl&quot; &quot;data.frame&quot; All tibbles are data.frames, but not all data.frames are tibbles. Compared to data.frames, tibbles have some advantages, the most noticeable being their printing behavior. The {survey} package contains datasets related to the California Academic Performance Index, which measures student performance in schools with at least 100 students in California. We can access these datasets by loading the {survey} package and running data(api). Let’s work with the apistrat dataset, a stratified simple random sample in which each of the three school types (elementary, middle, high) forms a stratum. We can follow the process outlined in Section 4.2.3 to create the survey design object. The sample is stratified by the stype variable and the sampling weights are found in the pw variable. We can use this information to construct the design object, dstrata. data(api) dstrata &lt;- apistrat %&gt;% as_survey_design(strata = stype, weights = pw) When we check the class of dstrata, it is not a typical data.frame. Applying the as_survey_design() function transforms the data into a tbl_svy, a special class specifically for survey design objects. The {srvyr} package is designed to work with the tbl_svy class of objects. class(dstrata) ## [1] &quot;tbl_svy&quot; &quot;survey.design2&quot; &quot;survey.design&quot; Let’s look at how {dplyr} works with regular data frames. The example below calculates the mean and median for the land_area_km2 variable in the towny dataset. towny %&gt;% summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2)) ## # A tibble: 1 × 2 ## area_mean area_median ## &lt;dbl&gt; &lt;dbl&gt; ## 1 373. 273. 
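The class distinctions described above can also be checked programmatically. The following is a minimal sketch, assuming the {tibble}, {survey}, and {srvyr} packages are installed; the objects df_plain and tb_plain are hypothetical and exist only for this illustration.

```r
library(tibble)
library(survey)
library(srvyr)

# Hypothetical toy data frame, used only to contrast the classes
df_plain <- data.frame(x = 1:3)
tb_plain <- as_tibble(df_plain)

is.data.frame(tb_plain)  # TRUE: a tibble is also a data.frame
is_tibble(df_plain)      # FALSE: a plain data.frame is not a tibble

# Recreate the stratified design object from the apistrat data
data(api)
dstrata <- as_survey_design(apistrat, strata = stype, weights = pw)

inherits(dstrata, 'tbl_svy')     # TRUE: the class {srvyr} functions expect
inherits(dstrata, 'data.frame')  # FALSE: a design object is not a data frame
```

Checks like these can be useful at the top of an analysis script to confirm that a pipe starts from the design object and not from the raw data.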
In the code below, we calculate the mean and median of the variable api00 using dstrata. Note the similarity in the syntax. When we dig into the {srvyr} functions later, we will show that the outputs share a similar structure. Each group (if present) generates one row of output, but with additional columns. By default, the standard error of the statistic is also calculated in addition to the statistic itself. dstrata %&gt;% summarize(api00_mean = survey_mean(api00), api00_med = survey_median(api00)) ## # A tibble: 1 × 4 ## api00_mean api00_mean_se api00_med api00_med_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 662. 9.54 668 13.7 The functions in {srvyr} also play nicely with other tidyverse functions. For example, if we wanted to select columns with shared characteristics, we can use {tidyselect} functions such as starts_with(), num_range(), etc (Henry and Wickham 2022). In the examples below, we use a combination of across() and starts_with() to calculate the mean of variables starting with “population” in the towny data frame and those beginning with api in the dstrata survey object. towny %&gt;% summarize(across(starts_with(&quot;population&quot;), ~mean(.x, na.rm=TRUE))) ## # A tibble: 1 × 6 ## population_1996 population_2001 population_2006 population_2011 ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 25866. 27538. 29173. 30838. ## # ℹ 2 more variables: population_2016 &lt;dbl&gt;, population_2021 &lt;dbl&gt; dstrata %&gt;% summarize(across(starts_with(&quot;api&quot;), survey_mean)) ## # A tibble: 1 × 6 ## api00 api00_se api99 api99_se api.stu api.stu_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 662. 9.54 629. 10.1 498. 16.4 We have the flexibility to use {dplyr} verbs such as mutate(), filter(), and select() on our survey design object. As mentioned in Section 4.3, these steps should be performed on the survey design object. 
This ensures our survey design is properly considered in all our calculations. dstrata_mod &lt;- dstrata %&gt;% mutate(api_diff = api00 - api99) %&gt;% filter(stype == &quot;E&quot;) %&gt;% select(stype, api99, api00, api_diff, api_students = api.stu) dstrata_mod ## Stratified Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - weights: pw ## Data variables: ## - stype (fct), api99 (int), api00 (int), api_diff (int), api_students ## (int) dstrata ## Stratified Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - weights: pw ## Data variables: ## - cds (chr), stype (fct), name (chr), sname (chr), snum (dbl), dname ## (chr), dnum (int), cname (chr), cnum (int), flag (int), pcttest ## (int), api00 (int), api99 (int), target (int), growth (int), ## sch.wide (fct), comp.imp (fct), both (fct), awards (fct), meals ## (int), ell (int), yr.rnd (fct), mobility (int), acs.k3 (int), ## acs.46 (int), acs.core (int), pct.resp (int), not.hsg (int), hsg ## (int), some.col (int), col.grad (int), grad.sch (int), avg.ed ## (dbl), full (int), emer (int), enroll (int), api.stu (int), pw ## (dbl), fpc (dbl) Several functions in {srvyr} must be called within srvyr::summarize(), with the exception of srvyr::survey_count() and srvyr::survey_tally(). This is similar to how dplyr::count() and dplyr::tally() are not called within dplyr::summarize(). The summarize() function can be used in conjunction with the group_by() function or by/.by arguments, which applies the functions on a group-by-group basis to create grouped summaries. towny %&gt;% group_by(csd_type) %&gt;% dplyr::summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2)) ## # A tibble: 5 × 3 ## csd_type area_mean area_median ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 city 498. 198. ## 2 municipality 607. 488. ## 3 town 183. 129. ## 4 township 363. 301. 
## 5 village 23.0 3.3 We use a similar setup to summarize data in {srvyr}: dstrata %&gt;% group_by(stype) %&gt;% summarize(api00_mean = survey_mean(api00), api00_median = survey_median(api00)) ## # A tibble: 3 × 5 ## stype api00_mean api00_mean_se api00_median api00_median_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 E 674. 12.5 671 20.7 ## 2 H 626. 15.5 635 21.6 ## 3 M 637. 16.6 648 24.1 At this time, the .by argument in srvyr::summarize() does not exist as it does in {dplyr}. An alternative way to do the grouped analysis on the towny data would be: towny %&gt;% dplyr::summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2), .by=csd_type) ## # A tibble: 5 × 3 ## csd_type area_mean area_median ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 township 363. 301. ## 2 town 183. 129. ## 3 municipality 607. 488. ## 4 city 498. 198. ## 5 village 23.0 3.3 However, the .by syntax is not yet available in {srvyr}: dstrata %&gt;% summarize(api00_mean = survey_mean(api00), api00_median = survey_median(api00), .by=stype) ## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3: ## ℹ In argument: `api00_mean = survey_mean(api00)`. ## ℹ In group 1: `stype = E`. ## Caused by error in `[[&lt;-` at gergness-srvyr-1917f75/R/survey_statistics_helpers.R:48:5: ## ! Assigned data `x` must be compatible with existing data. ## ✖ Existing data has 200 rows. ## ✖ Assigned data has 100 rows. ## ℹ Only vectors of size 1 are recycled. ## Caused by error in `vectbl_recycle_rhs_rows()`: ## ! Can&#39;t recycle input of size 100 to size 200. As mentioned above, {srvyr} functions are meant for tbl_svy objects. Attempting to use them on non-tbl_svy objects, like the towny example shown below, will result in an error. The error message identifies the issue: Survey context not set. 
towny %&gt;% summarize(area_mean = survey_mean(land_area_km2)) ## Error in `summarize()`: ## ℹ In argument: `area_mean = survey_mean(land_area_km2)`. ## Caused by error in `cur_svy()` at gergness-srvyr-1917f75/R/survey_statistics.r:114:3: ## ! Survey context not set A few functions in {srvyr} have counterparts in {dplyr}, such as srvyr::summarize() and srvyr::group_by(). Unlike {srvyr}-specific verbs, {srvyr} recognizes these parallel functions if applied to a non-survey object. Instead of causing an error, the package will provide the equivalent output from {dplyr}: towny %&gt;% srvyr::summarize(area_mean = mean(land_area_km2)) ## # A tibble: 1 × 1 ## area_mean ## &lt;dbl&gt; ## 1 373. Because this book focuses on survey analysis, most of our pipes will stem from a survey object. When we load the {dplyr} and {srvyr} packages, the functions will automatically figure out the class of data and use the appropriate one from {dplyr} or {srvyr}. Therefore, we do not need to include the namespace for each function (e.g., srvyr::summarize()). References Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Recht, Hannah. 2024. censusapi: Retrieve Data from the Census APIs. Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles. U.S. Energy Information Administration. 2023b. 
“2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. Note: {broom} is already included in the tidyverse, so no separate installation is required↩︎ In the United States, presidential elections are held in years divisible by four. In other even years, there are elections at the federal level for Congress, which are referred to as midterm elections because they occur in the middle of a president’s term.↩︎ "],["c05-descriptive-analysis.html", "Chapter 5 Descriptive analyses 5.1 Introduction 5.2 Counts and cross-tabulations 5.3 Totals and sums 5.4 Means and proportions 5.5 Quantiles and medians 5.6 Ratios 5.7 Correlations 5.8 Standard deviation and variance 5.9 Additional topics 5.10 Exercises", " Chapter 5 Descriptive analyses Prerequisites For this chapter, load the following packages: library(tidyverse) library(srvyr) library(srvyrexploR) library(broom) We will be using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information). 
targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapters 4 and 10. recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 5.1 Introduction Descriptive analyses, such as basic counts, cross-tabulations, or means, are one of the first steps in making sense of our survey results. By reviewing the findings, analysts can glean insight into the data, the underlying population, and any unique aspects of the data or population. For example, if only 10% of the survey respondents are male, it could indicate a unique population, a potential error or bias, an intentional survey sampling method, or other factors. Additionally, descriptive analyses allow analysts to provide summaries like means, proportions, or other measures to make estimates about the population. These analyses lay the groundwork for the next steps of running statistical tests or developing models. We will discuss many different types of descriptive analyses in this chapter. However, it is important to know what type of data we are working with and which statistics are appropriate. 
In survey data, we typically consider data as one of four main types: Categorical/nominal data: variables with levels or descriptions that cannot be ordered, such as the region of the country (North, South, East, and West) Ordinal data: variables that can be ordered, such as those from a Likert scale (strongly disagree, disagree, agree, and strongly agree) Discrete data: variables that are counted or measured, such as number of children Continuous data: variables that are measured and whose values can lie anywhere on an interval, such as income This chapter will discuss how to analyze measures of distribution (e.g., cross-tabulations), central tendency (e.g., means), relationship (e.g., ratios), and dispersion (e.g., standard deviation) using functions from the {srvyr} package (Freedman Ellis and Schneider 2023). Measures of distribution describe how often an event or response occurs. These measures include counts and totals. We will cover the following functions: Count of observations (survey_count() and survey_tally()) Summation of variables (survey_total()) Measures of central tendency find the central (or average) responses. These measures include means and medians. We will cover the following functions: Means and proportions (survey_mean() and survey_prop()) Quantiles and medians (survey_quantile() and survey_median()) Measures of relationship describe how variables relate to each other. These measures include correlations and ratios. We will cover the following functions: Correlations (survey_corr()) Ratios (survey_ratio()) Measures of dispersion describe how data spreads around the central tendency for continuous variables. These measures include standard deviations and variances. 
We will cover the following functions: Variances and standard deviations (survey_var() and survey_sd()) To incorporate each of these survey functions, recall the general process for survey estimation from Chapter 4: Create a tbl_svy object using srvyr::as_survey_design() or srvyr::as_survey_rep(). Subset the data for subpopulations using srvyr::filter(), if needed. Specify domains of analysis using srvyr::group_by(), if needed. Analyze the data with survey-specific functions. This chapter will walk through how to apply the survey functions in Step 4. Note that unless otherwise specified, our estimates will be weighted as a result of setting up the survey design object. To look at the data by different subgroups, we can choose to filter and/or group the data. It is very important that we filter and group the data only after creating the design object. This ensures that the results accurately reflect the survey design. If we filter or group data before creating the survey design object, the data for those cases is not included in the survey design information and estimations of the variance, leading to inaccurate results. For the sake of simplicity, we’ve removed cases with missing values in the examples below. If you want a more detailed explanation on how to handle missing data, please refer to Chapter 11. 5.2 Counts and cross-tabulations Using survey_count() and survey_tally(), we can calculate the estimated population counts for a given variable or combination of variables. These summaries, often referred to as cross-tabulations or crosstabs, are applied to categorical data. They help in estimating counts of the population size for different groups based on the survey data. 5.2.1 Syntax The syntax for survey_count() is similar to the dplyr::count() syntax, as mentioned in Chapter 4. However, as noted above, this function can only be called on tbl_svy objects. 
Let’s explore the syntax: survey_count( x, ..., wt = NULL, sort = FALSE, name = &quot;n&quot;, .drop = dplyr::group_by_drop_default(x), vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;) ) The arguments are: x: a tbl_svy object created by as_survey ...: variables to group by, passed to group_by wt: a variable to weight on in addition to the survey weights, defaults to NULL sort: how to sort the variables, defaults to FALSE name: the name of the count variable, defaults to n .drop: whether to drop empty groups vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) To generate a count or crosstabs by different variables, we include them in the (...) argument. This argument can take any number of variables and will break down the counts by all combinations of the provided variables. This is similar to dplyr::count(). To obtain an estimate of the overall population, we can exclude any variables from the (...) argument or use the survey_tally() function. While the survey_tally() function has a similar syntax to the survey_count() function, it does not include the (...) or the .drop arguments: survey_tally( x, wt, sort = FALSE, name = &quot;n&quot;, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;) ) Both functions include the vartype argument with four different values: se: standard error The estimated standard deviation of the estimate Output has a column with the variable name specified in the name argument with a suffix of “_se” ci: confidence interval The lower and upper limits of a confidence interval Output has a column with the variable name specified in the name argument with a suffix of “_low” and “_upp” By default, this is a 95% confidence interval but can be changed by using the argument level and specifying a number between 0 and 1. 
For example, level=0.8 would produce an 80% confidence interval. var: variance The estimated variance of the estimate Output has a column with the variable name specified in the name argument with a suffix of “_var” cv: coefficient of variation The ratio of the standard error to the estimate Output has a column with the variable name specified in the name argument with a suffix of “_cv” The confidence intervals are always calculated using a symmetric t-distribution based method, given by the formula: \\[ \\text{estimate} \\pm t^*_{df}\\times SE\\] where \\(t^*_{df}\\) is the critical value from a t-distribution based on the confidence level and the degrees of freedom. By default, the degrees of freedom are based on the design or number of replicates, but they can be specified using the df argument. For survey design objects, the degrees of freedom are calculated as the number of PSUs minus the number of strata. For replicate-based objects, the degrees of freedom are calculated as one less than the rank of the matrix of replicate weights, where the number of replicates is typically the rank. Note that specifying df = Inf is equivalent to using a normal (z-based) confidence interval – this is the default in {survey}. These variability types are the same for most of the survey functions, and we will provide examples using different variability types throughout this chapter. 5.2.2 Examples Example 1: Estimated population count If we want to obtain the estimated number of households in the U.S. (the target population) using the Residential Energy Consumption Survey (RECS) data, we can use survey_count(). If we do not specify any variables in the survey_count() function, it will output the estimated population count (n) and its corresponding standard error (n_se). recs_des %&gt;% survey_count() ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 0.148 Based on this calculation, the estimated number of households in the U.S. is 123,529,025. 
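As a side note on the symmetric t-based interval described in the syntax section, the formula can be reproduced by hand in base R. The sketch below is illustrative only: the helper name ci_t() and all input values are hypothetical and are not taken from RECS or from {srvyr} internals.

```r
# Hypothetical helper: symmetric t-based confidence interval,
# estimate +/- t*_{df} x SE. All inputs below are made-up values.
ci_t <- function(estimate, se, df, level = 0.95) {
  # Two-sided critical value from the t-distribution
  t_star <- qt(1 - (1 - level) / 2, df = df)
  c(low = estimate - t_star * se, upp = estimate + t_star * se)
}

estimate <- 1000  # illustrative point estimate
se <- 25          # illustrative standard error
df <- 30          # assumed degrees of freedom

ci_t(estimate, se, df = df)    # t-based interval
ci_t(estimate, se, df = Inf)   # df = Inf reproduces the normal (z-based) interval
```

With finite degrees of freedom the interval is slightly wider than the z-based one, which is why the choice of df matters most for designs with few PSUs or replicates.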
Alternatively, we could also use the survey_tally() function. The example below yields the same results as survey_count(). recs_des %&gt;% survey_tally() ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 0.148 Example 2: Estimated counts by subgroups (crosstabs) To calculate the estimated number of observations for specific subgroups, such as Region and Division, we can include the variables of interest in the survey_count() function. In the example below, we calculate the estimated number of housing units by region and division. The argument name = in survey_count() allows us to change the name of the count variable in the output from the default n to N. recs_des %&gt;% survey_count(Region, Division, name = &quot;N&quot;) ## # A tibble: 10 × 4 ## Region Division N N_se ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast New England 5876166 0.0000000137 ## 2 Northeast Middle Atlantic 16043503 0.0000000487 ## 3 Midwest East North Central 18546912 0.000000437 ## 4 Midwest West North Central 8495815 0.0000000177 ## 5 South South Atlantic 24843261 0.0000000418 ## 6 South East South Central 7380717. 0.114 ## 7 South West South Central 14619094 0.000488 ## 8 West Mountain North 4615844 0.119 ## 9 West Mountain South 4602070 0.0000000492 ## 10 West Pacific 18505643. 0.00000295 When we run the crosstab, we see there are an estimated 5,876,166 housing units in the New England Division. The code will result in an error if we try to use the survey_count() syntax with survey_tally(): recs_des %&gt;% survey_tally(Region, Division, name = &quot;N&quot;) ## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3: ## ℹ In argument: `N = survey_total(Region, vartype = vartype, ## na.rm = TRUE)`. ## Caused by error: ## ! Factor not allowed in survey functions, should be used as a grouping variable. 
Use a group_by() function prior to using survey_tally() to successfully run the crosstab: recs_des %&gt;% group_by(Region, Division) %&gt;% survey_tally(name = &quot;N&quot;) ## # A tibble: 10 × 4 ## # Groups: Region [4] ## Region Division N N_se ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast New England 5876166 0.0000000137 ## 2 Northeast Middle Atlantic 16043503 0.0000000487 ## 3 Midwest East North Central 18546912 0.000000437 ## 4 Midwest West North Central 8495815 0.0000000177 ## 5 South South Atlantic 24843261 0.0000000418 ## 6 South East South Central 7380717. 0.114 ## 7 South West South Central 14619094 0.000488 ## 8 West Mountain North 4615844 0.119 ## 9 West Mountain South 4602070 0.0000000492 ## 10 West Pacific 18505643. 0.00000295 5.3 Totals and sums The survey_total() function is analogous to sum. It can be applied to continuous variables to obtain the estimated total quantity in a population. Starting from this point in the chapter, all the introduced functions must be called within summarize(). 
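To make the analogy with sum concrete, the point estimates behind a weighted count and a weighted total can be sketched in base R. The toy data frame, its column names, and its weights are hypothetical. Only point estimates are shown here, because standard errors require the design information held in the tbl_svy object.

```r
# Toy data: four respondents with hypothetical analysis weights.
toy <- data.frame(
  wt = c(100, 150, 200, 50),         # population units each respondent represents
  dollars = c(1200, 900, 1500, 700)  # a continuous variable
)

est_count <- sum(toy$wt)                # estimated population size: 500
est_total <- sum(toy$wt * toy$dollars)  # estimated population total: 590000

est_count
est_total
```

The count is the sum of the weights, while the total is the weighted sum of the variable.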
5.3.1 Syntax Here is the syntax: survey_total( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, deff = FALSE, df = NULL ) The arguments are: x: a variable, expression, or empty na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: a number or a vector indicating the confidence level, defaults to 0.95 deff: a logical value stating whether the design effect should be returned, defaults to FALSE (this is described in more detail in Section 5.9.3) df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution 5.3.2 Examples Example 1: Estimated population count To calculate a population count estimate with survey_total(), we leave the argument x empty as shown in the example below: recs_des %&gt;% summarize(Tot = survey_total()) ## # A tibble: 1 × 2 ## Tot Tot_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 0.148 The estimated number of households in the U.S. is 123,529,025. Note that this result obtained from recs_des %&gt;% summarize(survey_total()) is equivalent to the ones from the survey_count() and survey_tally() functions. However, the survey_total() function is called within summarize, whereas survey_count() and survey_tally() are not. Example 2: Overall summation of continuous variables The distinction between survey_total() and survey_count() becomes more evident when working with continuous variables. Let’s compute the total cost of electricity in whole dollars from variable DOLLAREL4. recs_des %&gt;% summarize(elec_bill = survey_total(DOLLAREL)) ## # A tibble: 1 × 2 ## elec_bill elec_bill_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 170473527909. 664893504. 
It is estimated that American residential households spent a total of $170,473,527,909 on electricity in 2020, and the estimate has a standard error of $664,893,504. Example 3: Summation by groups Since we are using the {srvyr} package, we can use group_by() to calculate the cost of electricity for different groups. Let’s examine the variations in the cost of electricity in whole dollars across regions and display the confidence interval instead of the default standard error. recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_total(DOLLAREL, vartype = &quot;ci&quot;)) ## # A tibble: 4 × 4 ## Region elec_bill elec_bill_low elec_bill_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 29430369947. 28788987554. 30071752341. ## 2 Midwest 34972544751. 34339576041. 35605513460. ## 3 South 72496840204. 71534780902. 73458899506. ## 4 West 33573773008. 32909111702. 34238434313. The survey results estimate that households in the Northeast spent $29,430,369,947 with a confidence interval of ($28,788,987,554, $30,071,752,341) on electricity in 2020, while households in the South spent an estimated $72,496,840,204 with a confidence interval of ($71,534,780,902, $73,458,899,506). As we calculate these numbers, we may notice that the confidence interval of the South is wider than those of other regions. This implies that we have less certainty about the true value of electricity spending in the South. A wider confidence interval could be due to a variety of factors, such as a wider range of electricity spending in the South. We could try to analyze smaller regions within the South to identify areas that are contributing to more variability. Descriptive analyses serve as a valuable starting point for more in-depth exploration and analysis. 5.4 Means and proportions Means and proportions form the backbone of many research studies. These estimates are often the first things we look for when reviewing research on a given topic. 
The survey_mean() and survey_prop() functions calculate means and proportions while taking into account the survey design elements. The survey_mean() function should be used on continuous variables of survey data, while the survey_prop() function should be used on categorical variables. These topics are grouped together because a proportion is a mean of a logical (Boolean) variable. 5.4.1 Syntax The syntax for means and proportions is very similar: survey_mean( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, proportion = FALSE, prop_method = c(&quot;logit&quot;, &quot;likelihood&quot;, &quot;asin&quot;, &quot;beta&quot;, &quot;mean&quot;), deff = FALSE, df = NULL ) survey_prop( na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, proportion = TRUE, prop_method = c(&quot;logit&quot;, &quot;likelihood&quot;, &quot;asin&quot;, &quot;beta&quot;, &quot;mean&quot;, &quot;xlogit&quot;), deff = FALSE, df = NULL ) Both functions have the following arguments and defaults: na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: a number or a vector indicating the confidence level, defaults to 0.95 prop_method: the method used to calculate confidence intervals for proportions deff: a logical value stating whether the design effect should be returned, defaults to FALSE (this is described in more detail in Section 5.9.3) df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution There are two main differences in the syntax. The survey_mean() function includes the first argument x, representing the variable or expression on which the mean should be calculated. 
The survey_prop() does not have an argument to include the variables directly. Instead, prior to summarize(), we must use the group_by() function to specify the variables of interest for survey_prop(). For survey_mean(), including a group_by() function allows us to obtain the means by different groups. The other main difference is with the proportion argument. The survey_mean() function can be used to calculate both means and proportions. Its proportion argument defaults to FALSE, indicating it is used for calculating means. If we wish to calculate a proportion using survey_mean(), we will need to set the proportion argument to TRUE. In the survey_prop() function, the proportion argument defaults to TRUE because the function is specifically designed for calculating proportions. In section 5.2.1, we provide an overview of different variability types. The confidence interval used for most measures, such as means and counts, is referred to as a Wald-type interval. However, for proportions, a Wald-type interval with a symmetric t-based confidence interval may not provide accurate coverage, especially when dealing with small sample sizes or proportions “near” 0 or 1. We can use other methods to calculate confidence intervals, which we specify using the prop_method option in survey_prop(). The options include: logit: fits a logistic regression model and computes a Wald-type interval on the log-odds scale, which is then transformed to the probability scale. This is the default method. likelihood: uses the (Rao-Scott) scaled chi-squared distribution for the log-likelihood from a binomial distribution. asin: uses the variance-stabilizing transformation for the binomial distribution, the arcsine square root, and then back-transforms the interval to the probability scale beta: uses the incomplete beta function with an effective sample size based on the estimated variance of the proportion. 
mean: the Wald-type interval (\\(\\pm t_{df}^*\\times SE\\)) xlogit: uses a logit transformation of the proportion, calculates a Wald-type interval, and then back-transforms to the probability scale. This method is the same as those used in SUDAAN and SPSS. Each option will yield slightly different confidence interval bounds when dealing with proportions. Please note that when working with survey_mean(), we do not need to specify a method unless the proportion argument is TRUE. If proportion is FALSE, it calculates a symmetric mean type of confidence interval. 5.4.2 Examples Example 1: One variable proportion If we are interested in obtaining the proportion of housing units in each region in the RECS data, we can use group_by() and survey_prop() as shown below: recs_des %&gt;% group_by(Region) %&gt;% summarize(p = survey_prop()) ## When `proportion` is unspecified, `survey_prop()` now defaults to `proportion = TRUE`. ## ℹ This should improve confidence interval coverage. ## This message is displayed once per session. ## # A tibble: 4 × 3 ## Region p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 0.177 0.000000000212 ## 2 Midwest 0.219 0.000000000262 ## 3 South 0.379 0.000000000740 ## 4 West 0.224 0.000000000816 An estimated 17.7% of households are in the Northeast, 21.9% in the Midwest, and so on. Note that the proportions in column p add up to one. The survey_prop() function is essentially the same as using survey_mean() with a categorical variable and without specifying a numeric variable in the x argument. The following code will give us the same results as above: recs_des %&gt;% group_by(Region) %&gt;% summarize(p = survey_mean()) ## # A tibble: 4 × 3 ## Region p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 0.177 0.000000000212 ## 2 Midwest 0.219 0.000000000262 ## 3 South 0.379 0.000000000740 ## 4 West 0.224 0.000000000816 Example 2: Conditional proportions We can also obtain proportions by more than one variable. 
In the following example, we look at the proportion of housing units by Region and whether air conditioning is used (ACUsed).5 recs_des %&gt;% group_by(Region, ACUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 8 × 4 ## # Groups: Region [4] ## Region ACUsed p p_se ## &lt;fct&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast FALSE 0.110 0.00590 ## 2 Northeast TRUE 0.890 0.00590 ## 3 Midwest FALSE 0.0666 0.00508 ## 4 Midwest TRUE 0.933 0.00508 ## 5 South FALSE 0.0581 0.00278 ## 6 South TRUE 0.942 0.00278 ## 7 West FALSE 0.255 0.00759 ## 8 West TRUE 0.745 0.00759 When specifying multiple variables, the proportions are conditional. In the results above, notice that the proportions sum to 1 within each region. This can be interpreted as the proportion of housing units with air conditioning within each region. For example, in the Northeast region, approximately 11.0% of housing units don’t have air conditioning, while around 89.0% have air conditioning. Example 3: Joint proportions If we’re interested in a joint proportion, we use the interact() function. In the example below, we apply the interact() function to Region and ACUsed: recs_des %&gt;% group_by(interact(Region, ACUsed)) %&gt;% summarize(p = survey_prop()) ## # A tibble: 8 × 4 ## Region ACUsed p p_se ## &lt;fct&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast FALSE 0.0196 0.00105 ## 2 Northeast TRUE 0.158 0.00105 ## 3 Midwest FALSE 0.0146 0.00111 ## 4 Midwest TRUE 0.204 0.00111 ## 5 South FALSE 0.0220 0.00106 ## 6 South TRUE 0.357 0.00106 ## 7 West FALSE 0.0573 0.00170 ## 8 West TRUE 0.167 0.00170 As noted earlier, we can use both the survey_prop() and survey_mean() functions, and they will produce the same results. Example 4: Overall mean Below, we calculate the estimated average cost of electricity in the U.S. using survey_mean(). 
To include both the standard error and the confidence interval, we can include them in the vartype argument: recs_des %&gt;% summarize(elec_bill = survey_mean(DOLLAREL, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## elec_bill elec_bill_se elec_bill_low elec_bill_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 1369. 1391. Nationally, the average household spent $1,380 in 2020. Example 5: Means by subgroup We can also calculate the estimated average cost of electricity in the U.S. by each region. To do this, we include a group_by() function with the variable of interest before the summarize() function: recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_mean(DOLLAREL)) ## # A tibble: 4 × 3 ## Region elec_bill elec_bill_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 Households from the West spent approximately $1,211, while in the South, the average spending was $1,548. 5.5 Quantiles and medians To better understand the distribution of a continuous variable like income, we can calculate quantiles at specific points. For example, computing estimates of the quartiles (25%, 50%, 75%) helps us understand how income is spread across the population. We use the survey_quantile() function to calculate quantiles in survey data. Medians are useful for finding the midpoint of a continuous distribution when the data is skewed, as medians are less affected by outliers than means. The median is the same as the 50th percentile, meaning the value where 50% of the data is higher and 50% is lower. Because medians are a special, common case of quantiles, we have a dedicated function called survey_median() for calculating the median in survey data. Alternatively, we can use the survey_quantile() function with the quantiles argument set to 0.5 to achieve the same result. 
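To build intuition for what a weighted quantile is, the sketch below computes one in base R using a simple inverse-CDF convention: the smallest value at which the cumulative share of weight reaches p. The helper weighted_quantile() and the toy data are hypothetical. This convention is an assumption chosen for illustration and should not be read as the exact rule that survey_quantile() applies.

```r
# Hypothetical helper: smallest x whose cumulative weight share reaches p.
weighted_quantile <- function(x, w, p) {
  ord <- order(x)                  # sort values, carrying the weights along
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)  # weighted empirical CDF
  x[which(cum_share >= p)[1]]
}

x <- c(10, 20, 30, 40)  # toy values
w <- c(1, 1, 1, 5)      # the last respondent represents many more units

weighted_quantile(x, w, 0.5)          # weighted median: 40
weighted_quantile(x, rep(1, 4), 0.5)  # equal weights: 20
```

Because the heavily weighted respondent dominates the population distribution, the weighted median moves toward that respondent's value.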
5.5.1 Syntax The syntax for survey_quantile() and survey_median() is nearly identical: survey_quantile( x, quantiles, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, interval_type = c(&quot;mean&quot;, &quot;beta&quot;, &quot;xlogit&quot;, &quot;asin&quot;, &quot;score&quot;, &quot;quantile&quot;), qrule = c(&quot;math&quot;, &quot;school&quot;, &quot;shahvaish&quot;, &quot;hf1&quot;, &quot;hf2&quot;, &quot;hf3&quot;, &quot;hf4&quot;, &quot;hf5&quot;, &quot;hf6&quot;, &quot;hf7&quot;, &quot;hf8&quot;, &quot;hf9&quot;), df = NULL ) survey_median( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, interval_type = c(&quot;mean&quot;, &quot;beta&quot;, &quot;xlogit&quot;, &quot;asin&quot;, &quot;score&quot;, &quot;quantile&quot;), qrule = c(&quot;math&quot;, &quot;school&quot;, &quot;shahvaish&quot;, &quot;hf1&quot;, &quot;hf2&quot;, &quot;hf3&quot;, &quot;hf4&quot;, &quot;hf5&quot;, &quot;hf6&quot;, &quot;hf7&quot;, &quot;hf8&quot;, &quot;hf9&quot;), df = NULL ) The arguments available in both functions are: x: a variable, expression, or empty na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate, defaults to se (standard error) level: a number or a vector indicating the confidence level, defaults to 0.95 interval_type: method for calculating a confidence interval qrule: rule for defining quantiles. The default is the lower end of the quantile interval (“math”). The midpoint of the quantile interval is the “school” rule. “hf1” to “hf9” are weighted analogs to type=1 to 9 in quantile(). “shahvaish” corresponds to a rule proposed by Shah and Vaish (2006). See vignette(\"qrule\", package=\"survey\") for more information. 
df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution The only difference between survey_quantile() and survey_median() is the inclusion of the quantiles argument in the survey_quantile() function. This argument takes a vector with values between 0 and 1 to indicate which quantiles to calculate. For example, if we wanted the quartiles of a variable, we would provide quantiles = c(0.25, 0.5, 0.75). While we can specify quantiles of 0 and 1, which represent the minimum and maximum, this is not recommended. It only returns the minimum and maximum of the respondents and cannot be extrapolated to the population as there is no valid definition of standard error. In Section 5.2.1, we provide an overview of the different variability types. The interval used in confidence intervals for most measures, such as means and counts, is referred to as a Wald-type interval. However, this is not always the most accurate interval for quantiles. Similar to confidence intervals for proportions, quantiles have various interval types including asin, beta, mean, and xlogit (see Section 5.4.1). Quantiles also have two more methods available: score: the Francisco and Fuller confidence interval based on inverting a score test (only available for design-based survey objects and not replicate-based objects) quantile: based on the replicates of the quantile. This is not valid for jackknife-type replicates but is available for bootstrap and BRR replicates. One note with the score method is that when there are numerous ties in the data, this method may produce confidence intervals that do not contain the estimate. When dealing with a high propensity for ties (e.g., many respondents have the same age), it is recommended to use another method. SUDAAN, for example, uses the score method but adds noise to the values to prevent issues. 
The documentation in the {survey} package indicates in general, the score method may have poorer performance compared to the beta and logit intervals (Lumley 2010). 5.5.2 Examples Example 1: Overall quartiles Quantiles provide insights into the distribution of a variable. Let’s look into the quartiles, specifically, the first quartile (p=0.25), the median (p=0.5), and the third quartile (p=0.75) of electric bills. recs_des %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0.25, .5, 0.75))) %&gt;% print(width=Inf) ## # A tibble: 1 × 6 ## elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 795. 1215. 1770. 5.69 ## elec_bill_q50_se elec_bill_q75_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 6.33 9.99 The output above shows the values for the three quartiles and their respective standard errors: the 25th percentile is $795 with a standard error of $5.69, the 50th percentile (median) is $1,215 with a standard error of $6.33, and the 75th percentile is $1,770 with a standard error of $9.99. Example 2: Quartiles by subgroup We can estimate the quantiles of electric bills by region by using the group_by() function: recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0.25, .5, 0.75))) %&gt;% print(width = Inf) ## # A tibble: 4 × 7 ## Region elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 740. 1148. 1712. 13.7 ## 2 Midwest 769. 1149. 1632. 8.88 ## 3 South 968. 1402. 1945. 10.6 ## 4 West 623. 1028. 1568. 10.8 ## elec_bill_q50_se elec_bill_q75_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 16.6 25.8 ## 2 11.6 18.6 ## 3 9.17 13.9 ## 4 14.3 20.5 The 25th percentile for the Northeast region is $740 while it is $968 for the South. Example 3: Minimum and maximum As mentioned in the syntax section, we can specify quantiles of 0 (minimum) and 1 (maximum) and R will calculate these values. 
However, these are only the minimum and maximum values in the data, and there is not enough information to determine their standard errors: recs_des %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0, 1))) ## # A tibble: 1 × 4 ## elec_bill_q00 elec_bill_q100 elec_bill_q00_se elec_bill_q100_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 -151. 15680. NaN 0 The minimum cost of electricity in the dataset is -$151 while the maximum is $15,680, but the standard errors are shown as NaN and 0, respectively. Notice that the minimum cost is negative. This may be surprising, but some housing units with solar power sell energy back to the grid, and the resulting income is recorded as a negative expenditure. Example 4: Overall median We can calculate the estimated median cost of electricity in the U.S. using the survey_median() function: recs_des %&gt;% summarize(elec_bill = survey_median(DOLLAREL)) ## # A tibble: 1 × 2 ## elec_bill elec_bill_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1215. 6.33 Nationally, the median household spent $1,215 in 2020. This is the same result as we obtained using the survey_quantile() function. Interestingly, the average electric bill for households that we calculated in section 5.4 is $1,380, but the estimated median electric bill is $1,215, indicating the distribution is likely right-skewed. Example 5: Medians by subgroup We can calculate the estimated median cost of electricity in the U.S. by region using the group_by() function with the variable(s) of interest before the summarize() function, similar to when we found the mean by region. recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_median(DOLLAREL)) ## # A tibble: 4 × 3 ## Region elec_bill elec_bill_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1148. 16.6 ## 2 Midwest 1149. 11.6 ## 3 South 1402. 9.17 ## 4 West 1028. 14.3 The median household in the Northeast spent $1,148 on electricity, while the median household in the South spent $1,402. 
5.6 Ratios A ratio estimate is the ratio of the sums of two variables, specifically in the form of: \\[ \\frac{\\sum x_i}{\\sum y_i}.\\] Note that the ratio is not the same as calculating the following: \\[ \\frac{1}{N} \\sum \\frac{x_i}{y_i} \\] which can be calculated with survey_mean() by creating a derived variable \\(z=x/y\\) and then calculating the mean of \\(z\\). Say we wanted to assess the energy efficiency of homes in a standardized way, where we can compare homes of different sizes. We can calculate the ratio of energy consumption to the square footage of a home. This helps us meaningfully compare homes of different sizes by identifying how much energy is being used per unit of space. To calculate this ratio, we would run survey_ratio(Energy Consumption in BTUs, Square Footage of Home). If, instead, we used survey_mean(Energy Consumption in BTUs/Square Footage of Home), we would estimate the average energy consumption per square foot of all surveyed homes. While helpful in understanding general energy use, this statistic does not account for differences in home sizes. 
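The difference between the two formulas above can be seen with a small base R sketch. The toy data frame and its column names (wt, btu, sqft) are hypothetical, and only point estimates are computed.

```r
# Toy data: three homes with equal hypothetical weights.
toy <- data.frame(
  wt = c(1, 1, 1),
  btu = c(100, 200, 900),   # energy consumption
  sqft = c(100, 100, 300)   # square footage
)

# Ratio of weighted totals: sum(x) / sum(y)
ratio_of_totals <- sum(toy$wt * toy$btu) / sum(toy$wt * toy$sqft)

# Weighted mean of the per-home ratios z = x / y
mean_of_ratios <- sum(toy$wt * toy$btu / toy$sqft) / sum(toy$wt)

ratio_of_totals  # 1200 / 500 = 2.4
mean_of_ratios   # (1 + 2 + 3) / 3 = 2
```

The large home contributes more to the ratio of totals than to the mean of ratios, so the two statistics differ whenever the per-unit rate varies with the size of the denominator.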
5.6.1 Syntax The syntax for survey_ratio() is as follows: survey_ratio( numerator, denominator, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, deff = FALSE, df = NULL ) The arguments are: numerator: The numerator of the ratio denominator: The denominator of the ratio na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: A single number or vector of numbers indicating the confidence level deff: A logical value to indicate whether the design effect should be returned (this is described in more detail in Section 5.9.3) df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.6.2 Examples Example 1: Overall ratios Suppose we wanted to find the ratio of dollars spent on liquid propane per unit (in British thermal unit [Btu]) nationally6. To find the average cost to a household, we can use survey_mean(). However, to find the national unit rate, we can use survey_ratio(). In the following example, we show both methods and discuss the interpretation of each: recs_des %&gt;% summarize( DOLLARLP_Tot = survey_total(DOLLARLP, vartype = NULL), BTULP_Tot = survey_total(BTULP, vartype = NULL), DOL_BTU_Rat = survey_ratio(DOLLARLP, BTULP), DOL_BTU_Avg = survey_mean(DOLLARLP / BTULP, na.rm = TRUE) ) %&gt;% print(width = Inf) ## # A tibble: 1 × 6 ## DOLLARLP_Tot BTULP_Tot DOL_BTU_Rat DOL_BTU_Rat_se DOL_BTU_Avg ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 8122911173. 391425311586. 0.0208 0.000240 0.0240 ## DOL_BTU_Avg_se ## &lt;dbl&gt; ## 1 0.000223 The ratio of the total spent on liquid propane to the total consumption was 0.0208, but the average rate was 0.024. 
With a bit of calculation, we can show that the ratio is the ratio of the totals DOLLARLP_Tot/BTULP_Tot=8,122,911,173/391,425,311,586=0.0208. Although the ratio can be calculated manually in this manner, the standard error requires the use of the survey_ratio() function. The average can be interpreted as the average rate paid by a household. Example 2: Ratios by subgroup As previously done with other estimates, we can use group_by() to examine whether this ratio varies by region. recs_des %&gt;% group_by(Region) %&gt;% summarize(DOL_BTU_Rat = survey_ratio(DOLLARLP, BTULP)) %&gt;% arrange(DOL_BTU_Rat) ## # A tibble: 4 × 3 ## Region DOL_BTU_Rat DOL_BTU_Rat_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Midwest 0.0158 0.000240 ## 2 South 0.0245 0.000388 ## 3 West 0.0246 0.000875 ## 4 Northeast 0.0247 0.000488 Although not a formal statistical test, it appears that the cost ratios for liquid propane are the lowest in the Midwest (0.0158). 5.7 Correlations The correlation is a measure of the linear relationship between two continuous variables, which ranges between -1 and 1. The most commonly used method is Pearson’s correlation (referred to as correlation henceforth). A sample correlation for a simple random sample is calculated as follows: \\[\\frac{\\sum (x_i-\\bar{x})(y_i-\\bar{y})}{\\sqrt{\\sum (x_i-\\bar{x})^2} \\sqrt{\\sum(y_i-\\bar{y})^2}} \\] When using survey_corr() for designs other than a simple random sample, the weights are applied when estimating the correlation. 
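To illustrate how weights enter the correlation formula above, here is a minimal base R sketch of a weighted Pearson correlation in which each term of the sums is multiplied by its weight. The helper weighted_corr() and the toy vectors are hypothetical. Whether survey_corr() uses exactly this estimator internally is an assumption, and no standard error is computed here.

```r
# Hypothetical helper: weighted Pearson correlation (point estimate only).
weighted_corr <- function(x, y, w) {
  mx <- sum(w * x) / sum(w)            # weighted mean of x
  my <- sum(w * y) / sum(w)            # weighted mean of y
  sxy <- sum(w * (x - mx) * (y - my))  # weighted cross-product
  sxx <- sum(w * (x - mx)^2)
  syy <- sum(w * (y - my)^2)
  sxy / sqrt(sxx * syy)
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 6)
w <- c(10, 20, 10, 30, 30)  # toy weights

weighted_corr(x, y, w)          # weighted estimate
weighted_corr(x, y, rep(1, 5))  # equal weights reduce to cor(x, y)
```

With equal weights the expression reduces to the simple random sample formula, which is a quick check that the weighting has been applied consistently.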
5.7.1 Syntax The syntax for survey_corr() is as follows: survey_corr( x, y, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, df = NULL ) The arguments are: x: A variable or expression y: A variable or expression na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: (For vartype = “ci” only) A single number or vector of numbers indicating the confidence level df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.7.2 Examples Example 1: Overall correlation We can calculate the correlation between total square footage of homes (TOTSQFT_EN)7 and electricity consumption (BTUEL)8. recs_des %&gt;% summarize(SQFT_Elec_Corr = survey_corr(TOTSQFT_EN, BTUEL)) ## # A tibble: 1 × 2 ## SQFT_Elec_Corr SQFT_Elec_Corr_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 0.417 0.00689 The correlation between total square footage of homes and electricity consumption is 0.417, indicating a moderate positive relationship. Example 2: Correlations by subgroup Like with other statistics, we can explore correlations within subgroups, such as whether air conditioning is used (ACUsed). Note that the code below correlates total square footage with electricity expenditure (DOLLAREL) rather than electricity consumption (BTUEL). recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(SQFT_Elec_Corr = survey_corr(TOTSQFT_EN, DOLLAREL)) ## # A tibble: 2 × 3 ## ACUsed SQFT_Elec_Corr SQFT_Elec_Corr_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.290 0.0240 ## 2 TRUE 0.401 0.00808 For homes without air conditioning, there is a moderate positive correlation between total square footage and electricity expenditure (0.29). For homes with air conditioning, the correlation of 0.401 indicates a stronger positive correlation between total square footage and electricity expenditure. 
5.8 Standard deviation and variance All survey functions produce an estimate of the variability of a given estimate. No additional function is needed when dealing with variable estimates. However, if we are specifically interested in population variance and standard deviation, we can use the survey_var() and survey_sd() functions. In our experience, it is not common practice to use these functions. They can be used when designing a future study to gauge population variability and inform sampling precision. 5.8.1 Syntax As with non-survey data, the standard deviation estimate is the square root of the variance estimate. Therefore, the survey_var() and survey_sd() functions share the same arguments, except the standard deviation does not allow the usage of vartype. survey_var( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;), level = 0.95, df = NULL ) survey_sd( x, na.rm = FALSE ) The arguments are: x: A variable or expression, or empty na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\"), defaults to se (standard error) (see 5.2.1 for more information) level: (For vartype = “ci” only) A single number or vector of numbers indicating the confidence level. df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.8.2 Examples Example 1: Overall variability Let’s return to electricity bills and explore the variability in electricity expenditure. recs_des %&gt;% summarize(var_elbill = survey_var(DOLLAREL), sd_elbill = survey_sd(DOLLAREL)) ## # A tibble: 1 × 3 ## var_elbill var_elbill_se sd_elbill ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 704906. 13926. 840. We may encounter a warning related to a deprecation in the underlying calculations performed by the survey_var() function. This warning is a result of changes in the way R handles recycling in vectorized operations. 
The results are still valid. They give an estimate of the population variance of electricity bills (var_elbill), the standard error of that variance (var_elbill_se), and the estimated population standard deviation of electricity bills (sd_elbill). Note that no standard error is associated with the standard deviation - this is the only estimate that does not include a standard error. Example 2: Variability by subgroup To find out if the variability in electricity expenditure is similar across regions, we can calculate the variance by region using group_by(): recs_des %&gt;% group_by(Region) %&gt;% summarize(var_elbill = survey_var(DOLLAREL), sd_elbill = survey_sd(DOLLAREL)) ## # A tibble: 4 × 4 ## Region var_elbill var_elbill_se sd_elbill ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 775450. 38843. 881. ## 2 Midwest 552423. 25252. 743. ## 3 South 702521. 30641. 838. ## 4 West 717886. 30597. 847. 5.9 Additional topics 5.9.1 Unweighted analysis Sometimes, it is helpful to calculate an unweighted estimate of a given variable. For this, we use the unweighted() function in the summarize() function. The unweighted() function calculates unweighted summaries from a tbl_svy object, providing the summary among the respondents without extrapolating to a population estimate. The unweighted() function can be used in conjunction with any {dplyr} functions. Here is an example looking at the average household electricity cost: recs_des %&gt;% summarize(elec_bill = survey_mean(DOLLAREL), elec_unweight = unweighted(mean(DOLLAREL))) ## # A tibble: 1 × 3 ## elec_bill elec_bill_se elec_unweight ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 1425. It is estimated that American residential households spent an average of $1,380 on electricity in 2020, and the estimate has a standard error of $5.38. 
The unweighted() function calculates the unweighted average and represents the average amount of money spent on electricity in 2020 by the respondents, which was $1,425. 5.9.2 Subpopulation analysis We mentioned using filter() to subset a survey object for analysis. This operation should be done after creating the survey design object. In rare circumstances, subsetting data before creating the object can lead to incorrect variability estimates. This may occur if subsetting removes an entire Primary Sampling Unit (PSU; see Chapter 10 for more information on PSUs and sample designs). Suppose we want estimates of the average amount spent on natural gas among housing units using natural gas (based on the variable BTUNG)9. We first filter records to only include records where BTUNG &gt; 0 and then find the average amount of money spent. recs_des %&gt;% filter(BTUNG &gt; 0) %&gt;% summarize(NG_mean = survey_mean(DOLLARNG, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## NG_mean NG_mean_se NG_mean_low NG_mean_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 631. 4.64 621. 640. The estimated average amount spent on natural gas is $631. Note that applying the filter to include only housing units that use natural gas yields a higher mean than when not applying the filter. This is because including housing units that do not use natural gas introduces many $0 amounts, impacting the mean calculation. recs_des %&gt;% summarize(NG_mean = survey_mean(DOLLARNG, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## NG_mean NG_mean_se NG_mean_low NG_mean_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 382. 3.41 375. 389. Based on this calculation, the estimated average amount spent on natural gas is $382. 5.9.3 Design effects The design effect measures how the precision of an estimate is influenced by the sampling design. 
In other words, it measures how much more or less statistically efficient the survey design is compared to a simple random sample (SRS). It is computed by taking the ratio of the estimate’s variance under the design at hand to the estimate’s variance under a simple random sample without replacement. A design effect less than 1 indicates that the design is more statistically efficient than an SRS design, which is rare but possible in a stratified sampling design where the outcome correlates with the stratification variable(s). A design effect greater than 1 indicates that the design is less statistically efficient than an SRS design. From a design effect, we can calculate the effective sample size as follows: \\[n_{eff}=\\frac{n}{D_{eff}} \\] where \\(n\\) is the nominal sample size (the number of survey responses) and \\(D_{eff}\\) is the estimated design effect. We can interpret the effective sample size \\(n_{eff}\\) as the hypothetical sample size that a survey using an SRS design would need to achieve the same precision as the design at hand. Design effects are specific to each outcome: outcomes that are less clustered in the population have smaller design effects than outcomes that are clustered. In the {srvyr} package, design effects can be calculated for totals, proportions, means, and ratio estimates by setting the deff argument to TRUE in the corresponding functions. 
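As a quick numeric illustration of the effective sample size formula, consider purely hypothetical values (not taken from RECS or ANES): a survey with \\(n = 1000\\) responses and an estimated design effect of \\(D_{eff} = 1.25\\) for some outcome. Then \\[n_{eff}=\\frac{1000}{1.25}=800 \\] so the estimate is about as precise as one from an SRS of 800 responses. If instead \\(D_{eff} = 0.8\\) for another outcome in the same survey, then \\[n_{eff}=\\frac{1000}{0.8}=1250 \\] and the design is as precise as an SRS of 1,250 responses for that outcome.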
In the example below, we calculate the design effects for the average consumption of electricity (BTUEL), natural gas (BTUNG), liquid propane (BTULP), fuel oil (BTUFO), and wood (BTUWOOD) by setting deff = TRUE: recs_des %&gt;% summarize(across( c(BTUEL, BTUNG, BTULP, BTUFO, BTUWOOD), ~ survey_mean(.x, deff = TRUE, vartype = NULL) )) %&gt;% select(ends_with(&quot;deff&quot;)) ## # A tibble: 1 × 5 ## BTUEL_deff BTUNG_deff BTULP_deff BTUFO_deff BTUWOOD_deff ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.597 0.938 1.21 0.720 1.10 For the values less than 1 (BTUEL_deff, BTUNG_deff, and BTUFO_deff), the results suggest that the survey design is more efficient than a simple random sample. For the values greater than 1 (BTULP_deff and BTUWOOD_deff), the results indicate that the survey design is less efficient than a simple random sample. 5.9.4 Creating summary rows When using group_by() in analysis, the results are returned with a row for each group or combination of groups. Often, we want both the breakdowns by group and a summary row for the estimate representing the entire population. For example, we may want the average electricity consumption by region and nationally. The {srvyr} package has the convenient cascade() function, which adds summary rows for the total of a group. It is used in place of summarize() and has similar functionalities along with some additional features. Syntax The syntax is as follows: cascade( .data, ..., .fill = NA, .fill_level_top = FALSE, .groupings = NULL ) where the arguments are: .data: A tbl_svy object ...: Name-value pairs of summary functions (same as the summarize() function) .fill: Value to fill in for group summaries (defaults to NA) .fill_level_top: When filling factor variables, whether to put the value ‘.fill’ in the first position (defaults to FALSE, placing it at the bottom). Example First, let’s look at an example where we calculate the average household electricity cost. 
Then, we build on it to examine the features of the cascade() function. In the first example below, we calculate the average household electricity cost (DOLLAREL_mn) using survey_mean() without modifying any of the argument defaults in the function: recs_des %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL)) ## # A tibble: 1 × 2 ## DOLLAREL_mn DOLLAREL_mn_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 Next, let’s group the results by region by adding group_by() before the cascade() function: recs_des %&gt;% group_by(Region) %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL)) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 ## 5 &lt;NA&gt; 1380. 5.38 We can see the estimated average electricity bills by region: $1,343 for the Northeast, $1,548 for the South, and so on. The last row, where Region = NA, is the national average electricity bill, $1,380. However, naming the national “region” as NA is not very informative. We can give it a better name using the .fill argument. recs_des %&gt;% group_by(Region) %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL), .fill = &quot;National&quot;) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 ## 5 National 1380. 5.38 We can move the summary row to the first row by adding .fill_level_top = TRUE to cascade(): recs_des %&gt;% group_by(Region) %&gt;% cascade( DOLLAREL_mn = survey_mean(DOLLAREL), .fill = &quot;National&quot;, .fill_level_top = TRUE ) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 National 1380. 5.38 ## 2 Northeast 1343. 14.6 ## 3 Midwest 1293. 11.7 ## 4 South 1548. 10.3 ## 5 West 1211. 12.0 While the results remain the same, the table is now easier to interpret. 
5.9.5 Calculating estimates for many outcomes Often, we are interested in a summary statistic across many variables. Useful tools include the across() function in {dplyr}, shown a few times above, and the map() function in {purrr}. The across() function allows you to apply the same function to multiple columns within summarize(). This works well with all functions shown above, except for survey_prop(). In a later example, we will tackle summarizing multiple proportions. Example 1: across() Suppose we want to calculate the total and average consumption, along with coefficients of variation (CV), for each fuel type. These include the reported consumption of electricity (BTUEL), natural gas (BTUNG), liquid propane (BTULP), fuel oil (BTUFO), and wood (BTUWOOD), as mentioned in the section on design effects. We can take advantage of the fact that these are the only variables that start with “BTU” by selecting them with starts_with(\"BTU\") in the across() function. For each selected column (.x), across() creates a list of two functions to be applied: survey_total() to calculate the total and survey_mean() to calculate the mean, along with their CV (vartype = \"cv\"). Finally, .unpack = \"{outer}.{inner}\" specifies that the resulting column names are a concatenation of the variable name, followed by Total or Mean, and then “coef” or “cv”. consumption_ests &lt;- recs_des %&gt;% summarize(across( starts_with(&quot;BTU&quot;), list( Total = ~ survey_total(.x, vartype = &quot;cv&quot;), Mean = ~ survey_mean(.x, vartype = &quot;cv&quot;) ), .unpack = &quot;{outer}.{inner}&quot; )) consumption_ests ## # A tibble: 1 × 20 ## BTUEL_Total.coef BTUEL_Total._cv BTUEL_Mean.coef BTUEL_Mean._cv ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 4453284510065 0.00377 36051. 
0.00377 ## # ℹ 16 more variables: BTUNG_Total.coef &lt;dbl&gt;, BTUNG_Total._cv &lt;dbl&gt;, ## # BTUNG_Mean.coef &lt;dbl&gt;, BTUNG_Mean._cv &lt;dbl&gt;, ## # BTULP_Total.coef &lt;dbl&gt;, BTULP_Total._cv &lt;dbl&gt;, ## # BTULP_Mean.coef &lt;dbl&gt;, BTULP_Mean._cv &lt;dbl&gt;, ## # BTUFO_Total.coef &lt;dbl&gt;, BTUFO_Total._cv &lt;dbl&gt;, ## # BTUFO_Mean.coef &lt;dbl&gt;, BTUFO_Mean._cv &lt;dbl&gt;, ## # BTUWOOD_Total.coef &lt;dbl&gt;, BTUWOOD_Total._cv &lt;dbl&gt;, … The estimated total consumption of electricity (BTUEL) is 4,453,284,510,065 (BTUEL_Total.coef), the estimated average consumption is 36,051 (BTUEL_Mean.coef), and the CV is 0.0038. In the example above, the table was quite wide. We may prefer a row for each fuel type. Using the pivot_longer() and pivot_wider() functions from {tidyr} can help us achieve this. First, we use pivot_longer() to make each variable a column, changing the data to a “long” format. We use the names_to argument to specify new column names: FuelType, Stat, and Type. Then, the names_pattern argument extracts the names in the original column names based on the regular expression pattern BTU(.*)_(.*)\\\\.(.*). They are saved in the column names defined in names_to. consumption_ests_long &lt;- consumption_ests %&gt;% pivot_longer( cols = everything(), names_to = c(&quot;FuelType&quot;, &quot;Stat&quot;, &quot;Type&quot;), names_pattern = &quot;BTU(.*)_(.*)\\\\.(.*)&quot; ) consumption_ests_long ## # A tibble: 20 × 4 ## FuelType Stat Type value ## &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 EL Total coef 4453284510065 ## 2 EL Total _cv 0.00377 ## 3 EL Mean coef 36051. ## 4 EL Mean _cv 0.00377 ## 5 NG Total coef 4240769382106. ## 6 NG Total _cv 0.00908 ## 7 NG Mean coef 34330. ## 8 NG Mean _cv 0.00908 ## 9 LP Total coef 391425311586. ## 10 LP Total _cv 0.0380 ## 11 LP Mean coef 3169. ## 12 LP Mean _cv 0.0380 ## 13 FO Total coef 395699976655. ## 14 FO Total _cv 0.0343 ## 15 FO Mean coef 3203. 
## 16 FO Mean _cv 0.0343 ## 17 WOOD Total coef 345091088404. ## 18 WOOD Total _cv 0.0454 ## 19 WOOD Mean coef 2794. ## 20 WOOD Mean _cv 0.0454 Then, we use pivot_wider() to create a table that is nearly ready for publication. Within the function, we can make the names for each element more descriptive and informative by gluing the Stat and Type together with names_glue. Further details on creating publication-ready tables are covered in Chapter 8. consumption_ests_long %&gt;% mutate(Type = case_when(Type == &quot;coef&quot; ~ &quot;&quot;, Type == &quot;_cv&quot; ~ &quot; (CV)&quot;)) %&gt;% pivot_wider( id_cols = FuelType, names_from = c(Stat, Type), names_glue = &quot;{Stat}{Type}&quot;, values_from = value ) ## # A tibble: 5 × 5 ## FuelType Total `Total (CV)` Mean `Mean (CV)` ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 EL 4453284510065 0.00377 36051. 0.00377 ## 2 NG 4240769382106. 0.00908 34330. 0.00908 ## 3 LP 391425311586. 0.0380 3169. 0.0380 ## 4 FO 395699976655. 0.0343 3203. 0.0343 ## 5 WOOD 345091088404. 0.0454 2794. 0.0454 Example 2: Proportions with across() As mentioned earlier, proportions do not work as well directly with the across() method. If we want the proportion of houses with air conditioning and the proportion of houses with heating, we require two separate group_by() statements as shown below: recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 2 × 3 ## ACUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.113 0.00306 ## 2 TRUE 0.887 0.00306 recs_des %&gt;% group_by(SpaceHeatingUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 2 × 3 ## SpaceHeatingUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.0469 0.00207 ## 2 TRUE 0.953 0.00207 We estimate 88.7% of households have air conditioning and 95.3% have heating. 
If we are only interested in the TRUE outcomes, that is, the proportion of households that have air conditioning and the proportion that have heating, we can simplify the code. Applying survey_mean() to a logical variable yields the proportion of TRUE values, which matches the TRUE row from survey_prop(), as shown below: cool_heat_tab &lt;- recs_des %&gt;% summarize(across(c(ACUsed, SpaceHeatingUsed), ~ survey_mean(.x), .unpack = &quot;{outer}.{inner}&quot;)) cool_heat_tab ## # A tibble: 1 × 4 ## ACUsed.coef ACUsed._se SpaceHeatingUsed.coef SpaceHeatingUsed._se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.887 0.00306 0.953 0.00207 Note that the estimates are the same as those obtained using the separate group_by() statements. As before, we can use pivot_longer() to structure the table in a more suitable format for distribution. cool_heat_tab %&gt;% pivot_longer(everything(), names_to = c(&quot;Comfort&quot;, &quot;.value&quot;), names_pattern = &quot;(.*)\\\\.(.*)&quot;) %&gt;% rename(p = coef, se = `_se`) ## # A tibble: 2 × 3 ## Comfort p se ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 ACUsed 0.887 0.00306 ## 2 SpaceHeatingUsed 0.953 0.00207 Example 3: purrr::map() Loops are a common tool when dealing with repetitive calculations. The {purrr} package provides the map() functions which, like a loop, allow you to perform the same task across different elements (Wickham and Henry 2023). In our case, we may want to calculate proportions from the same design multiple times. A straightforward approach is to design the calculation for one variable, build a function based on that, and then apply it iteratively for the rest of the variables. Suppose we want to create a table that shows the proportion of people who express trust in their government (TrustGovernment)10 as well as those who trust other people (TrustPeople)11. First, we create a table for a single variable. The table includes the variable name as a column, the response, and the corresponding percentage with its standard error. 
anes_des %&gt;% drop_na(TrustGovernment) %&gt;% group_by(TrustGovernment) %&gt;% summarize(p = survey_prop() * 100) %&gt;% mutate(Variable = &quot;TrustGovernment&quot;) %&gt;% rename(Answer = TrustGovernment) %&gt;% select(Variable, everything()) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 We estimate that 1.55% of people always trust the government, 13.16% trust the government most of the time, and so on. Now, we want to use the original series of steps as a template to create a general function calcps() that can apply the same steps to other variables. We replace TrustGovernment with an argument for a generic variable, var. Referring to var involves a bit of tidy evaluation, an advanced skill. To learn more, we recommend Wickham (2019). 
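Before applying this idiom to the survey design, it may help to see it on a plain data frame. The sketch below is our own illustration, not part of the original analysis: count_by() is a hypothetical helper, and the built-in mtcars data stands in for the survey data. It shows how !!sym(var) turns a column name supplied as a string into a column reference inside {dplyr} verbs.

```r
library(dplyr)
library(rlang)

# Hypothetical helper (for illustration only): count rows for each level of a
# column whose name is passed in as a string.
count_by <- function(df, var) {
  df %>%
    group_by(!!sym(var)) %>%        # string -> symbol -> column reference
    summarize(n = n()) %>%
    rename(Answer = !!sym(var))     # give the grouping column a generic name
}

# One row per distinct value of cyl, with the grouping column renamed to Answer
count_by(mtcars, 'cyl')
```

The same pattern carries over when the data frame is replaced by a survey design object and n() by a survey summary function.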
calcps &lt;- function(var) { anes_des %&gt;% drop_na(!!sym(var)) %&gt;% group_by(!!sym(var)) %&gt;% summarize(p = survey_prop() * 100) %&gt;% mutate(Variable = var) %&gt;% rename(Answer := !!sym(var)) %&gt;% select(Variable, everything()) } We then apply this function to the two variables of interest, TrustGovernment and TrustPeople: calcps(&quot;TrustGovernment&quot;) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 calcps(&quot;TrustPeople&quot;) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustPeople Always 0.809 0.164 ## 2 TrustPeople Most of the time 41.4 0.857 ## 3 TrustPeople About half the time 28.2 0.776 ## 4 TrustPeople Some of the time 24.5 0.670 ## 5 TrustPeople Never 5.05 0.422 Finally, we use map() to iterate over as many variables as needed. We feed our desired variables into map() along with our custom function, calcps. The output is a tibble with the variable names in the “Variable” column, the responses in the “Answer” column, along with the percentage and standard error. The list_rbind() function combines the rows into a single tibble. This example extends nicely when dealing with numerous variables for which we want percentage estimates. 
c(&quot;TrustGovernment&quot;, &quot;TrustPeople&quot;) %&gt;% map(calcps) %&gt;% list_rbind() ## # A tibble: 10 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 ## 6 TrustPeople Always 0.809 0.164 ## 7 TrustPeople Most of the time 41.4 0.857 ## 8 TrustPeople About half the time 28.2 0.776 ## 9 TrustPeople Some of the time 24.5 0.670 ## 10 TrustPeople Never 5.05 0.422 In addition to our results above, we can also see the output for TrustPeople. While we estimate 1.55% of people always trust the government, 0.81% always trust people. 5.10 Exercises The exercises use the design objects anes_des and recs_des provided in the Prerequisites box in the beginning of the chapter. How many females have a graduate degree? Hint: the variables Gender and Education will be useful. What percentage of people identify as “Strong Democrat”? Hint: The variable PartyID indicates someone’s party affiliation. What percentage of people who voted in the 2020 election identify as “Strong Republican”? Hint: The variable VotedPres2020 indicates whether someone voted in 2020. What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable VotedPres2016 indicates whether someone voted in 2016. What is the design effect for the proportion of people who voted early? Hint: The variable EarlyVote2020 indicates whether someone voted early in 2020. What is the median temperature people set their thermostats to at night during the winter? Hint: The variable WinterTempNight indicates the temperature that people set their temperature in the winter at night. People sometimes set their temperature differently over different seasons and during the day. 
What median temperatures do people set their thermostat to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables WinterTempDay, WinterTempNight, SummerTempDay, and SummerTempNight. What is the correlation between the temperature that people set their thermostat to at night and during the day in the summer? What are the 1st, 2nd, and 3rd quartiles of the amount of money spent on energy by Building America (BA) climate zone? Hint: TOTALDOL indicates the total amount spent on all fuel, and ClimateRegion_BA indicates the BA climate zones. References Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Shah, Babubhai V, and Akhil K Vaish. 2006. “Confidence Intervals for Quantile Estimation from Complex Survey Data.” In Proceedings of the Section on Survey Research Methods. http://www.asasrms.org/Proceedings/y2006/Files/JSM2006-000749.pdf. U.S. Energy Information Administration. 2023a. “2020 Residential Energy Consumption Survey: Consumption and Expenditures Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS%20CE%20Methodology_Final.pdf. Wickham, Hadley. 2019. Advanced R. CRC Press. https://adv-r.hadley.nz/. Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools. RECS has two components: a household survey and an energy supplier survey. For each household that responds, their energy provider(s) are contacted to obtain their energy consumption and expenditure. This value reflects the dollars spent on electricity in 2020, according to the energy supplier. See U.S. 
Energy Information Administration (2023a) for more details.↩︎ Question text: Is any air conditioning equipment used in your home?↩︎ The value of DOLLARLP reflects the annualized amount spent on liquid propane and BTULP reflects the annualized consumption in Btu of liquid propane.↩︎ Question text: What is the square footage of your home?↩︎ BTUEL is derived from the supplier side component of the survey where BTUEL represents the electricity consumption in British thermal units (Btus) converted from kilowatt hours (kWh) in a year↩︎ BTUNG is derived from the supplier side component of the survey where BTUNG represents the natural gas consumption in British thermal units (Btus) in a year↩︎ Question: How often can you trust the federal government in Washington to do what is right? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)?↩︎ Question: Generally speaking, how often can you trust other people? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)? ↩︎ "],["c06-statistical-testing.html", "Chapter 6 Statistical testing 6.1 Introduction 6.2 Dot notation 6.3 Comparison of proportions and means 6.4 Chi-square tests 6.5 Exercises", " Chapter 6 Statistical testing Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(gt) library(prettyunits) We will be using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information). 
targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapters 4 and 10. recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 6.1 Introduction When analyzing results from a survey, the point estimates described in Chapter 5 help us understand the data at a high level. Still, researchers and the public often want to make comparisons between different groups. These comparisons are calculated through statistical testing. The general idea of statistical testing is the same for data obtained through surveys and data obtained through other methods, where we compare the point estimates and variance estimates of each statistic to see if statistically significant differences exist. However, statistical testing for complex surveys involves additional considerations due to the need to account for the sampling design in order to obtain accurate variance estimates. Statistical testing, also called hypothesis testing, involves declaring a null and alternative hypothesis. A null hypothesis is denoted as \\(H_0\\) and the alternative hypothesis is denoted as \\(H_A\\). The null hypothesis is the default assumption in that there are no differences in the data, or that the data is operating under “standard” behaviors. On the other hand, the alternative hypothesis is the break from the “standard” and what we are trying to determine if the data supports. Let’s review an example outside of survey data. If we are flipping a coin, a null hypothesis would be that the coin is fair and that each side has an equal chance of being flipped. In other words, the probability of the coin landing on each side is 1/2. 
An alternative hypothesis, in contrast, could be that the coin is unfair and that one side has a higher probability of being flipped (e.g., a probability of 1/4 to get heads, but a probability of 3/4 to get tails). We write this set of hypotheses as: \\(H_0: \\rho_{heads} = \\rho_{tails}\\), where \\(\\rho_{x}\\) is the probability of flipping the coin and having it land on heads (\\(\\rho_{heads}\\)) or tails (\\(\\rho_{tails}\\)) \\(H_A: \\rho_{heads} \\neq \\rho_{tails}\\) When we conduct hypothesis testing, the statistical models calculate a p-value, which shows how likely we are to observe the data if the null hypothesis is true. If the p-value (a probability between 0 and 1) is small, we have strong evidence to reject the null hypothesis as it is unlikely to see the data we are observing if the null hypothesis is true. However, if the p-value is large, we say we do not have evidence to reject the null hypothesis. The cutoff for deciding whether a p-value is small is determined by the type 1 error rate, known as \\(\\alpha\\). A common choice for statistical testing is \\(\\alpha = 0.05\\).12 It is common for explanations of statistical testing to refer to the confidence level. The confidence level is the complement of the type 1 error rate, that is, \\(1 - \\alpha\\). Thus, if \\(\\alpha = 0.05\\), the confidence level would be 95%. The functions in the {survey} package allow for the correct estimation of the variances. This chapter will cover the following statistical tests with survey data and the following functions from the {survey} package (Lumley 2010): Comparison of proportions svyttest() Comparison of means svyttest() Goodness of fit tests svygofchisq() Tests of independence svychisq() Tests of homogeneity svychisq() 6.2 Dot notation Up to this point, we have shown functions that use wrappers from the {srvyr} package. This means that the functions work with tidyverse syntax. 
However, the functions in this chapter do not have wrappers in the {srvyr} package and are instead used directly from the {survey} package. Therefore, the design object is not the first argument, and to use these functions with the magrittr pipe (%&gt;%) and tidyverse syntax, we will need to use dot (.) notation.13 Functions that work with the magrittr pipe (%&gt;%) have the data as the first argument. When we run a function with the pipe, it automatically places anything to the left of the pipe into the first argument of the function to the right of the pipe. For example, if we wanted to take the mtcars data and filter to cars with six cylinders, we can write the code in at least four different ways: filter(mtcars, cyl == 6) mtcars %&gt;% filter(cyl == 6) mtcars %&gt;% filter(., cyl == 6) mtcars %&gt;% filter(.data = ., cyl == 6) Each of these lines of code will produce the same output since the argument that takes the data is in the first spot in filter(). The first two are probably familiar to those who have worked with the tidyverse. The third option functions the same way as the second one but is explicit that mtcars goes into the first argument, and the fourth option indicates that mtcars is going into the named argument of .data. Here, we are telling R to take what’s on the left side of the pipe (mtcars) and pipe it into the spot with the dot (.), which is the first argument. In functions that are not part of the tidyverse, the data argument may not be in the first spot. For example, in svyttest(), the data argument is in the second spot, which means we need to place the dot (.) in the second spot and not the first. For example: svydata_des %&gt;% svyttest(x ~ y, .) By default, the pipe places the left-hand object in the first argument spot. Placing the dot (.) in the second argument spot indicates that the survey design object svydata_des should be used in the second argument and not the first. 
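The same behavior can be checked with a function that runs without any survey data. The sketch below is our own illustration rather than part of the chapter's analysis: it uses lm() from base R, which, like svyttest(), takes a formula first and the data second, together with the built-in mtcars data.

```r
library(magrittr)

# lm() expects the formula first and the data second, so the dot marks
# the spot where the piped-in data frame should go.
fit_piped <- mtcars %>% lm(mpg ~ wt, .)

# Equivalent call without the pipe
fit_direct <- lm(mpg ~ wt, mtcars)

# The two fits have identical coefficients
all.equal(coef(fit_piped), coef(fit_direct))
```

Because the dot appears as a direct argument, the pipe does not also insert mtcars as the first argument.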
Alternatively, named arguments could be used to place the dot first as named arguments can appear at any location, as in the following: svydata_des %&gt;% svyttest(design = ., x ~ y) However, the following code will not work as the svyttest() function expects the formula as the first argument when arguments are not named: svydata_des %&gt;% svyttest(., x ~ y) 6.3 Comparison of proportions and means We use t-tests to compare two proportions or means. T-tests allow us to determine if one proportion or mean is statistically different from another. They are commonly used to determine if a single estimate differs from a known value (e.g., 0 or 50%) or to compare two group means (e.g., North versus South). Comparing a single estimate to a known value is called a one sample t-test, and we can set up the hypothesis test as follows: \\(H_0: \\mu = 0\\) where \\(\\mu\\) is the mean outcome and \\(0\\) is the value we are comparing it to \\(H_A: \\mu \\neq 0\\) For comparing two estimates, this is called a two-sample t-test and we can set up the hypothesis test as follows: \\(H_0: \\mu_1 = \\mu_2\\) where \\(\\mu_i\\) is the mean outcome for group \\(i\\) \\(H_A: \\mu_1 \\neq \\mu_2\\) Two sample t-tests can also be paired or unpaired. If the data come from two different populations (e.g., North versus South), the t-test run will be an unpaired or independent samples t-test. Paired t-tests occur when the data come from the same population. This is commonly seen with data from the same population in two different time periods (e.g., before and after an intervention). The difference between t-tests with non-survey data and survey data is based on the underlying variance estimation difference. Chapter 10 provides a detailed overview of the math behind the mean and sampling error calculations for various sample designs. The functions in the {survey} package will account for these nuances, provided the design object is correctly defined. 
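To illustrate how accounting for the design changes a t-test, here is a minimal sketch that runs the same comparison with and without the design information. It does not use the book's data: it assumes the api example data and the cluster design shown in the {survey} package documentation, and the object names naive and design_based are hypothetical.

```r
library(survey)

data(api) # example data sets shipped with {survey}

# One-stage cluster sample design, as in the {survey} documentation
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Ordinary two-sample t-test: ignores the weights and the clustering
naive <- t.test(enroll ~ comp.imp, data = apiclus1)

# Design-based two-sample t-test: uses the variance implied by the design
design_based <- svyttest(enroll ~ comp.imp, design = dclus1)

# Compare the two p-values
naive$p.value
design_based$p.value
```

The two tests ask the same question of the same rows, so any difference in the results comes from the weights and the variance estimation that the design object carries.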
6.3.1 Syntax When we do not have survey data, we can use the t.test() function from the {stats} package. This function does not allow for weights or the variance structure that need to be accounted for with survey data. Therefore, we need to use the svyttest() function from {survey} when using survey data. Many of the arguments are the same between the two functions, but there are a few key differences: We need to use the survey design object instead of the original data frame We can only use a formula and not separate x and y data The confidence level cannot be specified and will always be set to 95%. However, we will show examples of how the confidence level can be changed after running the svyttest() function by using the confint() function. Here is the syntax for the svyttest() function: svyttest(formula, design, ...) The arguments are: formula: Formula, outcome~group for two-sample, outcome~0 or outcome~1 for one-sample. The group variable must be a factor or character with two levels, or be coded 0/1 or 1/2. We give more details on formula set-up below for different types of tests. design: survey design object ...: This passes options on for one-sided tests only, and thus, we can specify na.rm=TRUE Notice that the first argument here is the formula and not the design. This means we must use the dot (.) if we pipe in the survey design object (as described in Section 6.2). The formula argument can take several different forms depending on what we are measuring. Here are a few common scenarios: One-sample t-test: Comparison to 0: var ~ 0, where var is the measure of interest, and we compare it to the value 0. For example, we could test if the population mean of household debt is different from 0 given the sample data collected. Comparison to a different value: var - value ~ 0, where var is the measure of interest and value is what we are comparing to. 
For example, we could test if the proportion of the population that has blue eyes is different from 25% by using var - 0.25 ~ 0. Note that specifying the formula as var ~ 0.25 is not equivalent and will result in a syntax error. Two-sample t-test: Unpaired: 2 level grouping variable: var ~ groupVar, where var is the measure of interest and groupVar is a variable with two categories. For example, we could test if the average age of the population who voted for president in 2020 differed from the age of people who did not vote. In this case, age would be used for var, and a binary variable indicating voting activity would be the groupVar. 3+ level grouping variable: var ~ groupVar == level, where var is the measure of interest, groupVar is the categorical variable, and level is the category level to isolate. For example, we could test if the test scores in one classroom differed from all other classrooms where groupVar would be the variable holding the values for classroom IDs and level is the classroom ID we want to compare to the others. Paired: var_1 - var_2 ~ 0, where var_1 is the first variable of interest and var_2 is the second variable of interest. For example, we could test if test scores on a subject differed between the start and the end of a course so var_1 would be the test score at the beginning of the course and var_2 would be the score at the end of the course. The na.rm argument defaults to FALSE, which means if any data is missing, the t-test will not compute. Throughout this chapter, we will always set na.rm = TRUE, but before analyzing the survey data, review the notes provided in Chapter 3 to better understand how to handle missing data. Let’s walk through a few examples using the ANES and RECS data. 6.3.2 Examples Example 1: One-sample t-test for mean RECS asks respondents to indicate what temperature they set their house to during the summer at night.14 In our data, we have called this variable SummerTempNight. 
If we want to see if the average U.S. household sets its temperature at a value different from 68\\(^\\circ\\)F15, we could set up the hypothesis as follows: \\(H_0: \\mu = 68\\) where \\(\\mu\\) is the average temperature U.S. households set their thermostat to in the summer at night \\(H_A: \\mu \\neq 68\\) To conduct this in R, we use svyttest() and subtract the temperature on the left-hand side of the formula: ttest_ex1 &lt;- recs_des %&gt;% svyttest( formula = SummerTempNight - 68 ~ 0, design = ., na.rm = TRUE ) ttest_ex1 ## ## Design-based one-sample t-test ## ## data: SummerTempNight - 68 ~ 0 ## t = 85, df = 58, p-value &lt;2e-16 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## 3.288 3.447 ## sample estimates: ## mean ## 3.367 To pull out specific output, we can use R’s built-in $ operator. For instance, to obtain the estimate \\(\\mu - 68\\), we run ttest_ex1$estimate. If we want the average, we take our t-test estimate and add it to 68: ttest_ex1$estimate + 68 ## mean ## 71.37 Or, we can use the survey_mean() function described in Chapter 5: recs_des %&gt;% summarize(mu = survey_mean(SummerTempNight, na.rm = TRUE)) ## # A tibble: 1 × 2 ## mu mu_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 71.4 0.0397 The result is the same with both methods, so we see that the average temperature U.S. households set their thermostat to in the summer at night is 71.4\\(^\\circ\\)F. Looking at the output from svyttest(), the t-statistic is 84.8 (shown rounded to 85 in the printed output), and the p-value is \\(&lt;0.0001\\), indicating that the average is statistically different from 68\\(^\\circ\\)F at an \\(\\alpha\\) level of \\(0.05\\). If we want an 80% confidence interval for the estimated difference, we can use the function confint() to change the confidence level. 
Below, we print both the original 95% confidence interval and the 80% confidence interval: confint(ttest_ex1, level = 0.95) ## 2.5 % 97.5 % ## as.numeric(SummerTempNight - 68) 3.288 3.447 ## attr(,&quot;conf.level&quot;) ## [1] 0.95 confint(ttest_ex1, level = 0.8) ## [1] 3.316 3.419 ## attr(,&quot;conf.level&quot;) ## [1] 0.8 In this case, neither confidence interval contains 0, and we draw the same conclusion from either that the average temperature households set their thermostat in the summer at night is significantly higher than 68\\(^\\circ\\)F. Example 2: One-sample t-test for proportion RECS asked respondents if they use any air conditioning (AC) in their home.16 In our data, we call this variable ACUsed. Let’s look at the proportion of U.S. households that use AC in their homes using the survey_prop() function we learned in Chapter 5. acprop &lt;- recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(p = survey_prop()) acprop ## # A tibble: 2 × 3 ## ACUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.113 0.00306 ## 2 TRUE 0.887 0.00306 Based on this, 88.7% of U.S. households use AC in their homes. If we wanted to know if this differs from 90%, we could set up our hypothesis as follows: \\(H_0: p = 0.90\\) where \\(p\\) is the proportion of the U.S. households that use AC in their homes \\(H_A: p \\neq 0.90\\) To conduct this in R, we use the svyttest() function as follows: ttest_ex2 &lt;- recs_des %&gt;% svyttest( formula = (ACUsed == TRUE) - 0.90 ~ 0, design = ., na.rm = TRUE ) ttest_ex2 ## ## Design-based one-sample t-test ## ## data: (ACUsed == TRUE) - 0.9 ~ 0 ## t = -4.4, df = 58, p-value = 5e-05 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.019603 -0.007348 ## sample estimates: ## mean ## -0.01348 The output from the svyttest() function can be a bit hard to read. 
Using the tidy() function from the {broom} package, we can clean up the output into a tibble to more easily understand what the test tells us (Robinson, Hayes, and Couch 2023). tidy(ttest_ex2) ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 -0.0135 -4.40 0.0000466 58 -0.0196 -0.00735 Design-base… ## # ℹ 1 more variable: alternative &lt;chr&gt; The ‘tidied’ output can also be piped into the {gt} package to create a table ready for publication. We go over the {gt} package in Chapter 8. The function pretty_p_value() comes from the {prettyunits} package and converts numeric p-values to characters. By default, it prints four decimal places and displays any p-value less than 0.0001 as \"&lt;0.0001\", though another minimum display p-value can be specified (Csardi 2023). tidy(ttest_ex2) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() TABLE 6.1: One-sample t-test output for estimates of U.S. households that use AC in their homes differing from 90%, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative −0.01 −4.40 &lt;0.0001 58.00 −0.02 −0.01 Design-based one-sample t-test two.sided As in Example 1, the estimate is not the quantity of interest itself (here, the proportion \\(p\\)) but the difference \\(p - 0.90\\) between the proportion of U.S. households that use AC and the proportion we are comparing to. We can see that there is a difference of -1.35 percentage points. 
Additionally, the t-statistic value in the statistic column is -4.4, and the p-value is &lt;0.0001. These results indicate that fewer than 90% of U.S. households use AC in their homes. Example 3: Unpaired two-sample t-test Two additional variables in the RECS data are the electric bill cost (DOLLAREL) and whether the house used AC or not (ACUsed).17 If we want to know if the U.S. households that used AC had higher electrical bills compared to those that did not, we could set up the hypothesis as follows: \\(H_0: \\mu_{AC} = \\mu_{noAC}\\) where \\(\\mu_{AC}\\) is the average electrical bill cost for U.S. households that used AC and \\(\\mu_{noAC}\\) is the average electrical bill cost for U.S. households that did not use AC \\(H_A: \\mu_{AC} \\neq \\mu_{noAC}\\) Let’s take a quick look at the data to see how they are formatted: recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(mean = survey_mean(DOLLAREL, na.rm = TRUE)) ## # A tibble: 2 × 3 ## ACUsed mean mean_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 1056. 16.0 ## 2 TRUE 1422. 5.69 To conduct this in R, we use svyttest(): ttest_ex3 &lt;- recs_des %&gt;% svyttest(formula = DOLLAREL ~ ACUsed, design = ., na.rm = TRUE) tidy(ttest_ex3) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() TABLE 6.2: Unpaired two-sample t-test output for estimates of U.S. household electrical bills by AC use, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative 365.72 21.29 &lt;0.0001 58.00 331.33 400.11 Design-based t-test two.sided The results indicate that the difference in electrical bills between households that used AC and those that did not is, on average, $365.72. The difference appears to be statistically significant as the t-statistic is 21.3 and the p-value is \\(&lt;0.0001\\). Households that used AC spent, on average, $365.72 more in 2020 on electricity than households without AC. Example 4: Paired two-sample t-test Let’s say we want to test whether the temperature that U.S. households set their thermostat to at night differs depending on the season (comparing summer18 and winter19 temperatures). We could set up the hypothesis as follows: \\(H_0: \\mu_{summer} = \\mu_{winter}\\) where \\(\\mu_{summer}\\) is the average temperature that U.S. households set their thermostat to during summer nights, and \\(\\mu_{winter}\\) is the average temperature that U.S. 
households set their thermostat to during winter nights \\(H_A: \\mu_{summer} \\neq \\mu_{winter}\\) To conduct this in R, we use svyttest() by calculating the temperature difference on the left-hand side as follows: ttest_ex4 &lt;- recs_des %&gt;% svyttest( design = ., formula = SummerTempNight - WinterTempNight ~ 0, na.rm = TRUE ) tidy(ttest_ex4) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() #rsccxkcyhx .gt_first_summary_row { 
border-top-style: solid; border-top-color: #D3D3D3; } #rsccxkcyhx .gt_first_summary_row.thick { border-top-width: 2px; } #rsccxkcyhx .gt_last_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #rsccxkcyhx .gt_grand_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #rsccxkcyhx .gt_first_grand_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-top-style: double; border-top-width: 6px; border-top-color: #D3D3D3; } #rsccxkcyhx .gt_last_grand_summary_row_top { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: double; border-bottom-width: 6px; border-bottom-color: #D3D3D3; } #rsccxkcyhx .gt_striped { background-color: rgba(128, 128, 128, 0.05); } #rsccxkcyhx .gt_table_body { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #rsccxkcyhx .gt_footnotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #rsccxkcyhx .gt_footnote { margin: 0px; font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #rsccxkcyhx .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #rsccxkcyhx .gt_sourcenote { font-size: 90%; padding-top: 
4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #rsccxkcyhx .gt_left { text-align: left; } #rsccxkcyhx .gt_center { text-align: center; } #rsccxkcyhx .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #rsccxkcyhx .gt_font_normal { font-weight: normal; } #rsccxkcyhx .gt_font_bold { font-weight: bold; } #rsccxkcyhx .gt_font_italic { font-style: italic; } #rsccxkcyhx .gt_super { font-size: 65%; } #rsccxkcyhx .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #rsccxkcyhx .gt_asterisk { font-size: 100%; vertical-align: 0; } #rsccxkcyhx .gt_indent_1 { text-indent: 5px; } #rsccxkcyhx .gt_indent_2 { text-indent: 10px; } #rsccxkcyhx .gt_indent_3 { text-indent: 15px; } #rsccxkcyhx .gt_indent_4 { text-indent: 20px; } #rsccxkcyhx .gt_indent_5 { text-indent: 25px; } TABLE 6.3: Paired two-sample t-test output for estimates of U.S. households thermostat temperature by season, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative 2.85 50.83 &lt;0.0001 58.00 2.74 2.96 Design-based one-sample t-test two.sided U.S. households set their thermostat on average 2.9\\(^\\circ\\)F warmer in summer nights than winter nights, which is statistically significant (t = 50.8, p-value = \\(&lt;0.0001\\)). 6.4 Chi-square tests Chi-square tests (\\(\\chi^2\\)) allow us to examine multiple proportions using a goodness-of-fit test, a test of independence, or a test of homogeneity. These three tests have the same \\(\\chi^2\\) distributions but with slightly different underlying assumptions. First, goodness-of-fit tests are used when comparing observed data to expected data. For example, this could be used to determine if respondent demographics (the observed data in the sample) match known population information (the expected data). 
In this case, we can set up the hypothesis test as follows: \\(H_0: p_1 = \\pi_1, ~ p_2 = \\pi_2, ~ ..., ~ p_k = \\pi_k\\) where \\(p_i\\) is the observed proportion for category \\(i\\), \\(\\pi_i\\) is the expected proportion for category \\(i\\), and \\(k\\) is the number of categories \\(H_A:\\) at least one level of \\(p_i\\) does not match \\(\\pi_i\\) Second, tests of independence are used when comparing two types of observed data to see if there is a relationship. For example, this could be used to determine if the proportion of respondents who voted for each political party in the presidential election matches the proportion of respondents who voted for each political party in a local election. In this case, we can set up the hypothesis test as follows: \\(H_0:\\) The two variables/factors are independent \\(H_A:\\) The two variables/factors are not independent Third, tests of homogeneity are used to compare two distributions to see if they match. For example, this could be used to determine if the highest education achieved is the same for both men and women. In this case, we can set up the hypothesis test as follows: \\(H_0: p_{1a} = p_{1b}, ~ p_{2a} = p_{2b}, ~ ..., ~ p_{ka} = p_{kb}\\) where \\(p_{ia}\\) is the observed proportion of category \\(i\\) for subgroup \\(a\\), \\(p_{ib}\\) is the observed proportion of category \\(i\\) for subgroup \\(b\\), and \\(k\\) is the number of categories \\(H_A:\\) at least one category of \\(p_{ia}\\) does not match \\(p_{ib}\\) As with t-tests, the difference between using \\(\\chi^2\\) tests with non-survey data and survey data is based on the underlying variance estimation. The functions in the {survey} package will account for these nuances, provided the design object is correctly defined. For basic variance estimation formulas for different survey design types, refer to Chapter 10. 
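To make the goodness-of-fit hypothesis above concrete, the following is a minimal sketch that computes the Pearson chi-square statistic by hand for unweighted counts. The counts and expected proportions are hypothetical, and the calculation ignores any survey design, so it illustrates only the hypothesis being tested and not the design-based variance adjustment discussed in this chapter.

```r
# Hypothetical unweighted counts for k = 3 categories (not from any survey)
observed <- c(30, 50, 20)
# Hypothetical expected proportions (pi_1, ..., pi_k) under H_0
expected_p <- c(0.25, 0.50, 0.25)

# Expected counts under H_0
expected <- sum(observed) * expected_p

# Pearson statistic: sum over categories of (O - E)^2 / E
x2 <- sum((observed - expected)^2 / expected)

# The reference distribution has k - 1 degrees of freedom
p_value <- pchisq(x2, df = length(observed) - 1, lower.tail = FALSE)

# Base R's chisq.test() gives the same statistic for non-survey data
base_test <- chisq.test(observed, p = expected_p)

c(by_hand = x2, chisq_test = unname(base_test$statistic), p_value = p_value)
```

With survey data, the same comparison of observed and expected proportions is made, but the weights and design determine the variance, which is why the design-based functions are needed.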
6.4.1 Syntax When we do not have survey data, we may be able to use the chisq.test() function from the {stats} package in base R (R Core Team 2023). However, this function does not allow for weights or the variance structure to be accounted for with survey data. Therefore, when using survey data, we need to use one of two functions: svygofchisq(): For goodness of fit tests svychisq(): For tests of independence and homogeneity The non-survey data function of chisq.test() requires either a single set of counts and given proportions (for goodness of fit tests) or two sets of counts for tests of independence and homogeneity. The functions we use with survey data require respondent-level data and formulas instead of counts. This ensures that the variances are correctly calculated. First, the function for the goodness of fit tests is svygofchisq(): svygofchisq(formula, p, design, na.rm = TRUE, ...) The arguments are: formula: Formula specifying a single factor variable p: Vector of probabilities for the categories of the factor in the correct order. If the probabilities do not sum to 1, they will be rescaled to sum to 1. design: Survey design object …: Other arguments to pass on, such as na.rm Based on the order of the arguments, we again must use the dot (.) notation if we pipe in the survey design object or explicitly name the arguments as described in Section 6.2. For the goodness of fit tests, the formula will be a single variable formula = ~var as we compare the observed data from this variable to the expected data. The expected probabilities are then entered in the p argument and need to be a vector of the same length as the number of categories in the variable. For example, if we want to know if the proportion of males and females matches a distribution of 30/70, then the sex variable (with two categories) would be used formula = ~SEX, and the proportions would be included as p = c(.3, .7). 
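Because p must follow the order of the factor levels, it can help to check that order before running the test. The following is a minimal base R sketch, assuming a hypothetical sex variable and the 30/70 split from the example above; the object names are illustrative only.

```r
# Hypothetical sex variable; factor() orders levels alphabetically by default
sex <- factor(c("Male", "Female", "Female", "Male", "Female"))
levels(sex)

# Hypothetical expected proportions, named so they can be matched to levels
p_named <- c(Male = 0.3, Female = 0.7)

# Reorder the proportions to follow the factor's level order
p_ordered <- p_named[levels(sex)]
p_ordered

# The reordered values could then be supplied as p = unname(p_ordered)
```

Here the levels are ordered Female then Male, so passing c(.3, .7) directly would assign 30% to the wrong category.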
It is important to note that the variable entered into the formula should be formatted as either a factor or a character. The examples below provide more detail and tips on how to make sure the levels match up correctly. For tests of homogeneity and independence, the svychisq() function should be used. The syntax is as follows: svychisq( formula, design, statistic = c(&quot;F&quot;, &quot;Chisq&quot;, &quot;Wald&quot;, &quot;adjWald&quot;, &quot;lincom&quot;, &quot;saddlepoint&quot;), na.rm = TRUE ) The arguments are: formula: Model formula specifying the table (shown in examples) design: Survey design object statistic: Type of test statistic to use in test (details below) na.rm: Remove missing values There are six statistics that are accepted by this function. For tests of homogeneity (when comparing cross-tabulations), the F or Chisq statistics should be used.20 The F statistic is the default and uses the Rao-Scott second-order correction. This correction is designed to assist with complicated sampling designs (i.e., those other than a simple random sample) (Scott 2007). The Chisq statistic is an adjusted version of the Pearson \\(\\chi^2\\) statistic. The version of this statistic in the svychisq() function compares the design effect estimate from the provided survey data to what the \\(\\chi^2\\) distribution would have been if the data came from a simple random sample. For tests of independence, the Wald and adjWald are recommended as they provide a better adjustment for variable comparisons (Lumley 2010). If the data has a small number of primary sampling units (PSUs) compared to the degrees of freedom, then the adjWald statistic should be used to account for this. The lincom and saddlepoint statistics are available for more complicated data structures. The formula argument will always be one-sided, unlike the svyttest() function. The two variables of interest should be included with a plus sign: formula = ~ var_1 + var_2. 
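As a minimal sketch of the syntax above, the following uses the api example data that ships with the {survey} package rather than the ANES data, so the design and variables (dclus1, sch.wide, stype) are stand-ins chosen only to show how the one-sided formula and the statistic argument fit together.

```r
library(survey)

# Example data shipped with the {survey} package (not the ANES data)
data(api)

# One-stage cluster sample design, as in the package documentation
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Default statistic ("F"): Rao-Scott second-order correction
test_f <- svychisq(~ sch.wide + stype, design = dclus1, statistic = "F")

# Wald statistic, requested explicitly
test_wald <- svychisq(~ sch.wide + stype, design = dclus1, statistic = "Wald")

test_f
test_wald
```

Both calls use the same one-sided formula; only the statistic argument changes, which makes it straightforward to compare how the choice of statistic affects the result for a given design.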
As with the svygofchisq() function, the variables entered into the formula should be formatted as either a factor or a character. Additionally, as with the t-test function, both svygofchisq() and svychisq() have the na.rm argument. If any data is missing, the \\(\\chi^2\\) tests will assume that NA is a category and include it in the calculation. Throughout this chapter, we will always set na.rm = TRUE, but before analyzing the survey data, review the notes provided in Chapter 3 to better understand how to handle missing data. 6.4.2 Examples Let’s walk through a few examples using the ANES data. Example 1: Goodness of fit test ANES asked respondents about their highest education level.21 Based on the data from the 2020 American Community Survey (ACS) 5-year estimates22, the education distribution of those aged 18+ in the United States (among the 50 states and District of Columbia) is as follows: 11% had less than a high school degree 27% had a high school degree 29% had some college or an associate’s degree 33% had a bachelor’s degree or higher If we want to see if the weighted distribution from the ANES 2020 data matches this distribution, we could set up the hypothesis as follows: \\(H_0: p_1 = 0.11, ~ p_2 = 0.27, ~ p_3 = 0.29, ~ p_4 = 0.33\\) \\(H_A:\\) at least one of the education levels does not match between the ANES and the ACS To conduct this in R, let’s first look at the education variable (Education) we have on the ANES data. Using the survey_mean() function discussed in Chapter 5, we can see the education levels and estimated proportions. anes_des %&gt;% drop_na(Education) %&gt;% group_by(Education) %&gt;% summarize(p = survey_mean()) ## # A tibble: 5 × 3 ## Education p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.0805 0.00568 ## 2 High school 0.277 0.0102 ## 3 Post HS 0.290 0.00713 ## 4 Bachelor&#39;s 0.226 0.00633 ## 5 Graduate 0.126 0.00499 Based on this output, we can see that we have different levels than the ACS data provides. 
Specifically, the education data from ANES has two levels for Bachelor’s Degree or Higher (Bachelor’s and Graduate), so these two categories need to be collapsed into a single category to match the ACS data. For this, among other methods, we can use the {forcats} package from the tidyverse (Wickham 2023a). The package’s fct_collapse() function helps us create a new variable by collapsing categories into a single one. Then, we will use the svygofchisq() function to compare the ANES data to the ACS data where we specify the updated design object, the formula using the collapsed education variable, the ACS estimates for education levels as p, and removing NA values. anes_des_educ &lt;- anes_des %&gt;% mutate(Education2 = fct_collapse(Education, &quot;Bachelor or Higher&quot; = c(&quot;Bachelor&#39;s&quot;, &quot;Graduate&quot;))) anes_des_educ %&gt;% drop_na(Education2) %&gt;% group_by(Education2) %&gt;% summarize(p = survey_mean()) ## # A tibble: 4 × 3 ## Education2 p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.0805 0.00568 ## 2 High school 0.277 0.0102 ## 3 Post HS 0.290 0.00713 ## 4 Bachelor or Higher 0.352 0.00732 chi_ex1 &lt;- anes_des_educ %&gt;% svygofchisq( formula = ~ Education2, p = c(0.11, 0.27, 0.29, 0.33), design = ., na.rm = TRUE ) chi_ex1 ## ## Design-based chi-squared test for given probabilities ## ## data: ~Education2 ## X-squared = 2172220, scale = 1.1e+05, df = 2.3e+00, p-value = ## 9e-05 The output from the svygofchisq() indicates that at least one proportion from ANES does not match the ACS data (\\(\\chi^2 =\\) 2,172,220; p-value &lt;0.0001). 
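The level-collapsing step used above can also be done without {forcats}. The following is a minimal base R sketch on a hypothetical vector of education responses (not the ANES data), shown as one of the "other methods" and not as the approach used in this example.

```r
# Hypothetical education responses using the five ANES level names
educ_levels <- c("Less than HS", "High school", "Post HS", "Bachelor's", "Graduate")
educ <- factor(
  c("High school", "Graduate", "Bachelor's", "Post HS", "Less than HS", "Graduate"),
  levels = educ_levels
)

# Assigning a named list to levels() maps old levels (values) to new ones (names)
educ2 <- educ
levels(educ2) <- list(
  "Less than HS" = "Less than HS",
  "High school" = "High school",
  "Post HS" = "Post HS",
  "Bachelor or Higher" = c("Bachelor's", "Graduate")
)

table(educ2)
```

Whichever method is used, the resulting level order is what the p vector must follow, so it is worth checking levels() on the collapsed variable before running the test.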
To get a better idea of the differences, we can use the expected output along with survey_mean() to create a comparison table: ex1_table &lt;- anes_des_educ %&gt;% drop_na(Education2) %&gt;% group_by(Education2) %&gt;% summarize(Observed = survey_mean(vartype = &quot;ci&quot;)) %&gt;% rename(Education = Education2) %&gt;% mutate(Expected = c(0.11, 0.27, 0.29, 0.33)) %&gt;% select(Education, Expected, everything()) ex1_table ## # A tibble: 4 × 5 ## Education Expected Observed Observed_low Observed_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.11 0.0805 0.0691 0.0919 ## 2 High school 0.27 0.277 0.257 0.298 ## 3 Post HS 0.29 0.290 0.276 0.305 ## 4 Bachelor or Higher 0.33 0.352 0.337 0.367 This output includes the expected proportions from the ACS that we provided to the svygofchisq() function along with the observed proportions and their confidence intervals. This table shows that the “High school” and “Post HS” categories have nearly identical proportions but that the other two categories are slightly different. Looking at the confidence intervals, we can see that the ANES data skews to include fewer people in the “Less than HS” category and more people in the “Bachelor or Higher” category. This may be easier to see if we plot this. The code below uses the tabular output to create Figure 6.1. 
ex1_table %&gt;% pivot_longer( cols = c(&quot;Expected&quot;, &quot;Observed&quot;), names_to = &quot;Names&quot;, values_to = &quot;Proportion&quot; ) %&gt;% mutate( Observed_low = if_else(Names == &quot;Observed&quot;, Observed_low, NA_real_), Observed_upp = if_else(Names == &quot;Observed&quot;, Observed_upp, NA_real_), Names = if_else(Names == &quot;Observed&quot;, &quot;ANES (observed)&quot;, &quot;ACS (expected)&quot;) ) %&gt;% ggplot(aes(x = Education, y = Proportion, color = Names)) + geom_point(alpha = 0.75, size = 2) + geom_errorbar(aes(ymin = Observed_low, ymax = Observed_upp), width = 0.25) + theme_bw() + scale_color_manual(name = &quot;Type&quot;, values = book_colors[c(4, 1)]) + theme(legend.position = &quot;bottom&quot;, legend.title = element_blank()) FIGURE 6.1: Expected and observed proportions of education, showing the confidence intervals for the observed proportions and whether the expected proportions lie within them. Example 2: Test of independence ANES asked respondents two questions about trust: How often can you trust the federal government to do what is right? How often can you trust other people? If we want to see whether the responses to these two questions are related, we can conduct a test of independence. 
Here is how the hypothesis could be set up: \\(H_0:\\) People’s trust in the federal government and their trust in other people are independent (i.e., not related) \\(H_A:\\) People’s trust in the federal government and their trust in other people are not independent (i.e., they are related) To conduct this in R, we use the svychisq() function to compare the two variables: chi_ex2 &lt;- anes_des %&gt;% svychisq( formula = ~ TrustGovernment + TrustPeople, design = ., statistic = &quot;Wald&quot;, na.rm = TRUE ) chi_ex2 ## ## Design-based Wald test of association ## ## data: NextMethod() ## F = 21, ndf = 16, ddf = 51, p-value &lt;2e-16 The output from svychisq() indicates that the distribution of people’s trust in the federal government and their trust in other people are not independent, meaning that they are related. Let’s output the distributions in a table to see the relationship. The observed output from the test provides a cross-tabulation of the counts for each category: chi_ex2$observed ## TrustPeople ## TrustGovernment Always Most of the time About half the time ## Always 16.470 25.009 31.848 ## Most of the time 11.020 539.377 196.258 ## About half the time 11.772 934.858 861.971 ## Some of the time 17.007 1353.779 839.863 ## Never 3.174 236.785 174.272 ## TrustPeople ## TrustGovernment Some of the time Never ## Always 36.854 5.523 ## Most of the time 206.556 27.184 ## About half the time 428.871 65.024 ## Some of the time 932.628 89.596 ## Never 217.994 189.307 However, as researchers, we often want to know about the proportions and not just the respondent counts from the survey. There are a couple of different ways that we can do this. The first is using the counts from chi_ex2$observed to calculate the proportion. We can then pivot the table to create a cross-tabulation similar to the counts table above. Adding group_by() to the code means that we are obtaining the proportions within each level of that variable. 
In this case, we are looking at the distribution of TrustGovernment for each level of TrustPeople. The resulting table is shown in Table 6.4. chi_ex2_table &lt;- chi_ex2$observed %&gt;% as_tibble() %&gt;% group_by(TrustPeople) %&gt;% mutate(prop = round(n / sum(n), 3)) %&gt;% select(-n) %&gt;% pivot_wider(names_from = TrustPeople, values_from = prop) %&gt;% gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;% tab_stubhead(label = &quot;Trust in Government&quot;) %&gt;% tab_spanner(label = &quot;Trust in People&quot;, columns = everything()) %&gt;% cols_label(`Most of the time` = md(&quot;Most of&lt;br /&gt;the time&quot;), `About half the time` = md(&quot;About half&lt;br /&gt;the time&quot;), `Some of the time` = md(&quot;Some of&lt;br /&gt;the time&quot;)) chi_ex2_table 
TABLE 6.4: Proportion of adults in the U.S. by levels of trust in people and government, ANES 2020 
Trust in Government (rows) by Trust in People (columns) | Always | Most of the time | About half the time | Some of the time | Never 
Always | 0.277 | 0.008 | 0.015 | 0.020 | 0.015 
Most of the time | 0.185 | 0.175 | 0.093 | 0.113 | 0.072 
About half the time | 0.198 | 0.303 | 0.410 | 0.235 | 0.173 
Some of the time | 0.286 | 0.438 | 0.399 | 0.512 | 0.238 
Never | 0.053 | 0.077 | 0.083 | 0.120 | 0.503 
In Table 6.4, each column sums to 1. For example, we can say that it is estimated that of people who always trust in people, 27.7% also always trust in government based on the top-left cell but 5.3% never trust in government. The second option is to use group_by() and survey_mean() functions to calculate the proportions from the ANES design object. 
A reminder that with more than one variable listed in the group_by() statement, the proportions are within the first variable listed. As mentioned above, we are looking at the distribution of TrustGovernment for each level of TrustPeople. chi_ex2_obs &lt;- anes_des %&gt;% drop_na(TrustPeople, TrustGovernment) %&gt;% group_by(TrustPeople, TrustGovernment) %&gt;% summarize(Observed = round(survey_mean(vartype = &quot;ci&quot;), 3), .groups=&quot;drop&quot;) chi_ex2_obs_table &lt;- chi_ex2_obs %&gt;% mutate(prop = paste0(Observed, &quot; (&quot;, Observed_low, &quot;, &quot;, Observed_upp, &quot;)&quot;)) %&gt;% select(TrustGovernment, TrustPeople, prop) %&gt;% pivot_wider(names_from = TrustPeople, values_from = prop) %&gt;% gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;% tab_stubhead(label = &quot;Trust in Government&quot;) %&gt;% tab_spanner(label = &quot;Trust in People&quot;, columns = everything()) %&gt;% tab_options(page.orientation = &quot;landscape&quot;) chi_ex2_obs_table #vpfhdsqoxw .gt_stub_row_group { color: #333333; 
background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; vertical-align: top; } #vpfhdsqoxw .gt_row_group_first td { border-top-width: 2px; } #vpfhdsqoxw .gt_row_group_first th { border-top-width: 2px; } #vpfhdsqoxw .gt_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #vpfhdsqoxw .gt_first_summary_row { border-top-style: solid; border-top-color: #D3D3D3; } #vpfhdsqoxw .gt_first_summary_row.thick { border-top-width: 2px; } #vpfhdsqoxw .gt_last_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #vpfhdsqoxw .gt_grand_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #vpfhdsqoxw .gt_first_grand_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-top-style: double; border-top-width: 6px; border-top-color: #D3D3D3; } #vpfhdsqoxw .gt_last_grand_summary_row_top { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: double; border-bottom-width: 6px; border-bottom-color: #D3D3D3; } #vpfhdsqoxw .gt_striped { background-color: rgba(128, 128, 128, 0.05); } #vpfhdsqoxw .gt_table_body { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #vpfhdsqoxw .gt_footnotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: 
none; border-right-width: 2px; border-right-color: #D3D3D3; } #vpfhdsqoxw .gt_footnote { margin: 0px; font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #vpfhdsqoxw .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #vpfhdsqoxw .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #vpfhdsqoxw .gt_left { text-align: left; } #vpfhdsqoxw .gt_center { text-align: center; } #vpfhdsqoxw .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #vpfhdsqoxw .gt_font_normal { font-weight: normal; } #vpfhdsqoxw .gt_font_bold { font-weight: bold; } #vpfhdsqoxw .gt_font_italic { font-style: italic; } #vpfhdsqoxw .gt_super { font-size: 65%; } #vpfhdsqoxw .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #vpfhdsqoxw .gt_asterisk { font-size: 100%; vertical-align: 0; } #vpfhdsqoxw .gt_indent_1 { text-indent: 5px; } #vpfhdsqoxw .gt_indent_2 { text-indent: 10px; } #vpfhdsqoxw .gt_indent_3 { text-indent: 15px; } #vpfhdsqoxw .gt_indent_4 { text-indent: 20px; } #vpfhdsqoxw .gt_indent_5 { text-indent: 25px; } TABLE 6.5: Proportion of adults in the U.S. 
by levels of trust in people and government with confidence intervals, ANES 2020 Trust in Government Trust in People Always Most of the time About half the time Some of the time Never Always 0.277 (0.11, 0.444) 0.008 (0.004, 0.012) 0.015 (0.006, 0.024) 0.02 (0.008, 0.033) 0.015 (0, 0.029) Most of the time 0.185 (-0.009, 0.38) 0.175 (0.157, 0.192) 0.093 (0.078, 0.109) 0.113 (0.085, 0.141) 0.072 (0.021, 0.123) About half the time 0.198 (0.046, 0.35) 0.303 (0.281, 0.324) 0.41 (0.378, 0.441) 0.235 (0.2, 0.271) 0.173 (0.099, 0.246) Some of the time 0.286 (0.069, 0.503) 0.438 (0.415, 0.462) 0.399 (0.365, 0.433) 0.512 (0.481, 0.543) 0.238 (0.178, 0.298) Never 0.053 (-0.01, 0.117) 0.077 (0.064, 0.089) 0.083 (0.063, 0.103) 0.12 (0.097, 0.142) 0.503 (0.422, 0.583) Both methods produce the same output as the svychisq() function does account for the survey design. However, calculating the proportions directly from the design object means we can also obtain the variance information. In this case, the table output displays the survey estimate followed by the confidence intervals. Based on the output, we can see that of those who never trust people, 50.3% also never trust the government, while the proportions of never trusting the government are much lower for each of the other levels of trusting people. We may find it easier to look at these proportions graphically. 
We can use ggplot() with facets to provide an overview, as shown in Figure 6.2:

chi_ex2_obs %&gt;%
  mutate(TrustPeople = fct_reorder(str_c(&quot;Trust in People:\\n&quot;, TrustPeople),
                                   order(TrustPeople))) %&gt;%
  ggplot(aes(x = TrustGovernment, y = Observed, color = TrustGovernment)) +
  facet_wrap(~ TrustPeople, ncol = 5) +
  geom_point() +
  geom_errorbar(aes(ymin = Observed_low, ymax = Observed_upp)) +
  ylab(&quot;Proportion&quot;) +
  xlab(&quot;&quot;) +
  theme_bw() +
  scale_color_manual(name = &quot;Trust in Government&quot;, values = book_colors) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        legend.position = &quot;bottom&quot;) +
  guides(col = guide_legend(nrow = 2))

FIGURE 6.2: Proportion of adults in the U.S. by levels of trust in people and government with confidence intervals, ANES 2020

Example 3: Test of homogeneity

Researchers and politicians often look at specific demographics each election cycle to understand how each group is leaning or voting. The ANES data are collected post-election, but we can still see if there are differences in how specific demographic groups voted. 
If we want to see if there is a difference in how each age group voted for the 2020 candidates, this would be a test of homogeneity, and we can set up the hypothesis as follows:

\\[\\begin{align*}
H_0: p_{1_{Biden}} &amp;= p_{1_{Trump}} = p_{1_{Other}},\\\\
p_{2_{Biden}} &amp;= p_{2_{Trump}} = p_{2_{Other}},\\\\
p_{3_{Biden}} &amp;= p_{3_{Trump}} = p_{3_{Other}},\\\\
p_{4_{Biden}} &amp;= p_{4_{Trump}} = p_{4_{Other}},\\\\
p_{5_{Biden}} &amp;= p_{5_{Trump}} = p_{5_{Other}},\\\\
p_{6_{Biden}} &amp;= p_{6_{Trump}} = p_{6_{Other}}
\\end{align*}\\]

where \\(p_{i_{Biden}}\\) is the observed proportion of each age group (\\(i\\)) that voted for Joseph Biden, \\(p_{i_{Trump}}\\) is the observed proportion of each age group (\\(i\\)) that voted for Donald Trump, and \\(p_{i_{Other}}\\) is the observed proportion of each age group (\\(i\\)) that voted for another candidate.

\\(H_A:\\) at least one category of \\(p_{i_{Biden}}\\) does not match \\(p_{i_{Trump}}\\) or \\(p_{i_{Other}}\\)

To conduct this in R, we use the svychisq() function to compare the two variables:

chi_ex3 &lt;- anes_des %&gt;%
  drop_na(VotedPres2020_selection, AgeGroup) %&gt;%
  svychisq(
    formula = ~ AgeGroup + VotedPres2020_selection,
    design = .,
    statistic = &quot;Chisq&quot;,
    na.rm = TRUE
  )

chi_ex3

##
##  Pearson&#39;s X^2: Rao &amp; Scott adjustment
##
## data:  NextMethod()
## X-squared = 171, df = 10, p-value &lt;2e-16

The output from svychisq() indicates a difference in how each age group voted in the 2020 election. To get a better idea of the different distributions, let’s output proportions to see the relationship. As we learned in Example 2 above, we can use chi_ex3$observed, or if we want to get the variance information (which is crucial with survey data), we can use survey_mean(). Remember, when we have two variables in group_by(), we obtain the proportions within each level of the first variable listed. In this case, we are looking at the distribution of AgeGroup for each level of VotedPres2020_selection. 
chi_ex3_obs &lt;- anes_des %&gt;%
  filter(VotedPres2020 == &quot;Yes&quot;) %&gt;%
  drop_na(VotedPres2020_selection, AgeGroup) %&gt;%
  group_by(VotedPres2020_selection, AgeGroup) %&gt;%
  summarize(Observed = round(survey_mean(vartype = &quot;ci&quot;), 3))

chi_ex3_obs_table &lt;- chi_ex3_obs %&gt;%
  mutate(prop = paste0(Observed, &quot; (&quot;, Observed_low, &quot;, &quot;, Observed_upp, &quot;)&quot;)) %&gt;%
  select(AgeGroup, VotedPres2020_selection, prop) %&gt;%
  pivot_wider(names_from = VotedPres2020_selection, values_from = prop) %&gt;%
  gt(rowname_col = &quot;AgeGroup&quot;) %&gt;%
  tab_stubhead(label = &quot;Age Group&quot;)

chi_ex3_obs_table

TABLE 6.6: Distribution of age group by presidential candidate selection with confidence intervals
Age Group | Biden | Trump | Other
18-29 | 0.203 (0.177, 0.229) | 0.113 (0.095, 0.132) | 0.221 (0.144, 0.298)
30-39 | 0.168 (0.152, 0.184) | 0.146 (0.125, 0.168) | 0.302 (0.21, 0.394)
40-49 | 0.163 (0.146, 0.18) | 0.157 (0.137, 0.177) | 0.21 (0.13, 0.29)
50-59 | 0.152 (0.135, 0.17) | 0.229 (0.202, 0.256) | 0.104 (0.04, 0.168)
60-69 | 0.177 (0.159, 0.196) | 0.193 (0.173, 0.213) | 0.103 (0.025, 0.182)
70 or older | 0.136 (0.123, 0.149) | 0.161 (0.143, 0.179) | 0.06 (0.01, 0.109)

We can see that those who voted for Biden or for other candidates skewed younger than those who voted for Trump. For example, of those who voted for Biden, 20.3% were in the 18-29 age group, compared to only 11.3% of those who voted for Trump. On the other side, 22.9% of those who voted for Trump were in the 50-59 age group, compared to only 15.2% of those who voted for Biden.

6.5 Exercises

The exercises use the design objects anes_des and recs_des as provided in the Prerequisites box at the beginning of the chapter. Here are some exercises for practicing conducting t-tests using svyttest():

1. Using the RECS data, do more than 50% of U.S. households use AC (ACUsed)?
2. Using the RECS data, does the average temperature that U.S. households set their thermostats to differ between the day and night in the winter (WinterTempDay and WinterTempNight)?
3. Using the ANES data, does the average age (Age) of those who voted for Joseph Biden in 2020 (VotedPres2020_selection) differ from those who voted for another candidate?
4. If you wanted to determine if the political party affiliation differed for males and females, what test would you use?
   a. Goodness of fit test (svygofchisq())
   b. Test of independence (svychisq())
   c. Test of homogeneity (svychisq())
5. In the RECS data, is there a relationship between the type of housing unit (HousingUnitType) and the year the house was built (YearMade)?
6. In the ANES data, is there a difference in the distribution of gender (Gender) across early voting status in 2020 (EarlyVote2020)?

References

Csardi, Gabor. 2023. prettyunits: Pretty, Human Readable Formatting of Quantities. https://github.com/r-lib/prettyunits.
Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles.
Scott, Alastair. 2007. “Rao-Scott Corrections and Their Impact.” In Section on Survey Research Methods, 3514–18. http://www.asasrms.org/Proceedings/y2007/Files/JSM2007-000874.pdf.
———. 2023a. 
forcats: Tools for Working with Categorical Variables (Factors). For more information on statistical testing, we recommend reviewing introduction to statistics textbooks.↩︎ This could change in the future if another package is built or {srvyr} is expanded to work with {tidymodels} packages but no such plans are known at this time.↩︎ During the summer, what is your home’s typical indoor temperature inside your home at night?↩︎ This is the temperature that Stephanie prefers at night during the summer, and she wanted to see if she was different from the population.↩︎ Is any air conditioning equipment used in your home?↩︎ Is any air conditioning equipment used in your home?↩︎ During the summer, what is your home’s typical indoor temperature inside your home at night?↩︎ During the winter, what is your home’s typical indoor temperature inside your home at night?↩︎ These two statistics can also be used for goodness of fit tests if the svygofchisq() function is not used.↩︎ What is the highest level of school you have completed or the highest degree you have received?↩︎ Data was pulled from data.census.gov using the S1501 Education Attainment 2020: ACS 5-Year Estimates Subject Tables↩︎ "],["c07-modeling.html", "Chapter 7 Modeling 7.1 Introduction 7.2 Analysis of variance (ANOVA) 7.3 Normal linear regression 7.4 Logistic regression 7.5 Exercises", " Chapter 7 Modeling Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(gt) library(prettyunits) We will be using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information). 
targetpop &lt;- 231592693

anes_adjwgt &lt;- anes_2020 %&gt;%
  mutate(Weight = Weight / sum(Weight) * targetpop)

anes_des &lt;- anes_adjwgt %&gt;%
  as_survey_design(
    weights = Weight,
    strata = Stratum,
    ids = VarUnit,
    nest = TRUE
  )

For RECS, details are included in the RECS documentation and Chapters 4 and 10.

recs_des &lt;- recs_2020 %&gt;%
  as_survey_rep(
    weights = NWEIGHT,
    repweights = NWEIGHT1:NWEIGHT60,
    type = &quot;JK1&quot;,
    scale = 59/60,
    mse = TRUE
  )

7.1 Introduction

Modeling data is a way for researchers to investigate the relationship between a single dependent variable and one or more independent variables. This builds upon the analyses conducted in Chapter 6, which looked at the relationships between just two variables. For example, in Example 3 in Section 6.3.2, we investigated if there is a relationship between the electrical bill cost and whether or not the household used air-conditioning. However, there are potentially other elements that could affect the cost of electrical bills in a household (e.g., outside temperature, desired internal temperature, types and number of appliances, etc.). T-tests only allow us to investigate the relationship of one independent variable at a time, but using models, we can look into multiple variables and even explore interactions between these variables.

There are several types of models, but in this chapter we cover analysis of variance (ANOVA) and linear regression models following the common normal (Gaussian) and logit forms. Jonas Kristoffer Lindeløv has an interesting discussion of many statistical tests and models being equivalent to a linear model. For example, a one-way ANOVA is a linear model with one categorical independent variable, and a two-sample t-test is an ANOVA where the independent variable has exactly two levels.

When modeling data, it is helpful to first create an equation that provides an overview as to what it is that we are modeling. 
The main structure of these models is as follows:

\\[y_i=\\beta_0 +\\sum_{j=1}^p \\beta_j x_{ij} + \\epsilon_i\\]

where \\(y_i\\) is the outcome for observation \\(i\\), \\(\\beta_0\\) is an intercept, \\(x_{i1}, \\cdots, x_{ip}\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_p\\) as the associated coefficients, and \\(\\epsilon_i\\) is the error. Not all models will have all components. For example, some models may not include an intercept (\\(\\beta_0\\)), may have interactions between different independent variables (\\(x_{ij}\\)), or may have different underlying structures for the dependent variable (\\(y_i\\)). However, all linear models have the independent variables related to the dependent variable in a linear form.

To specify these models in R, the formulas are the same with both survey data and other data. The left side of the formula is the response/dependent variable, and the right side of the formula has the predictor/independent variable(s). There are many symbols used in R to specify the formula. For example, a linear formula mathematically notated as

\\[y_i=\\beta_0+\\beta_1 x_i+\\epsilon_i\\]

would be specified in R as y~x, where the intercept is not explicitly included. To fit a model with no intercept, that is,

\\[y_i=\\beta_1 x_i+\\epsilon_i\\]

it can be specified in R as y~x-1. Formula notation details in R can be found in the help file for formula (see footnote 23). A quick overview of the common formula notation is in Table 7.1:

TABLE 7.1: Common symbols in formula notation
Symbol | Example | Meaning
+ | +x | include this variable
- | -x | delete this variable
: | x:z | include the interaction between these variables
* | x*z | include these variables and the interactions between them
^n | (x+y+z)^3 | include these variables and all interactions up to n-way
I | I(x-z) | as-is: include a new variable that is calculated inside the parentheses (e.g., x-z, x*z, x/z are possible calculations that could be done)

There are often multiple ways to specify the same formula. 
For example, consider the following equation using the mtcars dataset that is built into R:

\\[mpg_i=\\beta_0+\\beta_1cyl_{i}+\\beta_2disp_{i}+\\beta_3hp_{i}+\\beta_4cyl_{i}disp_{i}+\\beta_5cyl_{i}hp_{i}+\\beta_6disp_{i}hp_{i}+\\epsilon_i\\]

This could be specified in R code as any of the following:

mpg ~ (cyl + disp + hp)^2
mpg ~ cyl + disp + hp + cyl:disp + cyl:hp + disp:hp
mpg ~ cyl*disp + cyl*hp + disp*hp

In the above options, the : and * notations are implemented differently. Using : includes only the interaction among the listed variables and not their main effects or lower-order interactions, while using * includes the main effects and all possible interactions. Table 7.2 provides an overview of the syntax and differences between the two notations.

TABLE 7.2: Differences in formulas for : and * code syntax
Symbol | Syntax | Formula
: | mpg ~ cyl:disp:hp | \\[ mpg_i = \\beta_0+\\beta_7cyl_{i}disp_{i}hp_{i}+\\epsilon_i\\]
* | mpg ~ cyl*disp*hp | \\[ \\begin{aligned} mpg_i= &amp;\\beta_0+\\beta_1cyl_{i}+\\beta_2disp_{i}+\\beta_3hp_{i}+\\\\&amp; \\beta_4cyl_{i}disp_{i}+\\beta_5cyl_{i}hp_{i}+\\beta_6disp_{i}hp_{i}+\\\\&amp;\\beta_7cyl_{i}disp_{i}hp_{i}+\\epsilon_i\\end{aligned}\\]

When using non-survey data, such as experimental or observational data, researchers will use the glm() function for linear models. With survey data, however, we use svyglm() from the {survey} package to ensure that we account for the survey design and weights in modeling (see footnote 24). This allows us to generalize a model to the target population and accounts for the fact that the observations in the survey data may not be independent. As discussed in Chapter 6, modeling survey data cannot be directly done in {srvyr}, but can be done in the {survey} package (Lumley 2010). In this chapter, we will provide syntax and examples for linear models, including ANOVA, normal linear regression, and logistic regression. 
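The equivalence of the three mtcars specifications, and the difference between : and * for a pair of variables, can be checked in base R. The following is a minimal sketch rather than part of the original analysis: it uses lm() on the non-survey mtcars data purely to inspect how R expands each formula, and the object names (f1, labels_f1, fit1, and so on) are chosen here for illustration.

```r
# Three ways of writing the same model with all two-way interactions
f1 <- mpg ~ (cyl + disp + hp)^2
f2 <- mpg ~ cyl + disp + hp + cyl:disp + cyl:hp + disp:hp
f3 <- mpg ~ cyl*disp + cyl*hp + disp*hp

# terms() expands a formula; term.labels lists the terms R will fit
labels_f1 <- attr(terms(f1), 'term.labels')
labels_f2 <- attr(terms(f2), 'term.labels')
labels_f3 <- attr(terms(f3), 'term.labels')

# The same set of terms appears under all three specifications
setequal(labels_f1, labels_f2)
setequal(labels_f1, labels_f3)

# Fitting each formula gives the same coefficients
fit1 <- lm(f1, data = mtcars)
fit2 <- lm(f2, data = mtcars)
fit3 <- lm(f3, data = mtcars)
all.equal(coef(fit1), coef(fit2)[names(coef(fit1))])
all.equal(coef(fit1), coef(fit3)[names(coef(fit1))])

# With two variables, : adds only the interaction term,
# while * adds both main effects and the interaction
only_interaction <- attr(terms(mpg ~ cyl:disp), 'term.labels')
main_and_interaction <- attr(terms(mpg ~ cyl*disp), 'term.labels')
only_interaction
main_and_interaction
```

Inspecting term.labels before fitting is a quick way to confirm that a formula includes exactly the intended terms, and the same check applies to formulas passed to svyglm().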
For details on other types of regression, including ordinal regression, log-linear models, and survival analysis, refer to Lumley (2010). Lumley (2010) also discusses custom models such as a negative binomial or Poisson model in Appendix E of his book.

7.2 Analysis of variance (ANOVA)

In ANOVA, we are testing whether the mean of an outcome is the same across two or more groups. Statistically, we set this up as follows:

\\(H_0: \\mu_1 = \\mu_2= \\dots = \\mu_k\\) where \\(\\mu_j\\) is the mean outcome for group \\(j\\)
\\(H_A: \\text{At least one mean is different}\\)

Because an ANOVA test is also a linear model, we can reframe the problem as:

\\[ y_i=\\sum_{j=1}^k \\mu_j x_{ij} + \\epsilon_i\\]

where \\(x_{ij}\\) is an indicator of whether observation \\(i\\) belongs to group \\(j\\), for groups \\(1, \\cdots, k\\).

Some assumptions when using ANOVA on survey data include:

- The outcome variable is normally distributed within each group
- The variances of the outcome variable between each group are approximately equal
- We do NOT assume independence between the groups as with ANOVA on non-survey data. The covariance is accounted for in the survey design

7.2.1 Syntax

To perform this type of analysis in R, the general syntax is as follows:

des_obj %&gt;%
  svyglm(
    formula = outcome ~ group,
    design = .,
    na.action = na.omit,
    df.resid = NULL
  )

The arguments are:

- formula: formula in the form of outcome~group. The group variable must be a factor or character.
- design: a tbl_svy object created by as_survey
- na.action: handling of missing data
- df.resid: degrees of freedom for Wald tests (optional); defaults to using degf(design)-(g-1), where \\(g\\) is the number of groups

The function svyglm() does not have the design as the first argument, so the dot (.) notation is used to pass it with a pipe (see Chapter 6 for more details). The default for missing data is na.omit, which means that we remove all records with any missing data in either predictors or outcomes from analyses. 
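As a minimal sketch of this syntax on publicly available data, the code below fits a one-way ANOVA with svyglm() using the apistrat example data that ships with the {survey} package. This is an illustration under stated assumptions, not the RECS analysis: the design is built with svydesign() rather than as a tbl_svy object to keep dependencies small, the outcome (api00) and group (stype) variables come from that example dataset, and the object names dstrat and anova_sketch are hypothetical.

```r
library(survey)

# Example data shipped with {survey}: a stratified sample of California schools
data(api)

# Stratified design: strata are school types, pw holds the sampling weights
dstrat <- svydesign(
  ids = ~1,
  strata = ~stype,
  weights = ~pw,
  fpc = ~fpc,
  data = apistrat
)

# One-way ANOVA as a linear model: mean api00 by school type (a factor)
anova_sketch <- svyglm(
  formula = api00 ~ stype,
  design = dstrat
)

# The intercept is the mean for the reference group; the remaining
# coefficients are differences from that reference group
coef(anova_sketch)

# Wald test of the overall hypothesis that all group means are equal
regTermTest(anova_sketch, ~stype)
```

The individual coefficients compare each group with the reference level, while regTermTest() addresses the overall null hypothesis that all group means are equal.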
There are other options for handling missing data, and we recommend looking at the help documentation for na.omit (run help(na.omit) or ?na.omit) for more information on options to use for na.action. For a discussion of how to handle missing data, see Chapter 11.

7.2.2 Example

Looking at an example will help us discuss the output and how to interpret the results. In RECS, respondents are asked what temperature they set their thermostat to during the day and evening when using the air-conditioning during the summer. To analyze these data, we filter the respondents to only those using AC (ACUsed). Then, if we want to see if there are differences by region, we can use group_by(). A descriptive analysis of the temperature at night (SummerTempNight) set by region and the sample sizes is displayed below.

recs_des %&gt;%
  filter(ACUsed) %&gt;%
  group_by(Region) %&gt;%
  summarize(
    SMN = survey_mean(SummerTempNight, na.rm = TRUE),
    n = unweighted(n()),
    n_na = unweighted(sum(is.na(SummerTempNight)))
  )

## # A tibble: 4 × 5
##   Region      SMN SMN_se     n  n_na
##   &lt;fct&gt;     &lt;dbl&gt;  &lt;dbl&gt; &lt;int&gt; &lt;int&gt;
## 1 Northeast  69.7 0.103   3204     0
## 2 Midwest    71.0 0.0897  3619     0
## 3 South      71.8 0.0536  6065     0
## 4 West       72.5 0.129   3283     0

In the following code, we test whether this temperature varies by region by first using svyglm() to run the test and then using broom::tidy() to display the output. Note that the temperature setting is set to NA when the household does not use air-conditioning, and since the default handling of NAs is na.action=na.omit, records that do not use air-conditioning will not be included in this regression.

anova_out &lt;- recs_des %&gt;%
  svyglm(design = ., formula = SummerTempNight ~ Region)

tidy(anova_out)

## # A tibble: 4 × 5
##   term          estimate std.error statistic   p.value
##   &lt;chr&gt;            &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;
## 1 (Intercept)      69.7      0.103    674.   3.69e-111
## 2 RegionMidwest     1.34     0.138      9.68 1.46e- 13
## 3 RegionSouth       2.05     0.128     16.0  1.36e- 22
## 4 RegionWest        2.80     0.177     15.9  2.27e- 22

In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. In this output, the intercept represents the reference value of the Northeast region. The other coefficients indicate the difference in temperature relative to the Northeast region. For example, in the Midwest, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees higher than in the Northeast during summer nights. Each of the other regions sets its thermostats at significantly higher temperatures than the Northeast.

If we wanted to change the reference value, we would reorder the factor before modeling using the function relevel() from {stats} or using one of many factor ordering functions in {forcats} such as fct_relevel() or fct_infreq(). For example, if we wanted the reference level to be the Midwest region, we could use the following code. Note the usage of the gt() function on top of tidy() to print a nice-looking output table (Iannone et al. 2023; Robinson, Hayes, and Couch 2023); we will go over more usage of the {gt} package in Chapter 8.
anova_out_relevel &lt;- recs_des %&gt;%
  mutate(Region = fct_relevel(Region, &quot;Midwest&quot;, after = 0)) %&gt;%
  svyglm(design = ., formula = SummerTempNight ~ Region)

tidy(anova_out_relevel) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 7.3: ANOVA output for estimates of thermostat temperature setting at night by region with Midwest as the reference region, RECS 2020

term              estimate   std.error   statistic   p.value
(Intercept)       71.04      0.09        791.83      &lt;0.0001
RegionNortheast   −1.34      0.14        −9.68       &lt;0.0001
RegionSouth       0.71       0.10        6.91        &lt;0.0001
RegionWest        1.47       0.16        9.17        &lt;0.0001

This output now has the coefficients indicating the difference in temperature relative to the Midwest region. For example, in the Northeast, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees lower than in the Midwest during summer nights, while the South and West set their thermostats at significantly higher temperatures than the Midwest. The Northeast estimate is the reverse of what we saw in the prior model, as we are still comparing the same two regions, just from a different reference point.
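The text above mentions relevel() from {stats} as an alternative to fct_relevel(). The sketch below illustrates that route and the sign flip between reference levels. It rests on loud assumptions: the data frame toy and its values are made up for illustration (they are not RECS data), and lm() is an unweighted stand-in for svyglm() so that the example runs without a survey design object.

```r
# Hypothetical toy data (not RECS): night-time thermostat settings in three regions
toy <- data.frame(
  Region = factor(
    rep(c("Northeast", "Midwest", "South"), each = 3),
    levels = c("Northeast", "Midwest", "South")
  ),
  temp = c(69, 70, 70, 71, 71, 71, 72, 72, 71)
)

# Northeast is the first factor level, so it is the reference group
fit_ne <- lm(temp ~ Region, data = toy)

# relevel() moves the chosen level to the front, making Midwest the reference
toy$Region_mw <- relevel(toy$Region, ref = "Midwest")
levels(toy$Region_mw)

fit_mw <- lm(temp ~ Region_mw, data = toy)

# The Midwest-vs-Northeast difference has the same size in both fits, with opposite sign
diff_from_ne <- coef(fit_ne)[["RegionMidwest"]]
diff_from_mw <- coef(fit_mw)[["Region_mwNortheast"]]
c(diff_from_ne = diff_from_ne, diff_from_mw = diff_from_mw)
```

Changing the reference level changes only how the group differences are expressed; the fitted group means are the same in both models.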
7.3 Normal linear regression

Normal linear regression is a more generalized method than ANOVA: we fit a model of a continuous outcome with any number of categorical or continuous predictors, whereas ANOVA only has categorical predictors. It is similarly specified as:

\\[\\begin{equation} y_i=\\beta_0 +\\sum_{j=1}^p \\beta_j x_{ij} + \\epsilon_i \\end{equation}\\]

where \\(y_i\\) is the outcome, \\(\\beta_0\\) is an intercept, \\(x_{i1}, \\cdots, x_{ip}\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_p\\) as the associated coefficients, and \\(\\epsilon_i\\) is the error.

Assumptions in normal linear regression using survey data include:

The residuals (\\(\\epsilon_i\\)) are normally distributed, but there is not an assumption of independence, and the correlation structure is captured in the survey design object
There is a linear relationship between the outcome variable and the independent variables
The residuals are homoscedastic; that is, the variance of the error term is the same across all values of the independent variables

7.3.1 Syntax

The syntax for this regression uses the same function as ANOVA, but can have more than one variable listed on the right-hand side of the formula:

des_obj %&gt;%
  svyglm(
    formula = outcomevar ~ x1 + x2 + x3,
    design = .,
    na.action = na.omit,
    df.resid = NULL
  )

The arguments are:

formula: Formula in the form of y~x
design: a tbl_svy object created by as_survey
na.action: handling of missing data
df.resid: degrees of freedom for Wald tests (optional); defaults to using degf(design)-p, where \\(p\\) is the rank of the design matrix

As discussed in Section 7.1, the right-hand side of the formula can be specified in many ways, for example, with or without interactions.

7.3.2 Examples

Example 1: Linear regression with single variable

In RECS, we can obtain information on the square footage of homes and the electric bills. We assume that square footage is related to the amount of money spent on electricity and examine a model for this.
Before any modeling, we first plot the data to determine whether it is reasonable to assume a linear relationship. In Figure 7.1, each hexagon represents the weighted count of households in the bin, and we can see a general positive linear trend (as the square footage increases so does the amount of money spent on electricity).

recs_2020 %&gt;%
  ggplot(aes(
    x = TOTSQFT_EN,
    y = DOLLAREL,
    weight = NWEIGHT / 1000000
  )) +
  geom_hex() +
  scale_fill_gradientn(
    guide = &quot;colorbar&quot;,
    name = &quot;Housing Units\\n(Millions)&quot;,
    labels = scales::comma,
    colors = book_colors[c(3, 2, 1)]
  ) +
  xlab(&quot;Total square footage&quot;) +
  ylab(&quot;Amount spent on electricity&quot;) +
  scale_y_continuous(labels = scales::dollar_format()) +
  scale_x_continuous(labels = scales::comma_format()) +
  theme_minimal()

FIGURE 7.1: Relationship between square footage and dollars spent on electricity, RECS 2020

Given that the plot shows a potential increasing relationship between square footage and electricity expenditure, fitting a model will allow us to determine if the relationship is statistically significant. The model is fit below with electricity expenditure as the outcome.
m_electric_sqft &lt;- recs_des %&gt;%
  svyglm(design = ., formula = DOLLAREL ~ TOTSQFT_EN, na.action = na.omit)

tidy(m_electric_sqft) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 7.4: Linear regression output predicting electricity expenditure given square footage, RECS 2020

term          estimate   std.error   statistic   p.value
(Intercept)   836.72     12.77       65.51       &lt;0.0001
TOTSQFT_EN    0.30       0.01        41.67       &lt;0.0001

In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. In these results, we can say that, on average, for every additional square foot of house size, the electricity bill increases by 29.9 cents (shown as 0.30 in Table 7.4 after rounding to two decimal places), and that square footage is significantly associated with electricity expenditure (p-value&lt;0.0001).

This is a very simple model, and there are likely many more factors related to electricity expenditure, including the type of cooling, number of appliances, location, and more. However, starting with one-variable models can help researchers understand what potential relationships there are between variables before fitting more complex models. Often, researchers start with known relationships before building models to determine what impact additional variables have on the model.
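To make the interpretation of the slope concrete, the following sketch plugs a few home sizes into the fitted line. It is only an illustration under stated assumptions: the intercept and slope are the rounded estimates quoted in the text above (836.72 dollars and 29.9 cents per square foot) rather than values extracted from the model object, and the home sizes in sqft are hypothetical.

```r
# Rounded estimates copied from the text above (not re-estimated here)
intercept <- 836.72  # dollars
slope <- 0.299       # dollars per additional square foot (29.9 cents)

# Hypothetical home sizes chosen for illustration
sqft <- c(1000, 2000, 3000)

# Predicted electricity expenditure under the fitted line
predicted <- intercept + slope * sqft
data.frame(sqft = sqft, predicted_dollars = predicted)
```

Under these rounded estimates, each additional 1,000 square feet corresponds to roughly 299 dollars more in predicted electricity expenditure, for example about 1,135.72 dollars at 1,000 square feet and 1,434.72 dollars at 2,000 square feet.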
Example 2: Linear regression with multiple variables and interactions

In the following example, a model is fit to predict electricity expenditure, including Census region (factor/categorical), urbanicity (factor/categorical), square footage (double/numeric), and whether air-conditioning is used (logical/categorical), with all two-way interactions also included. In this example, we are choosing to fit this model without an intercept (using -1 in the formula). This will result in an intercept estimate for each region instead of a single intercept for all data.

m_electric_multi &lt;- recs_des %&gt;%
  svyglm(
    design = .,
    formula = DOLLAREL ~ (Region + Urbanicity + TOTSQFT_EN + ACUsed)^2 - 1,
    na.action = na.omit
  )

tidy(m_electric_multi) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 7.5: Linear regression output predicting electricity expenditure given region, urbanicity, square footage, air conditioning usage, and two-way interactions, RECS 2020

term   estimate   std.error   statistic   p.value
RegionNortheast   543.73   56.57   9.61   &lt;0.0001
RegionMidwest   702.16   78.12   8.99   &lt;0.0001
RegionSouth   938.74   46.99   19.98   &lt;0.0001
RegionWest   603.27   36.31   16.61   &lt;0.0001
UrbanicityUrban Cluster   73.03   81.50   0.90   0.3764
UrbanicityRural   204.13   80.69   2.53   0.0161
TOTSQFT_EN   0.24   0.03   8.65   &lt;0.0001
ACUsedTRUE   252.06   54.05   4.66   &lt;0.0001
RegionMidwest:UrbanicityUrban Cluster   183.06   82.38   2.22   0.0328
RegionSouth:UrbanicityUrban Cluster   152.56   76.03   2.01   0.0526
RegionWest:UrbanicityUrban Cluster   98.02   75.16   1.30   0.2007
RegionMidwest:UrbanicityRural   312.83   50.88   6.15   &lt;0.0001
RegionSouth:UrbanicityRural   220.00   55.00   4.00   0.0003
RegionWest:UrbanicityRural   180.97   58.70   3.08   0.0040
RegionMidwest:TOTSQFT_EN   −0.05   0.02   −2.09   0.0441
RegionSouth:TOTSQFT_EN   0.00   0.03   0.11   0.9109
RegionWest:TOTSQFT_EN   −0.03   0.03   −1.00   0.3254
RegionMidwest:ACUsedTRUE   −292.97   60.24   −4.86   &lt;0.0001
RegionSouth:ACUsedTRUE   −294.07   57.44   −5.12   &lt;0.0001
RegionWest:ACUsedTRUE   −77.68   47.05   −1.65   0.1076
UrbanicityUrban Cluster:TOTSQFT_EN   −0.04   0.02   −1.63   0.1112
UrbanicityRural:TOTSQFT_EN   −0.06   0.02   −2.60   0.0137
UrbanicityUrban Cluster:ACUsedTRUE   −130.23   60.30   −2.16   0.0377
UrbanicityRural:ACUsedTRUE   −33.80   59.30   −0.57   0.5724
TOTSQFT_EN:ACUsedTRUE   0.08   0.02   3.48   0.0014

As shown above, there are many terms in this model. To test whether coefficients for a term are different from zero, the function regTermTest() can be used. For example, in the above regression, we can test whether the interaction of region and urbanicity is significant as follows:

urb_reg_test &lt;- regTermTest(m_electric_multi, ~Urbanicity:Region)
urb_reg_test

## Wald test for Urbanicity:Region
##  in svyglm(design = ., formula = DOLLAREL ~ (Region + Urbanicity +
##     TOTSQFT_EN + ACUsed)^2 - 1, na.action = na.omit)
## F = 6.851 on 6 and 35 df: p= 7.2e-05

This output indicates there is a significant interaction between urbanicity and region (p-value&lt;0.0001).

To examine the predictions, residuals, and more from the model, the function augment() from {broom} can be used. The augment() function will return a tibble with the independent and dependent variables and other fit statistics. The augment() function has not currently been written specifically for objects of class svyglm, so a warning indicating this is displayed. As it was not written exactly for this class of objects, a little tweaking needs to be done after using augment().
To obtain the standard error of the predicted values (.se.fit), we need to use the attr() function on the predicted values (.fitted) created by augment(). Additionally, the predicted values are output as a svrep type of data. If we want to plot the predicted values, we need to use as.numeric() to get them into a numeric format. However, it is important to note that this conversion must be completed after the standard error adjustment, because as.numeric() drops the attributes that the standard error is computed from.
fitstats &lt;- augment(m_electric_multi) %&gt;%
  mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
         .fitted = as.numeric(.fitted))
fitstats
## # A tibble: 18,496 × 13
## DOLLAREL Region Urbanicity TOTSQFT_EN ACUsed `(weights)` .fitted
## &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 1955. West Urban Area 2100 TRUE 0.492 1397.
## 2 713. South Urban Area 590 TRUE 1.35 1090.
## 3 335. West Urban Area 900 TRUE 0.849 1043.
## 4 1425. South Urban Area 2100 TRUE 0.793 1584.
## 5 1087 Northeast Urban Area 800 TRUE 1.49 1055.
## 6 1896. South Urban Area 4520 TRUE 1.09 2375.
## 7 1418. South Urban Area 2100 TRUE 0.851 1584.
## 8 1237. South Urban Clust… 900 FALSE 1.45 1349.
## 9 538. South Urban Area 750 TRUE 0.185 1142.
## 10 625. West Urban Area 760 TRUE 1.06 1002.
## # ℹ 18,486 more rows
## # ℹ 6 more variables: .resid &lt;dbl&gt;, .hat &lt;dbl&gt;, .sigma &lt;dbl&gt;,
## # .cooksd &lt;dbl&gt;, .std.resid &lt;dbl&gt;, .se.fit &lt;dbl&gt;
These results can then be used in a variety of ways, including examining residual plots as illustrated in the code below and Figure 7.2. In the residual plot, we look for any patterns in the data. If we do see patterns, this may indicate a violation of the homoscedasticity assumption, and the standard errors of the coefficients may be incorrect. In Figure 7.2, we do not see a strong pattern, indicating that our assumption of homoscedasticity may hold.
fitstats %&gt;%
  ggplot(aes(x = .fitted, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, color = &quot;red&quot;) +
  theme_minimal() +
  xlab(&quot;Fitted value of electricity cost&quot;) +
  ylab(&quot;Residual of model&quot;) +
  scale_y_continuous(labels = scales::dollar_format()) +
  scale_x_continuous(labels = scales::dollar_format())
FIGURE 7.2: Residual plot of electric cost model with covariates Region, Urbanicity, TOTSQFT_EN, and ACUsed
Additionally, augment() can be used to predict outcomes for data not used in modeling. Perhaps we would like to predict the energy expenditure for a home in an urban area in the South that uses air conditioning and is 2,500 square feet. To do this, we first make a tibble including that additional data and then use the newdata argument in the augment() function. As before, to obtain the standard error of the predicted values, we need to use the attr() function.
add_data &lt;- recs_2020 %&gt;%
  select(DOEID, Region, Urbanicity, TOTSQFT_EN, ACUsed, DOLLAREL) %&gt;%
  rbind(
    tibble(
      DOEID = NA,
      Region = &quot;South&quot;,
      Urbanicity = &quot;Urban Area&quot;,
      TOTSQFT_EN = 2500,
      ACUsed = TRUE,
      DOLLAREL = NA
    )
  ) %&gt;%
  tail(1)
pred_data &lt;- augment(m_electric_multi, newdata = add_data) %&gt;%
  mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
         .fitted = as.numeric(.fitted))
pred_data
## # A tibble: 1 × 8
## DOEID Region Urbanicity TOTSQFT_EN ACUsed DOLLAREL .fitted .se.fit
## &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 NA South Urban Area 2500 TRUE NA 1715. 22.6
In the above example, the predicted energy expenditure is $1,715.
7.4 Logistic regression
Logistic regression is used to model binary outcomes, such as whether or not someone voted. There are several instances where an outcome may not be originally binary but is collapsed into being binary.
For example, given that gender is often asked in surveys with multiple response options and not a binary scale, many researchers now code gender in logistic modeling as cis-male compared to not cis-male. We could also convert a 4-point Likert scale that has levels of “Strongly Agree”, “Agree”, “Disagree”, and “Strongly Disagree” into a binary variable by grouping the agreement levels into one group and the disagreement levels into a second group.
Logistic regression is a specific case of the generalized linear model (GLM). A GLM uses a link function to link the response variable to the linear model. If we tried to use a normal linear regression with a binary outcome, many assumptions would not hold, namely that the response is continuous. Logistic regression allows us to link a linear model between the covariates and a propensity of an outcome. In logistic regression, the link function is the logit function. Specifically, the model is specified as follows:
\\[ y_i \\sim \\text{Bernoulli}(\\pi_i)\\]
\\[\\begin{equation} \\log \\left(\\frac{\\pi_i}{1-\\pi_i} \\right)=\\beta_0 +\\sum_{i=1}^n \\beta_i x_i \\end{equation}\\]
which can be re-expressed as
\\[ \\pi_i=\\frac{\\exp \\left(\\beta_0 +\\sum_{i=1}^n \\beta_i x_i \\right)}{1+\\exp \\left(\\beta_0 +\\sum_{i=1}^n \\beta_i x_i \\right)}.\\]
where \\(y_i\\) is the outcome, \\(\\beta_0\\) is an intercept, and \\(x_1, \\cdots, x_n\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_n\\) as the associated coefficients.
The Bernoulli distribution has an outcome of 0 or 1 given some probability (\\(\\pi_i\\) in this case), and we model \\(\\pi_i\\) as a function of the covariates \\(x_i\\) using this logit link.
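As a minimal sketch of how the logit link behaves, the base R functions qlogis() and plogis() compute the logit and its inverse. The intercept, slope, and predictor values below are made up purely for illustration and do not come from any model in this chapter.

```r
# Hypothetical coefficients and predictor values (illustration only)
beta_0 <- -2
beta_1 <- 0.5
x <- c(0, 2, 4, 6)

# Linear predictor on the log-odds (logit) scale
eta <- beta_0 + beta_1 * x

# Inverse logit: pi = exp(eta) / (1 + exp(eta))
pi_hat <- plogis(eta)

# Applying the logit to pi recovers the linear predictor
log_odds <- qlogis(pi_hat)

data.frame(x, eta, pi_hat, log_odds)
```

Whatever value the linear predictor takes, the inverse logit maps it to a value between 0 and 1, which is why the logit is a convenient link for a Bernoulli outcome.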
Assumptions in logistic regression using survey data include:
The outcome variable has two levels
There is a linear relationship between the independent variables and the log odds (\\(\\log \\left(\\frac{\\pi_i}{1-\\pi_i} \\right)\\))
The residuals are homoscedastic, that is, the error term is the same across all values of independent variables
7.4.1 Syntax
The syntax for logistic regression is as follows:
des_obj %&gt;%
  svyglm(
    formula = outcomevar ~ x1 + x2 + x3,
    design = .,
    na.action = na.omit,
    df.resid = NULL,
    family = quasibinomial
  )
The arguments are:
formula: Formula in the form of y~x
design: a tbl_svy object created by as_survey
na.action: handling of missing data
df.resid: degrees of freedom for Wald tests (optional); defaults to using degf(design)-p where \\(p\\) is the rank of the design matrix
family: the error distribution/link function to be used in the model
Note that svyglm() is the same function used in both ANOVA and normal linear regression. However, we’ve added the family quasibinomial. While we can use the binomial family, it is recommended to use the quasibinomial family, as our weights may not be integers, and the quasibinomial also allows for overdispersion (Lumley 2010; McCullagh and Nelder 1989; R Core Team 2023). The quasibinomial family has a default logit link, which is what is specified in the equations above. When specifying the outcome variable, it will likely be specified in one of three ways with survey data:
A two-level factor variable where the first level of the factor indicates a “failure” and the second level indicates a “success”
A numeric variable which is 1 or 0 where 1 indicates a success
A logical variable where TRUE indicates a success
7.4.2 Examples
Example 1: Logistic regression with single variable
In the following example, the ANES data is used, and we are modeling whether someone usually has trust in the government based on who they voted for in the 2020 presidential election.
As a reminder, the leading candidates were Biden and Trump, though people could vote for someone else not in the Democratic or Republican parties. Those votes are all grouped into an “Other” category. We first create a binary outcome for trusting in the government by collapsing “Always” and “Most of the time” into a single factor level, and the other response options (“About half the time”, “Some of the time”, and “Never”) into a second factor level. Next, a scatter plot of the raw data is not useful as it is all 0 and 1 outcomes, so instead, we plot a summary of the data.
anes_des_der &lt;- anes_des %&gt;%
  mutate(TrustGovernmentUsually = case_when(
    is.na(TrustGovernment) ~ NA,
    TRUE ~ TrustGovernment %in% c(&quot;Always&quot;, &quot;Most of the time&quot;)
  ))
anes_des_der %&gt;%
  group_by(VotedPres2020_selection) %&gt;%
  summarize(pct_trust = survey_mean(TrustGovernmentUsually,
                                    na.rm = TRUE,
                                    proportion = TRUE,
                                    vartype = &quot;ci&quot;),
            .groups = &quot;drop&quot;) %&gt;%
  filter(complete.cases(.)) %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust,
             fill = VotedPres2020_selection)) +
  geom_bar(stat = &quot;identity&quot;) +
  geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp),
                width = .2) +
  scale_fill_manual(values = c(&quot;#0b3954&quot;, &quot;#bfd7ea&quot;, &quot;#8d6b94&quot;)) +
  xlab(&quot;Election choice (2020)&quot;) +
  ylab(&quot;Usually trust the government&quot;) +
  scale_y_continuous(labels = scales::percent) +
  guides(fill = &quot;none&quot;) +
  theme_minimal()
FIGURE 7.3: Relationship between candidate selection and trust in government, ANES 2020
By looking at Figure 7.3, it appears that people who voted for Trump are more likely to say that they usually have trust in the government compared to those who voted for Biden and Other candidates. To determine if this insight is accurate, we next fit the model.
logistic_trust_vote &lt;- anes_des_der %&gt;%
  svyglm(design = .,
         formula = TrustGovernmentUsually ~ VotedPres2020_selection,
         family = quasibinomial)
tidy(logistic_trust_vote) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.6: Logistic regression output predicting trust in government by presidential candidate selection, ANES 2020
term estimate std.error statistic p.value
(Intercept) −1.96 0.07 −27.45 &lt;0.0001
VotedPres2020_selectionTrump 0.43 0.09 4.72 &lt;0.0001
VotedPres2020_selectionOther −0.65 0.44 −1.49 0.1429
In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. This output indicates that the log odds of usually trusting the government are 0.435 higher for respondents who voted for Trump than for those who voted for Biden (the reference level).
It is often easier to talk about the odds instead of the log odds. To do this, we need to exponentiate the coefficients. We can use the same tidy() function, but include the argument exponentiate = TRUE to see the results on the odds scale.
tidy(logistic_trust_vote, exponentiate = TRUE) %&gt;%
  select(term, estimate) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.7: Logistic regression predicting trust in government by presidential candidate selection with exponentiated coefficients (odds), ANES 2020
term estimate
(Intercept) 0.14
VotedPres2020_selectionTrump 1.54
VotedPres2020_selectionOther 0.52
We can interpret this as saying that the odds of usually trusting the government for someone who voted for Trump are 154% of the odds for a person who voted for Biden (the reference level). In comparison, the odds for a person who voted for neither Biden nor Trump are 52% of the odds for someone who voted for Biden.
As with linear regression, augment() can be used to predict values. By default, the prediction is on the scale of the link function (the logit function in this instance) and not the probability.
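The relationship between the log-odds, odds, and probability scales can be checked by hand. The sketch below uses only base R and the rounded estimates printed in Table 7.6, so its results match the fitted model only approximately; the object names are ours and are not part of the analysis code.

```r
# Rounded coefficients copied from Table 7.6 (log-odds scale)
b_intercept <- -1.96  # Biden voters (reference level)
b_trump <- 0.43
b_other <- -0.65

# Exponentiating moves from the log-odds scale to the odds scale
odds_scale <- exp(c(Intercept = b_intercept, Trump = b_trump, Other = b_other))
round(odds_scale, 2)

# The inverse logit, plogis(), turns each group's log odds into a probability
prob_scale <- plogis(c(
  Biden = b_intercept,
  Trump = b_intercept + b_trump,
  Other = b_intercept + b_other
))
round(prob_scale, 3)
```

The rounded odds agree with Table 7.7 (0.14, 1.54, and 0.52), and the probabilities come out at roughly 0.12 for Biden voters, 0.18 for Trump voters, and 0.07 for Other voters.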
To predict the probability, add an argument of type.predict=\"response\" as demonstrated below:
logistic_trust_vote %&gt;%
  augment(type.predict = &quot;response&quot;) %&gt;%
  mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
         .fitted = as.numeric(.fitted)) %&gt;%
  select(TrustGovernmentUsually, VotedPres2020_selection, .fitted, .se.fit)
## # A tibble: 6,212 × 4
## TrustGovernmentUsually VotedPres2020_selection .fitted .se.fit
## &lt;lgl&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 FALSE Other 0.0681 0.0279
## 2 FALSE Biden 0.123 0.00772
## 3 FALSE Biden 0.123 0.00772
## 4 FALSE Trump 0.178 0.00919
## 5 FALSE Biden 0.123 0.00772
## 6 FALSE Trump 0.178 0.00919
## 7 FALSE Biden 0.123 0.00772
## 8 FALSE Biden 0.123 0.00772
## 9 TRUE Biden 0.123 0.00772
## 10 FALSE Biden 0.123 0.00772
## # ℹ 6,202 more rows
Example 2: Interaction effects
Let’s look at another example with interaction effects. If we’re interested in understanding the demographics of people who voted for Biden among all voters in 2020, we could include EarlyVote2020 and Gender in our model. First, we need to subset the data to 2020 voters and then create an indicator for having voted for Biden.
anes_des_ind &lt;- anes_des %&gt;%
  filter(!is.na(VotedPres2020_selection)) %&gt;%
  mutate(VoteBiden = case_when(VotedPres2020_selection == &quot;Biden&quot; ~ 1,
                               TRUE ~ 0))
Let’s first look at the main effects of gender and early voting behavior.
log_biden_main &lt;- anes_des_ind %&gt;%
  mutate(EarlyVote2020 = fct_relevel(EarlyVote2020, &quot;No&quot;, after = 0)) %&gt;%
  svyglm(design = ., formula = VoteBiden ~ EarlyVote2020 + Gender, family = quasibinomial)
tidy(log_biden_main) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.8: Logistic regression output for predicting voting for Biden given early voting behavior and gender - main effects only, ANES 2020
term                estimate  std.error  statistic  p.value
(Intercept)         −0.31     0.27       −1.15      0.2553
EarlyVote2020Yes    0.53      0.35       1.53       0.1338
GenderFemale        0.96      0.26       3.73       0.0005
This main effect model indicates that, for respondents who voted early in 2020, the log-odds of voting for Biden are 0.528 higher (an odds ratio of about exp(0.528) ≈ 1.70) than for respondents who did not vote early in the 2020 election (the reference level), although this difference is not statistically significant (p-value=0.1338). We see that gender is significant, with females more likely to vote for Biden compared to males (p-value=0.0005).
It is possible that there is an interaction between gender and early voting behavior.
To determine this, we can create a model that includes the interaction effects:
log_biden_int &lt;- anes_des_ind %&gt;%
  mutate(EarlyVote2020 = fct_relevel(EarlyVote2020, &quot;No&quot;, after = 0)) %&gt;%
  svyglm(design = ., formula = VoteBiden ~ (EarlyVote2020 + Gender)^2, family = quasibinomial)
tidy(log_biden_int) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.9: Logistic regression output for predicting voting for Biden given early voting behavior and gender - with interaction, ANES 2020
term                             estimate  std.error  statistic  p.value
(Intercept)                      −0.20     0.36       −0.55      0.5844
EarlyVote2020Yes                 0.38      0.47       0.80       0.4277
GenderFemale                     0.76      0.54       1.42       0.1625
EarlyVote2020Yes:GenderFemale    0.27      0.60       0.45       0.6583
The results from the interaction model show that the interaction between early voting behavior and gender is not significant (p-value=0.6583). To better understand what this interaction means, we can plot the predicted probabilities with an interaction plot. Let’s first obtain the predicted probabilities for each possible combination of variables using the augment() function.
log_biden_pred &lt;- log_biden_int %&gt;%
  augment(type.predict = &quot;response&quot;) %&gt;%
  mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
         .fitted = as.numeric(.fitted)) %&gt;%
  select(VoteBiden, EarlyVote2020, Gender, .fitted, .se.fit)
In an interaction plot, the y-axis shows the predicted probabilities, one of the predictors is on the x-axis, and the other is represented by multiple lines.
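Before plotting, it can help to see where the predicted probabilities come from. The sketch below rebuilds them by hand from the rounded coefficients in Table 7.9 using base R only. The object names (b, cells) are hypothetical, and it assumes that Male and No are the reference levels, as the term names in the table suggest.

```r
# Coefficients copied from Table 7.9, rounded to two decimals (illustrative names)
b <- c(intercept = -0.20, early = 0.38, female = 0.76, early_female = 0.27)

# All four combinations of the two predictors, coded as 0/1 indicators
cells <- expand.grid(early = c(0, 1), female = c(0, 1))

# Linear predictor on the logit scale, including the interaction term
cells$logit <- b[['intercept']] +
  b[['early']] * cells$early +
  b[['female']] * cells$female +
  b[['early_female']] * cells$early * cells$female

# plogis() is the inverse logit: exp(x) / (1 + exp(x))
cells$prob <- plogis(cells$logit)
cells
```

With these rounded inputs, the four probabilities come out to roughly 0.45, 0.54, 0.64, and 0.77 (male/no, male/yes, female/no, female/yes), which should be close to the distinct .fitted values that augment() returns.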
Figure 7.4 shows the interaction plot with gender on the x-axis and early voting behavior represented by the lines.
log_biden_pred %&gt;%
  filter(VoteBiden == 1) %&gt;%
  distinct() %&gt;%
  arrange(Gender, EarlyVote2020) %&gt;%
  mutate(EarlyVote2020 = fct_reorder2(EarlyVote2020, Gender, .fitted)) %&gt;%
  ggplot(aes(x = Gender, y = .fitted, group = EarlyVote2020,
             color = EarlyVote2020, linetype = EarlyVote2020)) +
  geom_line(linewidth = 1.1) +
  scale_color_manual(values = book_colors[c(2, 4)]) +
  ylab(&quot;Predicted Probability of Voting for Biden&quot;) +
  labs(color = &quot;Voted Early&quot;, linetype = &quot;Voted Early&quot;) +
  coord_cartesian(ylim = c(0, 1)) +
  guides(fill = &quot;none&quot;) +
  theme_minimal()
FIGURE 7.4: Interaction Plot of Gender and Early Voting Predicting the Probability of Voting for Biden
From this plot, we can see that respondents who indicated a male gender had roughly the same probability of voting for Biden regardless of whether they voted early. However, females who voted early were more likely to vote for Biden than females who did not vote early.
Interactions in models can be difficult to understand from the coefficients alone. Using these interaction plots can help others understand the nuances of the results, and they often become even more helpful when a factor has more than two levels (e.g., education or race/ethnicity).
7.5 Exercises
1. The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (HousingUnitType) and total energy expenditure (TOTALDOL)? First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common.
2. Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F or below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer (U.S. Energy Information Administration 2023d). For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.
3. Continuing with our results from Exercise 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures.
4. Early voting expanded in 2020 (Sprunt 2020). Build a logistic model predicting early voting in 2020 (EarlyVote2020) using age (Age), education (Education), and party identification (PartyID). Include two-way interactions.
5. Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican.
References
Bollen, Kenneth A., Paul P. Biemer, Alan F. Karr, Stephen Tueller, and Marcus E. Berzofsky. 2016. “Are Survey Weights Needed? A Review of Diagnostic Tests in Regression Analysis.” Annual Review of Statistics and Its Application 3 (1): 375–92. https://doi.org/10.1146/annurev-statistics-011516-012958.
Gelman, Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (2): 153–64. https://doi.org/10.1214/088342306000000691.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables.
Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons.
McCullagh, Peter, and John Ashworth Nelder. 1989. “Binary Data.” In Generalized Linear Models, 98–148. Springer.
R Core Team. 2023.
R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles.
Sprunt, Barbara. 2020. “93 Million and Counting: Americans Are Shattering Early Voting Records.” National Public Radio.
U.S. Energy Information Administration. 2023d. “Units and Calculators Explained: Degree Days.” https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php.
Use help(formula) or ?formula in R.
There is some debate about whether weights should be used in regression (Bollen et al. 2016; Gelman 2007). However, for the purposes of providing complete information on how to analyze complex survey data, this chapter will include weights.
Question: How often can you trust the federal government in Washington to do what is right?
"],["c08-communicating-results.html", "Chapter 8 Communication of results 8.1 Introduction 8.2 Describing results through text 8.3 Visualizing data", " Chapter 8 Communication of results
Prerequisites
For this chapter, load the following packages:
library(tidyverse)
library(survey)
library(srvyr)
library(srvyrexploR)
library(gt)
library(gtsummary)
We will be using data from ANES as described in Chapter 4. As a reminder, here is the code to create the design object to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information).
targetpop &lt;- 231592693
anes_adjwgt &lt;- anes_2020 %&gt;%
  mutate(Weight = Weight / sum(Weight) * targetpop)
anes_des &lt;- anes_adjwgt %&gt;%
  as_survey_design(
    weights = Weight,
    strata = Stratum,
    ids = VarUnit,
    nest = TRUE
  )
8.1 Introduction
After finishing the analysis and modeling, we proceed to the important task of communicating the survey results.
Our audience may range from seasoned researchers familiar with our survey data to newcomers encountering the information for the first time. We should aim to explain the methodology and analysis while presenting findings in an accessible way, and it is our responsibility to report information with care.
Before beginning any dissemination of results, consider questions such as:
How will we present results? Examples include a website, print, or other media. Based on the media type, we might limit or enhance the use of graphical representation.
What is the audience’s familiarity with the study and/or data? Audiences can range from the general public to data experts. If we anticipate limited knowledge about the study, we should provide detailed descriptions (we discuss recommendations later in the chapter).
What are we trying to communicate? It could be summary statistics, trends, patterns, or other insights. Tables might suit summary statistics, while plots are better at conveying trends and patterns.
Is the audience accustomed to interpreting plots? If not, include explanatory text to guide them on how to interpret the plots effectively.
What is the audience’s statistical knowledge? If the audience does not have a strong statistics background, provide text on standard errors, confidence intervals, and other estimate types to enhance understanding.
8.2 Describing results through text
As analysts, our emphasis is often on the data, and communicating results can sometimes be overlooked. First, we need to identify the appropriate information to share with our audience. Chapters 2 and 3 provide insights into factors we need to consider during analysis, and they remain relevant when presenting results to others.
8.2.1 Methodology
If we are using existing data, methodologically-sound surveys will provide documentation about how the survey was fielded, the questionnaires, and other necessary information for analyses.
For example, the survey’s methodology reports should include the population of interest, sampling procedures, response rates, questionnaire documentation, weighting, and a general overview of disclosure statements. Many American organizations follow the American Association for Public Opinion Research’s (AAPOR) Transparency Initiative. The AAPOR Transparency Initiative requires organizations to include specific details in their methodology, making it clear how we can and should analyze the results. Being transparent about these methods is vital for the scientific rigor of the field.
The details provided in Chapter 2 about the survey process should be shared with the audience when presenting the results. When using publicly-available data, like the examples in this book, we can often link to the methodology report in our final output. We should also provide high-level information for the audience to quickly grasp the context around the findings. For example, we can mention when and where the study was conducted, the population’s age range, or other contextual details. This information helps the audience understand how generalizable the results are.
Providing this material is especially important when there’s no methodology report available for the analyzed data. For example, if a researcher conducted a new survey for a specific purpose, we should document and present all the pertinent information during the analysis and reporting process. Adhering to the AAPOR Transparency Initiative guidelines is a reliable method to guarantee that all essential information is communicated to the audience.
8.2.2 Analysis
Along with the survey methodology and weight calculations, we should also share our approach to preparing, cleaning, and analyzing the data. For example, in Chapter 6, we compared education distributions from the ANES survey to the American Community Survey (ACS).
To make the comparison, we had to collapse education categories provided in the ANES data to match the ACS. The process for this particular example may seem straightforward (like combining Bachelor’s and Graduate Degrees into a single category), but there are multiple ways to deal with the data. Our choice is just one of many. We should document both the original ANES question and response options and the steps we took to match them with ACS data. This transparency helps clarify our analysis to our audience.
Missing data is another instance where we want to be unambiguous and upfront with our audience. In this book, numerous examples and exercises remove missing data, as this is often the easiest way to handle them. However, there are circumstances where missing data holds substantive importance, and excluding them could introduce bias (see Chapter 11). Being transparent about our handling of missing data is important to maintaining the integrity of our analysis and ensuring a comprehensive understanding of the results.
8.2.3 Results
While tables and graphs are commonly used to communicate results, there are instances where text can be more effective in sharing information. Narrative details, such as context around point estimates or model coefficients, can go a long way in improving our communication. We have several strategies to effectively convey the significance of the data to the audience through text.
First, we can highlight important data points in a sentence using plain language. For example, if we were looking at election polling data conducted before an election, we could say something like:
As of [DATE], an estimated XX% of registered U.S. voters say they will vote for [CANDIDATE NAME] for president in the [YEAR] general election.
This sentence provides key pieces of information in a straightforward way:
[DATE]: Given that polling data is time-specific, providing the date of reference lets the audience know when this data was valid.
Registered U.S.
voters: This tells the audience who we surveyed, letting them know the target population.
XX%: This part provides the estimated percentage of people voting for a specific candidate for a specific office.
[YEAR] general election: As with the bullet above, adding this gives more context about the election type and year. The estimate would take on a different meaning if we changed it to a primary election instead of a general election.
We also included the word “estimated.” When presenting aggregate survey results, we have errors around each estimate. We want to convey this uncertainty rather than talk in absolutes. Words like “estimated,” “on average,” or “around” can help communicate this uncertainty to the audience. Instead of saying ‘XX%,’ we can also say ‘XX% (+/- Y%)’ to show the margin of error. Confidence intervals can also be incorporated into the text to assist readers.
Second, providing context and discussing the meaning behind a point estimate can help the audience glean some insight into why the data is important. For example, when comparing two values, it can be helpful to highlight if there are statistically significant differences and explain the impact and relevance of this information. This is where we, as analysts, should do our best to be mindful of biases and present the facts logically.
Keep in mind that how we discuss these findings can greatly influence how the audience interprets them. If we include speculation, using phrases like “the authors speculate” or “these findings may indicate” relays the uncertainty around the notion while still offering a plausible explanation. Additionally, we can present alternative viewpoints or competing discussion points to explain the uncertainty in the results.
8.3 Visualizing data
Although discussing key findings in the text is important, presenting large amounts of data is often more digestible for the audience in tables or visualizations.
Effectively combining text, tables, and graphs can be powerful in communicating results. This section provides examples of using the {gt}, {gtsummary}, and {ggplot2} packages to enhance the dissemination of results (Iannone et al. 2023; Sjoberg et al. 2021; Wickham 2016).

8.3.1 Tables

Tables are a great way to provide a large amount of data when individual data points need to be examined. However, it is important to present tables in a reader-friendly format. Numbers should align, rows and columns should be easy to follow, and the table size should not compromise readability. Using key visualization techniques, we can create tables that are informative and nice to look at. Many packages create easy-to-read tables (e.g., {kable} + {kableExtra}, {gt}, {gtsummary}, {DT}, {formattable}, {flextable}, {reactable}). While we will focus on {gt} here, we encourage learning about others as they may have additional helpful features. We appreciate the flexibility, ability to use pipes (e.g., %&gt;%), and numerous extensions of the {gt} package.

Please note, at this time, {gtsummary} needs additional features to be widely used for survey analysis, particularly because it cannot work with replicate designs. We provide one example using {gtsummary} and hope it evolves into a more comprehensive tool over time.

8.3.1.1 Transitioning {srvyr} output to a {gt} table

Let’s start by using some of the data we calculated earlier in this book. In Chapter 6, we looked at data on trust in government with the proportions calculated below:

trust_gov &lt;- anes_des %&gt;%
  drop_na(TrustGovernment) %&gt;%
  group_by(TrustGovernment) %&gt;%
  summarize(trust_gov_p = survey_prop())

trust_gov

## # A tibble: 5 × 3
##   TrustGovernment     trust_gov_p trust_gov_p_se
##   &lt;fct&gt;                     &lt;dbl&gt;          &lt;dbl&gt;
## 1 Always                   0.0155        0.00204
## 2 Most of the time         0.132         0.00553
## 3 About half the time      0.309         0.00829
## 4 Some of the time         0.434         0.00855
## 5 Never                    0.110         0.00566

The default output generated by R may work for initial viewing inside our IDE or when creating basic output in an R Markdown or Quarto document. However, when presenting these results in other publications, such as the print version of this book or with other formal dissemination modes, modifying the display can improve our reader’s experience.

Looking at the output from trust_gov, a couple of improvements are obvious: (1) switching to percentages instead of proportions and (2) replacing the variable names used as column headers with informative labels. The {gt} package is a good tool for implementing better labeling and creating publishable tables. Let’s walk through some code as we make a few changes to improve the table’s usefulness.

First, we initiate the table with the gt() function. Next, we use the rowname_col argument to designate the TrustGovernment column as the labels for each row (called the table “stub”). We apply the cols_label() function to create informative column labels instead of variable names, and then the tab_spanner() function to add a label across multiple columns. In this case, we label all columns except the stub with “Trust in Government, 2020”. We then format the proportions into percentages with the fmt_percent() function and reduce the number of decimals shown with decimals = 1. Finally, the tab_caption() function adds a table title for the HTML version of the book.
We can use the caption for cross-referencing in R Markdown, Quarto, and bookdown, as well as adding it to the list of tables in the book.

trust_gov_gt &lt;- trust_gov %&gt;%
  gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;%
  cols_label(trust_gov_p = &quot;%&quot;,
             trust_gov_p_se = &quot;s.e. (%)&quot;) %&gt;%
  tab_spanner(label = &quot;Trust in Government, 2020&quot;,
              columns = c(trust_gov_p, trust_gov_p_se)) %&gt;%
  fmt_percent(decimals = 1)

trust_gov_gt %&gt;%
  tab_caption(&quot;Example of gt table with trust in government estimate&quot;)

TABLE 8.1: Example of gt table with trust in government estimate

                        Trust in Government, 2020
                        %         s.e. (%)
Always                  1.6%      0.2%
Most of the time        13.2%     0.6%
About half the time     30.9%     0.8%
Some of the time        43.4%     0.9%
Never                   11.0%     0.6%

We can add a few more enhancements, such as a title, a data source note, and a footnote with the question information, using the functions tab_header(), tab_source_note(), and tab_footnote(). If having the percentage sign in both the header and the cells seems redundant, we can opt for fmt_number() instead of fmt_percent() and scale the number by 100 with scale_by = 100.
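The two formatting options can be compared side by side in a minimal, self-contained sketch. It assumes the {tibble} and {gt} packages are installed and R 4.1 or later (for the native pipe). The estimates are copied by hand from the printed trust_gov output, so the survey design object is not needed; the names trust_gov_demo, base_tbl, pct_with_sign, and pct_without_sign are illustrative and not part of the original analysis.

```r
library(tibble)
library(gt)

# Estimates copied from the printed trust_gov output; this plain tibble is
# only a stand-in for the survey-based object used in the chapter
trust_gov_demo <- tribble(
  ~TrustGovernment,      ~trust_gov_p, ~trust_gov_p_se,
  "Always",              0.0155,       0.00204,
  "Most of the time",    0.132,        0.00553,
  "About half the time", 0.309,        0.00829,
  "Some of the time",    0.434,        0.00855,
  "Never",               0.110,        0.00566
)

# Shared starting point: stub column plus short column labels
base_tbl <- trust_gov_demo |>
  gt(rowname_col = "TrustGovernment") |>
  cols_label(trust_gov_p = "%", trust_gov_p_se = "s.e. (%)")

# Option 1: fmt_percent() multiplies by 100 and appends a percent sign
pct_with_sign <- base_tbl |>
  fmt_percent(decimals = 1)

# Option 2: fmt_number() with scale_by = 100 shows the same values
# without a percent sign in each cell
pct_without_sign <- base_tbl |>
  fmt_number(scale_by = 100, decimals = 1)
```

Printing pct_with_sign and pct_without_sign in turn shows the difference; the second is less cluttered when the column label already carries the unit.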
trust_gov_gt2 &lt;- trust_gov_gt %&gt;%
  tab_header(&quot;American voter&#39;s trust in the federal government, 2020&quot;) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  ) %&gt;%
  fmt_number(scale_by = 100, decimals = 1)

trust_gov_gt2

TABLE 8.2: Example of gt table with trust in government estimates and additional context

American voter's trust in the federal government, 2020

                        Trust in Government, 2020
                        %         s.e. (%)
Always                  1.6       0.2
Most of the time        13.2      0.6
About half the time     30.9      0.8
Some of the time        43.4      0.9
Never                   11.0      0.6

American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

Expanding tables using {gtsummary}

The {gtsummary} package simultaneously summarizes data and creates publication-ready tables. Initially designed for clinical trial data, it has been extended to include survey analysis in certain capacities. At this time, it is only compatible with survey objects using Taylor series linearization and not replicate methods.
While it offers a restricted set of summary statistics, the following are available for categorical variables:

{n} frequency
{N} denominator, or cohort size
{p} percentage
{p.std.error} standard error of the sample proportion
{deff} design effect of the sample proportion
{n_unweighted} unweighted frequency
{N_unweighted} unweighted denominator
{p_unweighted} unweighted formatted percentage

The following summary statistics are available for continuous variables:

{median} median
{mean} mean
{mean.std.error} standard error of the sample mean
{deff} design effect of the sample mean
{sd} standard deviation
{var} variance
{min} minimum
{max} maximum
{p##} any integer percentile, where ## is an integer from 0 to 100
{sum} sum

In the following example, we will build a table using {gtsummary}, similar to the table in the {gt} example. The main function we use is tbl_svysummary(). In this function, we include the variables we want to analyze in the include argument and define the statistics we want to display in the statistic argument. To specify the statistics, we apply the syntax from the {glue} package, where we enclose the variables we want to insert within curly brackets. We must specify the desired statistics using the names listed above. For example, to specify that we want the proportion followed by the standard error of the proportion in parentheses, we use {p} ({p.std.error}).
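The curly-bracket syntax can be tried on its own with the {glue} package. This is a minimal sketch, assuming {glue} is installed; the two values are hypothetical stand-ins for the statistics that tbl_svysummary() would compute and format internally.

```r
library(glue)

# Hypothetical, already-formatted values standing in for the statistics
# that tbl_svysummary() computes for one row of the table
p <- "13"
p.std.error <- "0.01"

# Names inside curly brackets are replaced by their values; everything
# else (the space and the parentheses) is kept as literal text
cell <- glue("{p} ({p.std.error})")
cell
# 13 (0.01)
```

The same template string, passed to the statistic argument, tells {gtsummary} how to lay out each cell.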
anes_des_gtsum &lt;- anes_des %&gt;%
  tbl_svysummary(include = TrustGovernment,
                 statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;))

anes_des_gtsum

TABLE 8.3: Example of gtsummary table with trust in government estimates

Characteristic                                                                N = 231,034,125¹
PRE: How often trust government in Washington to do what is right [revised]
    Always                                                                    1.6 (0.00)
    Most of the time                                                          13 (0.01)
    About half the time                                                       31 (0.01)
    Some of the time                                                          43 (0.01)
    Never                                                                     11 (0.01)
    Unknown                                                                   673,773

¹ % (SE(%))

The default table includes the weighted number of missing (or Unknown) records. The standard error is reported as a proportion, while the proportion is styled as a percentage. In the next step, we remove the Unknown category by setting the missing argument to “no” and format the standard error as a percentage using the digits argument. To improve the table for publication, we provide a more polished label for the “TrustGovernment” variable using the label argument.
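To get a feel for what passing style_percent to the digits argument does, the function can be called directly on numeric vectors. This is a minimal sketch, assuming {gtsummary} is installed; the proportions and standard errors are copied by hand from the trust_gov output shown earlier, and the object names are illustrative.

```r
library(gtsummary)

# Proportions and standard errors copied from the printed trust_gov output
props <- c(0.0155, 0.132, 0.309, 0.434, 0.110)
ses <- c(0.00204, 0.00553, 0.00829, 0.00855, 0.00566)

# style_percent() rescales proportions to the percentage scale and
# returns them as formatted character strings
styled_props <- style_percent(props)
styled_ses <- style_percent(ses)

styled_props
styled_ses
```

Applying the same styling function to both statistics puts the estimate and its standard error on the same scale in the table.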
anes_des_gtsum2 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  )

anes_des_gtsum2

TABLE 8.4: Example of gtsummary table with trust in government estimates with labeling and digits options

Characteristic            N = 231,034,125¹
Trust in Government, 2020
    Always                1.6 (0.2)
    Most of the time      13 (0.6)
    About half the time   31 (0.8)
    Some of the time      43 (0.9)
    Never                 11 (0.6)
¹ % (SE(%))

To exclude the term “Characteristic” and the estimated population size, we can modify the header using the modify_header() function to update the label. Further adjustments can be made based on personal preferences, organizational guidelines, or other style guides. If we prefer having the standard error in the header, similar to the {gt} table, instead of in the footnote (the {gtsummary} default), we can make these changes by specifying stat_0 in the modify_header() function. Additionally, using modify_footnote() with update = everything() ~ NA removes the standard error from the footnote. After transforming the object into a gt table using as_gt(), we can add footnotes and a title using the same methods explained in Section 8.3.1.1.
anes_des_gtsum3 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  ) %&gt;%
  modify_footnote(update = everything() ~ NA) %&gt;%
  modify_header(label = &quot; &quot;, stat_0 = &quot;% (s.e.)&quot;) %&gt;%
  as_gt() %&gt;%
  tab_header(&quot;American voter&#39;s trust in the federal government, 2020&quot;) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  )

anes_des_gtsum3

TABLE 8.5: Example of gtsummary table with trust in government estimates with more labeling options and context

American voter's trust in the federal government, 2020

                          % (s.e.)
Trust in Government, 2020
    Always                1.6 (0.2)
    Most of the time      13 (0.6)
    About half the time   31 (0.8)
    Some of the time      43 (0.9)
    Never                 11 (0.6)
American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

We can also include continuous variables in the table. Below, we add a summary of the age variable by updating the include, statistic, and digits arguments.
anes_des_gtsum4 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = c(TrustGovernment, Age),
    statistic = list(
      all_categorical() ~ &quot;{p} ({p.std.error})&quot;,
      all_continuous() ~ &quot;{mean} ({mean.std.error})&quot;
    ),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent, Age ~ c(1, 2)),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  ) %&gt;%
  modify_footnote(update = everything() ~ NA) %&gt;%
  modify_header(label = &quot; &quot;, stat_0 = &quot;% (s.e.)&quot;) %&gt;%
  as_gt() %&gt;%
  tab_header(&quot;American voter&#39;s trust in the federal government, 2020&quot;) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  ) %&gt;%
  tab_caption(&quot;Example of gtsummary table with trust in government estimates and average age&quot;)

anes_des_gtsum4

TABLE 8.6: Example of gtsummary table with trust in government estimates and average age

American voter's trust in the federal government, 2020

                               % (s.e.)
Trust in Government, 2020
    Always                     1.6 (0.2)
    Most of the time           13 (0.6)
    About half the time        31 (0.8)
    Some of the time           43 (0.9)
    Never                      11 (0.6)
PRE: SUMMARY: Respondent age   47.3 (0.36)
American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

With {gtsummary}, we can also calculate statistics by different groups.
Let’s modify the previous example to analyze data on whether a respondent voted for president in 2020. We update the by argument and refine the header. anes_des_gtsum5 &lt;- anes_des %&gt;% drop_na(VotedPres2020) %&gt;% tbl_svysummary( include = TrustGovernment, statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;), missing = &quot;no&quot;, digits = list(TrustGovernment ~ style_percent), label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;), by = VotedPres2020 ) %&gt;% modify_footnote(update = everything() ~ NA) %&gt;% modify_header(label = &quot; &quot;, stat_1 = &quot;Voted&quot;, stat_2 = &quot;Didn&#39;t vote&quot;) %&gt;% modify_spanning_header(all_stat_cols() ~ &quot;% (s.e.)&quot;) %&gt;% as_gt() %&gt;% tab_header( &quot;American voter&#39;s trust in the federal government by whether they voted in the 2020 presidential election&quot; ) %&gt;% tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;% tab_footnote( &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot; ) anes_des_gtsum5 TABLE 8.7: Example of gtsummary table with trust in government estimates by voting status American voter's trust in the federal government by whether they voted in the 2020 presidential election % (s.e.) 
Voted Didn’t vote Trust in Government, 2020     Always 1.1 (0.2) 0.9 (0.9)     Most of the time 13 (0.6) 19 (5.3)     About half the time 32 (0.8) 30 (8.6)     Some of the time 45 (0.8) 45 (8.2)     Never 9.1 (0.7) 5.2 (2.2) American National Election Studies, 2020 Question text: How often can you trust the federal government in Washington to do what is right? 8.3.2 Charts and plots Survey analysis can yield an abundance of printed summary statistics and models. Even with the most careful analysis, interpreting the results can be overwhelming. This is where charts and plots play a key role in our work. By transforming complex data into a visual representation, we can recognize patterns, relationships, and trends with greater ease. R has numerous packages for creating compelling and insightful charts. In this section, we will focus on {ggplot2}, a member of the {tidyverse} collection of packages. Known for its power and flexibility, {ggplot2} is an invaluable tool for creating a wide range of data visualizations (Wickham 2016). The {ggplot2} package follows the “grammar of graphics,” a framework that incrementally adds layers of chart components. This approach allows us to customize visual elements such as scales, colors, labels, and annotations to enhance the clarity of our results. After creating the survey design object, we can modify it to include additional outcomes and calculate estimates for our desired data points. Below, we create a binary variable TrustGovernmentUsually, which is TRUE when TrustGovernment is “Always” or “Most of the time” and FALSE otherwise. Then, we calculate the percentage of people who usually trust the government based on their vote in the 2020 presidential election (VotedPres2020_selection). We remove the cases where people did not vote or did not indicate their choice. 
anes_des_der &lt;- anes_des %&gt;% mutate(TrustGovernmentUsually = case_when( is.na(TrustGovernment) ~ NA, TRUE ~ TrustGovernment %in% c(&quot;Always&quot;, &quot;Most of the time&quot;) )) %&gt;% drop_na(VotedPres2020_selection) %&gt;% group_by(VotedPres2020_selection) %&gt;% summarize( pct_trust = survey_mean( TrustGovernmentUsually, na.rm = TRUE, proportion = TRUE, vartype = &quot;ci&quot; ), .groups = &quot;drop&quot; ) anes_des_der ## # A tibble: 3 × 4 ## VotedPres2020_selection pct_trust pct_trust_low pct_trust_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Biden 0.123 0.109 0.140 ## 2 Trump 0.178 0.161 0.198 ## 3 Other 0.0681 0.0290 0.152 Now, we can begin creating our chart with {ggplot2}. First, we set up our plot with ggplot(). Next, we define the data points to be displayed using aesthetics, or aes. Aesthetics represent the visual properties of the objects in the plot. In the example below, we map the x variable to VotedPres2020_selection from the dataset and the y variable to pct_trust. Finally, we specify the type of plot with geom_*(), in this case, geom_bar(). The resulting plot is displayed in Figure 8.1. p &lt;- anes_des_der %&gt;% ggplot(aes(x = VotedPres2020_selection, y = pct_trust)) + geom_bar(stat = &quot;identity&quot;) p FIGURE 8.1: Bar chart of trust in government by chosen 2020 presidential candidate This is a great starting point: we observe that a higher percentage of those who voted for Trump state they usually trust the government compared to those who voted for Biden or other candidates. Now, what if we want to introduce color to better differentiate the three groups? We can add fill under aesthetics, indicating that we want to use distinct values of VotedPres2020_selection to color the bars. In this instance, Biden, Trump, and Other will each be displayed in a different color. 
pcolor &lt;- anes_des_der %&gt;% ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) + geom_bar(stat = &quot;identity&quot;) pcolor FIGURE 8.2: Bar chart of trust in government by chosen 2020 presidential candidate with colors Let’s say we wanted to follow proper statistical analysis practice and incorporate variability in our plot. We can add another geom, geom_errorbar(), to display the confidence intervals on top of our existing geom_bar() layer. We can add the layer using a plus sign +. pcol_error &lt;- anes_des_der %&gt;% ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) + geom_bar(stat = &quot;identity&quot;) + geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp), width = .2) pcol_error FIGURE 8.3: Bar chart of trust in government by chosen 2020 presidential candidate with colors and error bars We can continue adding to our plot until we achieve our desired look. For example, we can use guides(fill = \"none\") to eliminate the color legend, as it doesn’t contribute meaningful information. We can set the fill colors using scale_fill_manual(). Inside the function, we provide a vector of values corresponding to the colors in our plot. These values are hexadecimal (hex) color codes, denoted by a leading pound sign # followed by six letters or numbers. The hex code #0b3954 used below is a dark blue. There are many tools online that help pick hex codes, such as htmlcolorcodes.com/. 
pfull &lt;- anes_des_der %&gt;% ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) + geom_bar(stat = &quot;identity&quot;) + geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp), width = .2) + scale_fill_manual(values = c(&quot;#0b3954&quot;, &quot;#bfd7ea&quot;, &quot;#8d6b94&quot;)) + xlab(&quot;Election choice (2020)&quot;) + ylab(&quot;Usually trust the government&quot;) + scale_y_continuous(labels = scales::percent) + guides(fill = &quot;none&quot;) + labs(title = &quot;Percent of voters who usually trust the government by chosen 2020 presidential candidate&quot;, caption = &quot;Source: American National Election Studies, 2020&quot;) pfull FIGURE 8.4: Bar chart of trust in government by chosen 2020 presidential candidate with colors, labels, error bars, and title What we’ve explored in this section are just the foundational aspects of {ggplot2}, and the capabilities of this package extend far beyond what we’ve covered. Advanced features such as annotation, faceting, and theming allow for more sophisticated and customized visualizations. The book Wickham (2016) is a comprehensive guide to learning more about this powerful tool. References Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. “Reproducible Summary Tables with the Gtsummary Package.” The R Journal 13: 570–80. https://doi.org/10.32614/RJ-2021-053. Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org. 
"],["c09-reprex-data.html", "Chapter 9 Reproducible research 9.1 Introduction 9.2 Project-based workflows 9.3 Functions and packages 9.4 Version control with Git 9.5 Package management with {renv} 9.6 R environments with Docker 9.7 Workflow management with {targets} 9.8 Documentation with Quarto and R Markdown 9.9 Other tips for reproducibility 9.10 Summary", " Chapter 9 Reproducible research 9.1 Introduction Reproducing a data analysis’s results is a crucial aspect of any research. First, reproducibility serves as a form of quality assurance. If we pass an analysis project to another person, they should be able to run the entire project from start to finish and obtain the same results. They can critically assess the methodology and code while detecting potential errors. Another goal of reproducibility is enabling the verification of our analysis. When someone else is able to check our results, it ensures the integrity of the analyses by determining that the conclusions are not dependent on a particular person running the code or workflow on a particular day or in a particular environment. Not only is reproducibility a key component in ethical and accurate research, but it is also a requirement for many scientific journals. For example, the Journal of Survey Statistics and Methodology (JSSAM) and Public Opinion Quarterly (POQ) require authors to make code, data, and methodology transparent and accessible to other researchers who wish to verify or build on existing work. Reproducible research requires that the key components of analysis are available, discoverable, documented, and shared with others. 
The four main components that we should consider are: Code: source code used for data cleaning, analysis, modeling, and reporting Data: raw data used in the workflow, or if data is sensitive or proprietary, as much data as possible that would allow others to run our workflow (e.g., access to a restricted use file (RUF)) Environment: environment of the project, including the R version, packages, operating system, and other dependencies used in the analysis Methodology: analysis methodology, including rationale behind decisions, interpretations, and assumptions In Chapter 8, we briefly mention how each of these is important to include in the methodology report and when communicating the findings of a study. However, to be transparent and effective researchers, we need to ensure we not only discuss these through text but also provide files and additional information when requested. Often, when starting a project, analysts will dive into the data and make decisions as they go without full documentation, which can be challenging if we need to go back and make changes or understand even what we did a few months ago. It benefits other analysts and potentially our future selves to better document everything from the start. The good news is that many tools, practices, and project management techniques make survey analysis projects easy to reproduce. For best results, analysts should decide which techniques and tools will be used before starting a project (or very early on). This chapter covers some of our suggestions for tools and techniques we can use in projects. This list is not comprehensive but aims to provide a starting point for those looking to create a reproducible workflow. 9.2 Project-based workflows We recommend a project-based workflow for analysis projects as described by Wickham, Çetinkaya-Rundel, and Grolemund (2023). A project-based workflow maintains a “source of truth” for our analyses. 
It helps with file system discipline by putting everything related to a project in a designated folder. Since all associated files are in a single location, they are easy to find and organize. When we reopen the project, we can recreate the environment in which we originally ran the code to reproduce our results. The RStudio IDE has built-in support for projects. When we create a project in RStudio, it creates a .Rproj file that stores settings specific to that project. Once we have created a project, we can create folders that help us organize our workflow. For example, a project directory could look like this: | anes_analysis/ | anes_analysis.Rproj | README.md | codebooks | codebook2020.pdf | codebook2016.pdf | rawdata | anes2020_raw.csv | anes2016_raw.csv | scripts | data-prep.R | data | anes2020_clean.csv | anes2016_clean.csv | report | anes_report.Rmd | anes_report.html | anes_report.pdf In a project-based workflow, all paths are relative and, by default, relative to the project’s folder. By using relative paths, others can open and run our files even if their directory configuration differs from ours. The {here} package enables easy file referencing, and we can start by using the here::here() function to build the path for loading or saving data (Müller 2020). Below, we ask R to read the CSV file anes2020_clean.csv in the project directory’s data folder: anes &lt;- read_csv(here::here(&quot;data&quot;, &quot;anes2020_clean.csv&quot;)) The combination of projects and the {here} package keeps all associated files organized. This workflow makes it more likely that our analyses can be reproduced by us or our colleagues. 9.3 Functions and packages We may find ourselves repeating ourselves in our script, and the chance of errors increases whenever we copy and paste our code. By creating a function, we can create a consistent set of commands that reduces the likelihood of mistakes. 
Functions also organize our code, improve the code readability, and allow others to execute the same commands. Throughout this book, we have created functions, such as in Chapter 13, to run sequences of rename, filter, group_by, and summarize statements across different variables. The function helps us avoid overlooking necessary steps. A package is made up of a collection of functions. If we find ourselves sharing functions with others to replicate the same series of commands in a separate project, creating a package can be a useful tool for sharing the code along with data and documentation. 9.4 Version control with Git Often, a survey analysis project produces a lot of code. Keeping track of the latest version can become challenging as files evolve throughout a project. If a team of analysts is working on the same script, someone may use an outdated version, resulting in incorrect results or redundant work. Version control systems like Git can help alleviate these pains. Git is a system that helps track changes in computer files. Analysts can use Git to follow code evolution and manage asynchronous work. With Git, it is easy to see any changes made in a script, revert changes, and resolve differences between code versions (called conflicts). Services such as GitHub or GitLab provide hosting and sharing of files as well as version control with Git. For example, we can visit the GitHub repository for this book (https://github.com/tidy-survey-r/tidy-survey-book) and see the files that build the book, when they were committed to the repository, and the history of modifications over time. In addition to code scripts, platforms like GitHub can store data and documentation. They provide a way to maintain a history of data modifications through versioning and timestamps. By saving the data and documentation alongside the code, it becomes easier for others to refer to and access everything they need in one place. 
Using version control in analysis projects makes collaboration and maintenance more manageable. For connecting Git with R, we recommend Bryan (2023). 9.5 Package management with {renv} Ensuring reproducibility involves not only using version control of code, but also managing the versions of packages. If two people run the same code but use different versions of a package, the results might differ because of changes in those packages. For example, this book currently uses a version of the {srvyr} package from GitHub and not from CRAN. This is because the version of {srvyr} on CRAN has some bugs (errors) that result in incorrect calculations. The version on GitHub has corrected these errors, so we have asked readers to install the GitHub version to obtain the same results. One way to handle different package versions is with the {renv} package. This package allows researchers to set the versions for each package used and manage package dependencies. Specifically, {renv} creates isolated, project-specific environments that record the packages and their versions used in the code. When initiated by a new user, {renv} checks whether the installed packages are consistent with the recorded version for the project. If not, it installs the appropriate versions so that others can replicate the project’s environment to rerun the code and obtain consistent results (Ushey and Wickham 2023). 9.6 R environments with Docker Just as different versions of packages can introduce discrepancies or compatibility issues, the version of R can also prevent reproducibility. Tools such as Docker can help with this potential issue by creating isolated environments that define the version of R being used, along with other dependencies and configurations. The entire environment is bundled in a container. The container, defined by a Dockerfile, can be shared so anybody, regardless of their local setup, can run the R code in the same environment. 
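Short of a full container, a lightweight complement is to record the R version and the attached packages alongside the analysis outputs, so that others can at least see which environment produced the results. The sketch below uses only base R; the file name session-info.txt is a hypothetical choice, and the file is written to a temporary directory purely for illustration.

```r
# Minimal sketch: record the computing environment next to the results.
# Assumption: 'session-info.txt' is a hypothetical file name; in a real
# project it would live in the project folder rather than tempdir().
env_record <- c(
  paste('R version:', R.version.string),
  paste('Platform:', R.version$platform),
  paste('Recorded on:', format(Sys.Date()))
)

# sessionInfo() lists the R version, operating system, and attached packages
session_lines <- capture.output(print(sessionInfo()))

out_file <- file.path(tempdir(), 'session-info.txt')
writeLines(c(env_record, '', session_lines), out_file)
```

A record like this does not recreate the environment the way {renv} or Docker can, but it documents what would need to be recreated.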
9.7 Workflow management with {targets} With complex studies involving multiple code files and dependencies, it is important to ensure each step is executed in the intended sequence. We can do this manually, e.g., numbering files to indicate the order or providing detailed documentation on the order. Alternatively, we can automate the process so the code flows sequentially. Making sure that the code runs in the correct order helps ensure that the research is reproducible. Anyone should be able to pick up the set of scripts and get the same results by following the workflow. The {targets} package is an increasingly popular workflow manager that documents, automates, and executes complex data workflows with multiple steps and dependencies. With this package, we first define the order of execution for our code, and then it will consistently execute the code in that order each time it is run. One beneficial feature of {targets} is that if we change code later in the workflow, only the affected code and its downstream targets (i.e., the subsequent code files) are re-executed. The {targets} package also provides interactive progress monitoring and reporting, allowing us to track the status and progress of our analysis pipeline (Landau 2021). 9.8 Documentation with Quarto and R Markdown Tools like Quarto and R Markdown aid in reproducibility by creating documents that weave together code, text, and results. We can present analysis results alongside the report’s narrative, so there’s no need to copy and paste code output into the final documentation. By eliminating manual steps, we can reduce the chances of errors in the final output. Quarto and R Markdown documents also allow users to re-execute the underlying code when needed. Another analyst can see the steps we took, follow the scripts, and recreate the report. 
We can include details about our work in one place thanks to the combination of text and code, making our work transparent and easier to verify (R-quarto?; Xie, Dervieux, and Riederer 2020). 9.8.1 Parameterization Another useful feature of Quarto and R Markdown is the ability to reduce repetitive code by parameterizing the files. Parameters can control various aspects of the analysis, such as dates, geography, or other analysis variables. We can define and modify these parameters to explore different scenarios or inputs. For example, suppose we start by creating a document that provides survey analysis results for North Carolina but then later decide we want to look at another state. In that case, we can define a state parameter and rerun the same analysis for a state like Washington without having to edit the code throughout the document. Parameters can be defined in the header or code chunks of our Quarto or R Markdown documents and easily be modified and documented. We reduce errors that may occur by manually editing code throughout the script, and offer a flexible way for others to replicate the analysis and explore variations. 9.9 Other tips for reproducibility 9.9.1 Random number seeds Some tasks in survey analysis require randomness, such as imputation, model training, or creating random samples. By default, the random numbers generated by R change each time we rerun the code, making it difficult to reproduce the same results. By “setting the seed,” we can control the randomness and ensure that the random numbers remain consistent whenever we rerun the code. Others can use the same seed value to reproduce our random numbers and achieve the same results. In R, we can use the set.seed() function to control the randomness in our code. Set a seed value by providing an integer to the function: set.seed(999) runif(5) The runif() function generates five random numbers from a uniform distribution. 
Since the seed is set to 999, rerunning these two lines together will always produce the same sequence: [1] 0.38907138 0.58306072 0.09466569 0.85263123 0.78674676 The choice of the seed number is up to the analyst. For example, this could be the date (20240102) or time of day (1056) when the analysis was first conducted, a phone number (8675309), or the first few numbers that come to mind (369). As long as the seed is set for a given analysis, the actual number is up to the analyst to decide. It is important to note that set.seed() should be used before random number generation. It would be unethical to run an analysis over and over to choose a seed that produces the result you want. Call set.seed() once per program, and the seed will be applied to the entire script. We recommend setting the seed at the beginning of a script, where libraries are loaded. 9.9.2 Descriptive names and labels Using descriptive variable names or labeling data can also assist with reproducible research. For example, in the ANES data, the variable names in the raw data all start with V20 and are a string of numbers. To make things easier to reproduce, we opted to change the variable names to be more descriptive of what they contained (e.g., Age). This can also be done with the data values themselves. One way to accomplish this is by creating factors for categorical data, which can ensure that we know that a value of 1 really means Female, for example. There are other ways of handling this, such as attaching labels to the data instead of recoding variables to be descriptive (see Chapter 11). As with random number seeds, the exact method is up to the analyst, but providing this information can help ensure our research is reproducible. 9.10 Summary We can promote accuracy and verification of results by making our analysis reproducible. There are various tools and guides available to help you achieve reproducibility in your work, a few of which were described in this chapter. 
Here are additional resources to explore: R for Data Science chapter on project-based workflows: https://r4ds.hadley.nz/workflow-scripts.html#projects Building reproducible analytical pipelines with R by Bruno Rodrigues: https://raps-with-r.dev/ Posit Solutions Site page on reproducible environments: https://solutions.posit.co/envs-pkgs/environments/ References Bryan, Jenny. 2023. Happy Git and GitHub for the useR. https://happygitwithr.com/. Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959. Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. Ushey, Kevin, and Hadley Wickham. 2023. renv: Project Environments. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. O’Reilly Media. https://r4ds.hadley.nz/. Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook. "],["c10-sample-designs-replicate-weights.html", "Chapter 10 Sample designs and replicate weights 10.1 Introduction 10.2 Common sampling designs 10.3 Combining sampling methods 10.4 Replicate weights 10.5 Exercises", " Chapter 10 Sample designs and replicate weights Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) To help explain the different types of sample designs, this chapter will use the api and scd data that are included in the {survey} package (Lumley 2010): data(api) data(scd) This chapter also uses data from the Residential Energy Consumption Survey (RECS) - both 2015 and 2020, which are included in the {srvyrexploR} package as recs_2015 and recs_2020, respectively (Zimmer, Powell, and Velásquez 2024). 
10.1 Introduction The primary reason for using packages like {survey} and {srvyr} is to incorporate the sampling design or replicate weights into estimates (Freedman Ellis and Schneider 2023; Lumley 2010). By incorporating the sampling design or replicate weights, precision estimates (e.g., standard errors and confidence intervals) are appropriately calculated. In this chapter, we will introduce common sampling designs and common types of replicate weights, the mathematical methods for calculating estimates and standard errors for a given sampling design, and the R syntax to specify the sampling design or replicate weights. While we will show the math behind the estimates, the functions in these packages will do the calculation. To deeply understand the math and the derivation, refer to Penn State (2019), Särndal, Swensson, and Wretman (2003), Wolter (2007), or Fuller (2011) (these are listed in order of increasing statistical rigor). The general process for estimation in the {srvyr} package is to: Create a tbl_svy object (a survey object) using: as_survey_design() or as_survey_rep() Subset data (if needed) using filter() (subpopulations) Specify domains of analysis using group_by() Within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more This chapter includes details on the first step - creating the survey object. Once this survey object is created, it can be used in the other steps (detailed in chapters 5 through 7) to account for the complex survey design. 10.2 Common sampling designs A sampling design is the method used to draw a sample. Both logistical and statistical elements are considered when developing a sampling design. When specifying a sampling design in R, the levels of sampling are specified along with the weights. The weight for each record is constructed so that the particular record represents that many units in the population. 
For example, in a survey of 6th-grade students in the United States, the weight associated with each responding student reflects how many 6th-grade students across the country that record represents. Generally, the weights represent the inverse of the probability of selection such that the sum of the weights corresponds to the total population size, although some studies may have the sum of the weights equal to the number of respondent records. Some common terms across the designs are: sample size, generally denoted as \\(n\\), is the number of units selected to be sampled population size, generally denoted as \\(N\\), is the number of units in the target population sampling frame, the list of units from which the sample is drawn (see Chapter 2 for more information) 10.2.1 Simple random sample without replacement The simple random sample (SRS) without replacement is a sampling design where a fixed sample size is selected from a sampling frame, and every possible subsample has an equal probability of selection. Without replacement refers to the fact that once a sampling unit has been selected, it is removed from the sampling frame and cannot be selected again. Requirements: The sampling frame must include the entire population. Advantages: SRS requires no information about the units apart from contact information. Disadvantages: The sampling frame may not be available for the entire population. Example: Randomly select students in a university from a roster provided by the registrar’s office. The math The estimate for the population mean of variable \\(y\\) is: \\[\\bar{y}=\\frac{1}{n}\\sum_{i=1}^n y_i\\] where \\(\\bar{y}\\) represents the sample mean, \\(n\\) is the total number of respondents (or observations), and \\(y_i\\) is each individual value of \\(y\\). 
The estimate of the standard error of the mean is: \\[se(\\bar{y})=\\sqrt{\\frac{s^2}{n}\\left( 1-\\frac{n}{N} \\right)}\\] where \\[s^2=\\frac{1}{n-1}\\sum_{i=1}^n\\left(y_i-\\bar{y}\\right)^2.\\] and \\(N\\) is the population size. This standard error estimate might look very similar to equations in other applications except for the part on the right side of the equation: \\(1-\\frac{n}{N}\\). This is called the finite population correction (FPC) factor. If the size of the frame, \\(N\\), is very large in comparison to the sample, the FPC is negligible, so it is often ignored. A common guideline is if the sample is less than 10% of the population, the FPC is negligible. To estimate proportions, we define \\(x_i\\) as the indicator if the outcome is observed. That is, \\(x_i=1\\) if the outcome is observed, and \\(x_i=0\\) if the outcome is not observed for respondent \\(i\\). Then the estimated proportion from an SRS design is: \\[\\hat{p}=\\frac{1}{n}\\sum_{i=1}^n x_i \\] and the estimated standard error of the proportion is: \\[se(\\hat{p})=\\sqrt{\\frac{\\hat{p}(1-\\hat{p})}{n-1}\\left(1-\\frac{n}{N}\\right)} \\] The syntax If a sample was drawn through SRS and had no nonresponse or other weighting adjustments, in R, specify this design as: srs1_des &lt;- dat %&gt;% as_survey_design(fpc = fpcvar) where dat is a tibble or data.frame with the survey data, and fpcvar is a variable in the data indicating the sampling frame’s size (this variable will have the same value for all cases in an SRS design). If the frame is very large, sometimes the frame size is not provided. In that case, the FPC is not needed, and specify the design as: srs2_des &lt;- dat %&gt;% as_survey_design() If some post-survey adjustments were implemented and the weights are not all equal, specify the design as: srs3_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, fpc = fpcvar) where wtvar is a variable in the data indicating the weight for each case. 
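To see how much the FPC changes a standard error in practice, the SRS formula for the standard error of the mean can be evaluated by hand. The sketch below uses base R only. The helper name srs_se() and the simulated values of y are hypothetical and chosen purely for illustration; the sample size of 200 and frame size of 6,194 are the sizes of the API SRS example in this section.

```r
# Hypothetical helper: standard error of the mean under SRS without replacement.
# y is a numeric vector of sampled values; N is the frame (population) size.
srs_se <- function(y, N = Inf) {
  n <- length(y)
  fpc <- 1 - n / N          # finite population correction; equals 1 when N = Inf
  sqrt(var(y) / n * fpc)    # var() uses the n - 1 denominator, matching s^2
}

# Toy data, made up for illustration only
set.seed(1)
y <- rnorm(200, mean = 650, sd = 120)

se_no_fpc <- srs_se(y)            # FPC ignored (N treated as infinite)
se_fpc    <- srs_se(y, N = 6194)  # FPC applied for a frame of 6,194 units

# The ratio depends only on n and N: sqrt(1 - 200 / 6194), roughly 0.984
se_fpc / se_no_fpc
```

Here the sample is about 3% of the frame, so applying the FPC shrinks the standard error by less than 2%, which is in line with the 10% guideline.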
Again, the FPC can be omitted if it is unnecessary because the frame is large compared to the sample size. Example The {survey} package in R provides some example datasets that we will use throughout this chapter. The documentation provides detailed information about the variables. One of the example datasets we will use is from the Academic Performance Index (API). The API was a program administered by the California Department of Education, and the {survey} package includes a population file (sample frame) of all schools with at least 100 students and several different samples pulled from that data using different sampling methods. For this first example, we will use the apisrs dataset, which contains an SRS of 200 schools. For printing purposes, we create a new dataset called apisrs_slim, which sorts the data by the school district and school ID and subsets the data to only a few columns. The SRS sample data is illustrated below: apisrs_slim &lt;- apisrs %&gt;% as_tibble() %&gt;% arrange(dnum, snum) %&gt;% select(cds, dnum, snum, dname, sname, fpc, pw) apisrs_slim ## # A tibble: 200 × 7 ## cds dnum snum dname sname fpc pw ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 19642126061220 1 1121 ABC Unified Haske… 6194 31.0 ## 2 19642126066716 1 1124 ABC Unified Stowe… 6194 31.0 ## 3 36675876035174 5 3895 Adelanto Elementary Adela… 6194 31.0 ## 4 33669776031512 19 3347 Alvord Unified Arlan… 6194 31.0 ## 5 33669776031595 19 3352 Alvord Unified Wells… 6194 31.0 ## 6 31667876031033 39 3271 Auburn Union Elementary Cain … 6194 31.0 ## 7 19642876011407 42 1169 Baldwin Park Unified Deanz… 6194 31.0 ## 8 19642876011464 42 1175 Baldwin Park Unified Heath… 6194 31.0 ## 9 19642956011589 48 1187 Bassett Unified Erwin… 6194 31.0 ## 10 41688586043392 49 4948 Bayshore Elementary Baysh… 6194 31.0 ## # ℹ 190 more rows Table 10.1 provides details on all the variables in this dataset. 
TABLE 10.1: Overview of Variables in api Data
Variable Name | Description
cds | Unique identifier for each school
dnum | School district identifier within county
snum | School identifier within district
dname | District Name
sname | School Name
fpc | Finite population correction factor (FPC)
pw | Weight
To create the tbl_svy object for this SRS data, the design should be specified as follows: apisrs_des &lt;- apisrs_slim %&gt;% as_survey_design(weights = pw, fpc = fpc) apisrs_des ## Independent Sampling design ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - fpc: fpc ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), fpc ## (dbl), pw (dbl) In the printed design object above, the design is described as an “Independent Sampling design,” which is another term for SRS. The ids are specified as 1, which means there is no clustering (a topic described in Section 10.2.4), the FPC variable is indicated, and the weights are indicated. We can also look at the summary of the design object, and see the distribution of the probabilities (inverse of the weights) along with the population size and a list of the variables in the dataset. summary(apisrs_des) ## Independent Sampling design ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0323 0.0323 0.0323 0.0323 0.0323 0.0323 ## Population size (PSUs): 6194 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;fpc&quot; &quot;pw&quot; 10.2.2 Simple random sample with replacement Similar to the SRS design, the simple random sample with replacement (SRSWR) design randomly selects the sample from the entire sampling frame. However, while SRS removes sampled units before selecting again, the SRSWR instead replaces each sampled unit before drawing again, so units can be selected more than once. Requirements: The sampling frame must include the entire population. 
Advantages: SRSWR requires no information about the units apart from contact information. Disadvantages: The sampling frame may not be available for the entire population. Units can be selected more than once, resulting in a smaller realized sample size because receiving duplicate information from a single respondent does not provide additional information. For small populations, SRSWR has larger standard errors than SRS designs. Example: A professor puts all students’ names on paper slips and selects them randomly to ask students questions, but the professor replaces the paper after calling on the student so they can be selected again at any time. In general for surveys, using an SRS design (without replacement) is preferred as we do not want respondents to answer a survey more than once. The math The estimate for the population mean of variable \\(y\\) is: \\[\\bar{y}=\\frac{1}{n}\\sum_{i=1}^n y_i\\] and the estimate of the standard error of the mean is: \\[se(\\bar{y})=\\sqrt{\\frac{s^2}{n}}\\] where \\[s^2=\\frac{1}{n-1}\\sum_{i=1}^n\\left(y_i-\\bar{y}\\right)^2.\\] To calculate the estimated proportion, we define \\(x_i\\) as the indicator that the outcome is observed (as we did with SRS): \\[\\hat{p}=\\frac{1}{n}\\sum_{i=1}^n x_i \\] and the estimated standard error of the proportion is: \\[se(\\hat{p})=\\sqrt{\\frac{\\hat{p}(1-\\hat{p})}{n}} \\] The syntax If we had a sample that was drawn through SRSWR and had no nonresponse or other weighting adjustments, in R, we should specify this design as: srswr1_des &lt;- dat %&gt;% as_survey_design() where dat is a tibble or data.frame containing our survey data. This syntax is the same as an SRS design, except a finite population correction (FPC) is not included. This is because when you select a sample with replacement, the population pool to select from is no longer finite, so a correction is not needed. 
Therefore, with large populations where the FPC is negligible, the underlying formulas for SRS and SRSWR designs are the same. If some post-survey adjustments were implemented and the weights are not all equal, specify the design as: srswr2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar) where wtvar is the variable for the weight on the data. Example The {survey} package does not include an example of SRSWR, so to illustrate this design we need to create an example. We use the API population data provided by the {survey} package (apipop) and select a sample of 200 cases using the slice_sample() function from the tidyverse. One of the arguments in the slice_sample() function is replace. If replace=TRUE, then we are conducting an SRSWR. We then calculate selection weights as the inverse of the probability of selection and call this new dataset apisrswr. set.seed(409963) apisrswr &lt;- apipop %&gt;% as_tibble() %&gt;% slice_sample(n = 200, replace = TRUE) %&gt;% select(cds, dnum, snum, dname, sname) %&gt;% mutate( weight = nrow(apipop)/200 ) head(apisrswr) ## # A tibble: 6 × 6 ## cds dnum snum dname sname weight ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 43696416060065 533 5348 Palo Alto Unified Jordan (Da… 31.0 ## 2 07618046005060 650 509 San Ramon Valley Unified Alamo Elem… 31.0 ## 3 19648086085674 457 2134 Montebello Unified La Merced … 31.0 ## 4 07617056003719 346 377 Knightsen Elementary Knightsen … 31.0 ## 5 19650606023022 744 2351 Torrance Unified Carr (Evel… 31.0 ## 6 01611196090120 6 13 Alameda City Unified Paden (Wil… 31.0 Because this is an SRS design with replacement, there will be duplicates in the data. It is important to keep the duplicates in the data for proper estimation, but for reference we can view the duplicates in the example data we just created. 
apisrswr %&gt;% group_by(cds) %&gt;% filter(n()&gt;1) %&gt;% arrange(cds) ## # A tibble: 4 × 6 ## # Groups: cds [2] ## cds dnum snum dname sname weight ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 15633216008841 41 869 Bakersfield City Elem Chipman Junio… 31.0 ## 2 15633216008841 41 869 Bakersfield City Elem Chipman Junio… 31.0 ## 3 39686766042782 716 4880 Stockton City Unified Tyler Skills … 31.0 ## 4 39686766042782 716 4880 Stockton City Unified Tyler Skills … 31.0 We created a weight variable in this example data, which is the inverse of the probability of selection. To specify the sampling design for apisrswr, the following syntax should be used: apisrswr_des &lt;- apisrswr %&gt;% as_survey_design(weights = weight) apisrswr_des ## Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - weights: weight ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), weight ## (dbl) summary(apisrswr_des) ## Independent Sampling design (with replacement) ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0323 0.0323 0.0323 0.0323 0.0323 0.0323 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;weight&quot; In the output above, the design object and the object summary are shown. Both note that the sampling is done “with replacement” because no FPC was specified. The probabilities, which are derived from the weights, are summarized in the summary. 10.2.3 Stratified sampling Stratified sampling occurs when a population is divided into mutually exclusive subpopulations (strata), and then samples are selected independently within each stratum. Requirements: The sampling frame must include the information to divide the population into groups for every unit. Advantages: This design ensures sample representation in all subpopulations. 
If the strata are correlated with survey outcomes, a stratified sample has smaller standard errors compared to a SRS sample of the same size. This results in a more efficient design. Disadvantages: Auxiliary data may not exist to divide the sampling frame into groups, or the data may be outdated. Examples: Example 1: A population of North Carolina residents could be separated (stratified) into urban and rural areas, and then a SRS of residents from both rural and urban areas is selected independently. This ensures there are residents from both areas in the sample. Example 2: Law enforcement agencies could be separated (stratified) into the three primary general-purpose categories in the US: local police, sheriff’s departments, and state police. A SRS of agencies from each of the three types is then selected independently to ensure all three types of agencies are represented. The math Let \\(\\bar{y}_h\\) be the sample mean for stratum \\(h\\), \\(N_h\\) be the population size of stratum \\(h\\), and \\(n_h\\) be the sample size of stratum \\(h\\). Then the estimate for the population mean under stratified SRS sampling is: \\[\\bar{y}=\\frac{1}{N}\\sum_{h=1}^H N_h\\bar{y}_h\\] and the estimate of the standard error of \\(\\bar{y}\\) is: \\[se(\\bar{y})=\\sqrt{\\frac{1}{N^2} \\sum_{h=1}^H N_h^2 \\frac{s_h^2}{n_h}\\left(1-\\frac{n_h}{N_h}\\right)} \\] where \\[s_h^2=\\frac{1}{n_h-1}\\sum_{i=1}^{n_h}\\left(y_{i,h}-\\bar{y}_h\\right)^2.\\] For estimates of proportions, let \\(\\hat{p}_h\\) be the estimated proportion in stratum \\(h\\). Then the population proportion estimate is: \\[\\hat{p}= \\frac{1}{N}\\sum_{h=1}^H N_h \\hat{p}_h\\] where \\(H\\) is the total number of strata. 
The standard error of the proportion is: \\[se(\\hat{p}) = \\frac{1}{N} \\sqrt{ \\sum_{h=1}^H N_h^2 \\frac{\\hat{p}_h(1-\\hat{p}_h)}{n_h-1} \\left(1-\\frac{n_h}{N_h}\\right)}\\] The syntax In addition to the fpc and weights arguments discussed for the designs above, stratified designs require the addition of the strata argument. For example, to specify a stratified SRS design in {srvyr} when using the FPC, that is, where the population sizes of the strata are not too large and are known, specify the design as: stsrs1_des &lt;- dat %&gt;% as_survey_design(fpc = fpcvar, strata = stratvar) where fpcvar is a variable on our data that indicates \\(N_h\\) for each row, and stratvar is a variable indicating the stratum for each row. You can omit the FPC if it is not applicable. Additionally, we can indicate the weight variable if it is present where wtvar is a variable on our data with a numeric weight. stsrs2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, strata = stratvar) Example In the example API data, apistrat is a stratified random sample, stratified by school type (stype) with three levels: E for elementary school, M for middle school, and H for high school. As with the SRS example above, we sort and select specific variables for use in printing. The data are illustrated below, including a count of the number of cases per stratum: apistrat_slim &lt;- apistrat %&gt;% as_tibble() %&gt;% arrange(dnum, snum) %&gt;% select(cds, dnum, snum, dname, sname, stype, fpc, pw) apistrat_slim %&gt;% count(stype, fpc) ## # A tibble: 3 × 3 ## stype fpc n ## &lt;fct&gt; &lt;dbl&gt; &lt;int&gt; ## 1 E 4421 100 ## 2 H 755 50 ## 3 M 1018 50 The FPC is the same for each case within each stratum. This output also shows that 100 elementary schools, 50 middle schools, and 50 high schools were sampled. It is common for the number of units sampled from each stratum to be different based on the goals of the project, or to mirror the size of each stratum in the population. 
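The stratum counts above are enough to work out selection probabilities and base weights by hand. The sketch below, in base R, assumes equal-probability selection within each stratum and no further weighting adjustments; the object name strata_tab is hypothetical.

```r
# Stratum sizes taken from the count() output above
strata_tab <- data.frame(
  stype = c("E", "H", "M"),
  N_h   = c(4421, 755, 1018),  # population stratum sizes (the fpc column)
  n_h   = c(100, 50, 50)       # sampled schools per stratum
)

# Probability of selection and its inverse (the base weight) in each stratum
strata_tab$prob   <- strata_tab$n_h / strata_tab$N_h
strata_tab$weight <- strata_tab$N_h / strata_tab$n_h
strata_tab

# Under these assumptions the weights sum to the population size
sum(strata_tab$n_h * strata_tab$weight)  # 6194
```

Elementary schools have the smallest selection probability (about 0.023) and therefore the largest weight, which is why a stratified design generally needs weights even when sampling within each stratum is an SRS.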
This design should be specified as follows: apistrat_des &lt;- apistrat_slim %&gt;% as_survey_design(strata = stype, weights = pw, fpc = fpc) apistrat_des ## Stratified Independent Sampling design ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - fpc: fpc ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), stype ## (fct), fpc (dbl), pw (dbl) summary(apistrat_des) ## Stratified Independent Sampling design ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0226 0.0226 0.0359 0.0401 0.0534 0.0662 ## Stratum Sizes: ## E H M ## obs 100 50 50 ## design.PSU 100 50 50 ## actual.PSU 100 50 50 ## Population stratum sizes (PSUs): ## E H M ## 4421 755 1018 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;stype&quot; &quot;fpc&quot; &quot;pw&quot; When printing the object, it is specified as a “Stratified Independent Sampling design,” also known as a stratified SRS, and the strata variable is included. Printing the summary, we see a distribution of probabilities, as we saw with SRS, but we also see the sample and population sizes by stratum. 10.2.4 Clustered sampling Clustered sampling occurs when a population is divided into mutually exclusive subgroups called clusters or primary sampling units (PSUs). A random selection of PSUs is sampled, and then another level of sampling is done within these clusters. There can be multiple levels of this selection. Clustered sampling is often used when a list of the entire population is not available, or data collection involves interviewers needing direct contact with respondents. Requirements: There must be a way to divide the population into clusters. Clusters are commonly structural such as institutions (e.g., schools, prisons) or geography (e.g., states, counties). 
Advantages: Clustered sampling is advantageous when data collection is done in person, so interviewers are sent to specific sampled areas rather than completely at random across a country. With clustered sampling, a list of the entire population is not necessary. For example, if sampling students, we do not need a list of all students but only a list of all schools. Once the schools are sampled, lists of students can be obtained within the sampled schools. Disadvantages: Compared to a simple random sample for the same sample size, clustered samples generally have larger standard errors of estimates. Examples: Example 1: Consider a study needing a sample of 6th-grade students in the United States. No list of all these students is likely to exist. However, a list of schools that have 6th graders is more likely to be obtainable, so a study design could select a random sample of schools that have 6th graders. The selected schools can then provide a list of students to do a second stage of sampling where 6th-grade students are randomly sampled within each of the sampled schools. This is a one-stage sample design (the one representing the number of clusters) and will be the type of design we will discuss in the formulas below. Example 2: Consider a study sending interviewers to households for a survey. This is a more complicated example that requires two levels of clustering (two-stage sample design) to efficiently use interviewers in geographic clusters. First, in the U.S., counties could be selected as the PSU, then Census block groups within counties could be selected as the secondary sampling unit (SSU). Households could then be randomly sampled within the block groups. This type of design is popular for in-person surveys as it reduces the travel necessary for interviewers. The math Consider a survey where a sample of \\(a\\) clusters is selected from a population of \\(A\\) clusters via SRS. Units within each sampled cluster are sampled via SRS as well. 
Within each sampled cluster, \\(i\\), there are \\(B_i\\) units and \\(b_i\\) units are sampled via SRS. Let \\(\\bar{y}_{i}\\) be the sample mean of cluster \\(i\\). Then, a ratio estimator of the population mean is: \\[\\bar{y}=\\frac{\\sum_{i=1}^a B_i \\bar{y}_{i}}{ \\sum_{i=1}^a B_i}\\] Note this is a consistent but biased estimator. Often the population size is not known, so this is a method to estimate a mean without knowing the population size. The estimated standard error of the mean is: \\[se(\\bar{y})= \\frac{1}{\\hat{N}}\\sqrt{\\left(1-\\frac{a}{A}\\right)\\frac{s_a^2}{a} + \\frac{A}{a} \\sum_{i=1}^a \\left(1-\\frac{b_i}{B_i}\\right) \\frac{s_i^2}{b_i} }\\] where \\(\\hat{N}\\) is the estimated population size, \\(s_a^2\\) is the between-cluster variance and \\(s_i^2\\) is the within-cluster variance. The formula for the between-cluster variance (\\(s_a^2\\)) is: \\[s_a^2=\\frac{1}{a-1}\\sum_{i=1}^a \\left( \\hat{y}_i - \\frac{\\sum_{i=1}^a \\hat{y}_{i} }{a}\\right)^2\\] where \\(\\hat{y}_i =B_i\\bar{y}_i\\). The formula for the within-cluster variance (\\(s_i^2\\)) is: \\[s_i^2=\\frac{1}{a(b_i-1)} \\sum_{j=1}^{b_i} \\left(y_{ij}-\\bar{y}_i\\right)^2\\] where \\(y_{ij}\\) is the outcome for sampled unit \\(j\\) within cluster \\(i\\). The syntax Clustered sampling designs require the addition of the ids argument, which specifies the variables that identify each level of clustering. To specify a two-stage clustered design without replacement, use the following syntax: clus2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = c(PSU, SSU), fpc = c(A, B)) where PSU and SSU are the variables indicating the PSU and SSU identifiers, and A and B are the variables indicating the population sizes for each level (i.e., A is the number of clusters, and B is the number of units within each cluster). Note that A will be the same for all records (within a stratum), and B will be the same for all records within the same cluster. 
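Before specifying a two-stage design, it can help to confirm that the population-size variables have the structure just described. The following base R sketch uses a made-up data frame: the column names PSU, SSU, A, and B mirror the placeholder names in the syntax above, and all values are hypothetical.

```r
# Hypothetical two-stage sample: 3 sampled PSUs out of A = 10,
# with 2 units sampled in each PSU out of B units
dat <- data.frame(
  PSU = c(1, 1, 2, 2, 3, 3),
  SSU = c(11, 12, 21, 22, 31, 32),
  A   = 10,                  # number of clusters in the population
  B   = c(4, 4, 5, 5, 3, 3)  # number of units in each sampled cluster
)

# A should take a single value (within a stratum; this toy data has no strata)
a_is_constant <- length(unique(dat$A)) == 1

# B should take a single value within each PSU
b_constant_in_psu <- all(
  tapply(dat$B, dat$PSU, function(x) length(unique(x)) == 1)
)

c(a_is_constant = a_is_constant, b_constant_in_psu = b_constant_in_psu)
```

If either check is FALSE, the variable likely does not hold the population size for that stage, and the documentation should be consulted before passing it to the fpc argument.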
If clusters were sampled with replacement or from a very large population, an FPC is unnecessary. Additionally, only the first stage of selection is necessary regardless of whether the units were selected with replacement at any stage. The subsequent stages of selection are ignored in computation as their contribution to the variance is overpowered by the first stage (see Särndal, Swensson, and Wretman (2003) or Wolter (2007) for a more in-depth discussion). Therefore, the syntax below will yield the same estimates in the end: clus2wra_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = c(PSU, SSU)) clus2wrb_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = PSU) Note that one additional argument, nest = TRUE, is sometimes necessary. This option relabels cluster IDs to enforce nesting within strata. Sometimes, as an example, there may be a cluster 1 and a cluster 2 within each stratum but these are actually different clusters. This option indicates that the repeated use of numbering does not mean it is the same cluster. If this option is not used and there are repeated cluster IDs across different strata, an error will be generated. Example The {survey} package includes a two-stage cluster sample dataset, apiclus2, in which school districts were sampled, and then a random sample of five schools was selected within each district. For districts with fewer than five schools, all schools were sampled. School districts are identified by dnum, and schools are identified by snum. The variable fpc1 indicates how many districts there are in California (A), and fpc2 indicates how many schools were in a given district with at least 100 students (B). The data has a row for each school. In the data printed below, there are 757 school districts, as indicated by fpc1, and there are nine schools in District 731, one school in District 742, two schools in District 768, and so on as indicated by fpc2. 
For illustration purposes, the object apiclus2_slim has been created from apiclus2, which subsets the data to only the necessary columns and sorts data. apiclus2_slim &lt;- apiclus2 %&gt;% as_tibble() %&gt;% arrange(desc(dnum), snum) %&gt;% select(cds, dnum, snum, fpc1, fpc2, pw) apiclus2_slim ## # A tibble: 126 × 6 ## cds dnum snum fpc1 fpc2 pw ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int[1d]&gt; &lt;dbl&gt; ## 1 47704826050942 795 5552 757 1 18.9 ## 2 07618126005169 781 530 757 6 22.7 ## 3 07618126005177 781 531 757 6 22.7 ## 4 07618126005185 781 532 757 6 22.7 ## 5 07618126005193 781 533 757 6 22.7 ## 6 07618126005243 781 535 757 6 22.7 ## 7 19650786023337 768 2371 757 2 18.9 ## 8 19650786023345 768 2372 757 2 18.9 ## 9 54722076054423 742 5898 757 1 18.9 ## 10 50712906053086 731 5781 757 9 34.1 ## # ℹ 116 more rows To specify this design in R, the following syntax should be used: apiclus2_des &lt;- apiclus2_slim %&gt;% as_survey_design(ids = c(dnum, snum), fpc = c(fpc1, fpc2), weights = pw) apiclus2_des ## 2 - level Cluster Sampling design ## With (40, 126) clusters. ## Called via srvyr ## Sampling variables: ## - ids: `dnum + snum` ## - fpc: `fpc1 + fpc2` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), fpc1 (dbl), fpc2 (int[1d]), pw ## (dbl) summary(apiclus2_des) ## 2 - level Cluster Sampling design ## With (40, 126) clusters. ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.00367 0.03774 0.05284 0.04239 0.05284 0.05284 ## Population size (PSUs): 757 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;fpc1&quot; &quot;fpc2&quot; &quot;pw&quot; The design objects are described as “2 - level Cluster Sampling design” and include the ids (cluster), FPC, and weight variables. The summary notes that the sample includes 40 first-level clusters (PSUs), which are school districts, and 126 second-level clusters (SSUs), which are schools. 
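The weights in the printed data can be reproduced from the two stages of selection. The sketch below is base R and assumes, as described above, that 40 of the 757 districts were sampled and that five schools (or all schools, when a district has fewer than five) were sampled within each district. The helper two_stage_weight() is hypothetical and is not part of {survey} or {srvyr}.

```r
# Hypothetical helper: inverse of the overall selection probability
# for a two-stage design with SRS at both stages
two_stage_weight <- function(A, a, B, b) {
  (A / a) * (B / b)
}

A <- 757             # districts in the population (fpc1)
a <- 40              # districts sampled
B <- c(1, 6, 2, 9)   # schools in districts 795, 781, 768, and 731 (fpc2)
b <- pmin(B, 5)      # schools sampled: five, or all if fewer than five

round(two_stage_weight(A, a, B, b), 1)
# compare with the pw column printed above: 18.9, 22.7, 18.9, 34.1

a / A  # first-stage probability, about 0.0528
```

The first-stage probability of about 0.0528 is also the largest value in the probability summary, which corresponds to districts where every school was taken at the second stage.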
Additionally, the summary includes a numeric summary of the probabilities of selection and the population size (number of PSUs) as 757. 10.3 Combining sampling methods SRS, stratified, and clustered designs are the backbone of sampling designs, and the features are often combined in one design. Additionally, rather than using SRS for selection, other sampling mechanisms are commonly used, such as probability proportional to size (PPS), systematic sampling, or selection with unequal probabilities, which are briefly described here. In PPS sampling, a size measure is constructed for each unit (e.g., the population of the PSU or the number of occupied housing units) and then units with larger size measures are more likely to be sampled. Systematic sampling is commonly used to ensure representation across a population. Units are sorted by a feature and then every \\(k\\)th unit is selected from a random start point so the sample is spread across the population. In addition to PPS, other unequal probabilities of selection may be used. For example, in a study of establishments (e.g., businesses or public institutions) that conducts a survey every year, an establishment that recently participated (e.g., participated last year) may have a reduced chance of selection in a subsequent round to reduce the burden on the establishment. To learn more about sampling designs, refer to Valliant, Dever, and Kreuter (2013), Cox et al. (2011), Cochran (1977), and Deming (1991). A common method of sampling is to stratify PSUs, select PSUs within the stratum using PPS selection, and then select units within the PSUs either with SRS or PPS. Reading survey documentation is an important first step in survey analysis to understand the design of the survey we are using and variables necessary to specify the design. Good documentation will highlight the variables necessary to specify the design. 
This information is often found in user’s guides, methodology reports, analysis guides, or technical documentation (see Chapter 3 for more details). Example The 2017-2019 National Survey of Family Growth (NSFG; https://www.cdc.gov/nchs/data/nsfg/NSFG-2017-2019-Sample-Design-Documentation-508.pdf) had a stratified multi-stage area probability sample: 1. In the first stage, PSUs were counties or collections of counties and were stratified by Census region/division, size (population), and MSA status. Within each stratum, PSUs were selected via PPS. 2. In the second stage, neighborhoods were selected within the sampled PSUs using PPS selection. 3. In the third stage, housing units were selected within the sampled neighborhoods. 4. In the fourth stage, a person was randomly chosen within the selected housing units among eligible persons using unequal probabilities based on the person’s age and sex. The public use file does not include all these levels of selection and instead has pseudo-strata and pseudo-clusters, which are the variables used in R to specify the design. As specified on page 4 of the documentation, the stratum variable is SEST, the cluster variable is SECU, and the weight variable is WGT2017_2019. Thus, to specify this design in R, use the following syntax: nsfg_des &lt;- nsfgdata %&gt;% as_survey_design(ids = SECU, strata = SEST, weights = WGT2017_2019) 10.4 Replicate weights Replicate weights are often included on analysis files instead of, or in addition to, the design variables (strata and PSUs). Replicate weights are used as another method to estimate variability. Often researchers choose to use replicate weights to avoid publishing design variables (strata or clustering variables) as a measure to reduce the risk of disclosure. There are several types of replicate weights, including balanced repeated replication (BRR), Fay’s BRR, jackknife, and bootstrap methods. 
An overview of the process for using replicate weights is as follows: Divide the sample into subsample replicates that mirror the design of the sample Calculate weights for each replicate using the same procedures for the full-sample weight (i.e., nonresponse and post-stratification) Calculate estimates for each replicate using the same method as the full-sample estimate Calculate the estimated variance, which will be proportional to the variance of the replicate estimates The different types of replicate weights largely differ between step 1 (how the sample is divided into subsamples) and step 4 (which multiplication factors (scales) are used to multiply the variance). The general format for the standard error is: \\[ \\sqrt{\\alpha \\sum_{r=1}^R \\alpha_r (\\hat{\\theta}_r - \\hat{\\theta})^2 }\\] where \\(R\\) is the number of replicates, \\(\\alpha\\) is a constant that depends on the replication method, \\(\\alpha_r\\) is a factor associated with each replicate, \\(\\hat{\\theta}\\) is the weighted estimate based on the full sample, and \\(\\hat{\\theta}_r\\) is the weighted estimate of \\(\\theta\\) based on the \\(r^{\\text{th}}\\) replicate. To create the design object for surveys with replicate weights, we use as_survey_rep() instead of as_survey_design() that we use for the common sampling designs in the sections above. 10.4.1 Balanced Repeated Replication (BRR) method The BRR method requires a stratified sample design with two PSUs in each stratum. Each replicate is constructed by deleting one PSU per stratum using a Hadamard matrix. For the PSU that is included, the weight is generally multiplied by two but may have other adjustments, such as post-stratification. A Hadamard matrix is a special square matrix with entries of +1 or -1 with mutually orthogonal rows. Hadamard matrices must have one row, two rows, or a multiple of four rows. The size of the Hadamard matrix is determined by the first multiple of 4 greater than or equal to the number of strata. 
For example, if a survey had 7 strata, the Hadamard matrix would be an \\(8\\times8\\) matrix. Additionally, a survey with 8 strata would also have an \\(8\\times8\\) Hadamard matrix. The columns in the matrix specify the strata and the rows specify the replicate. In each replicate (row), a +1 means to use the first PSU and a -1 means to use the second PSU in the estimate. For example, here is a \\(4\\times4\\) Hadamard matrix: \\[ \\begin{array}{rrrr} +1 &amp;+1 &amp;+1 &amp;+1\\\\ +1&amp;-1&amp;+1&amp;-1\\\\ +1&amp;+1&amp;-1&amp;-1\\\\ +1 &amp;-1&amp;-1&amp;+1 \\end{array} \\] In the first replicate (row), all the values are +1, so in each stratum, the first PSU would be used in the estimate. In the second replicate, the first PSU would be used in strata 1 and 3, while the second PSU would be used in strata 2 and 4. In the third replicate, the first PSU would be used in strata 1 and 2, while the second PSU would be used in strata 3 and 4. Finally, in the fourth replicate, the first PSU would be used in strata 1 and 4, while the second PSU would be used in strata 2 and 3. For more information about Hadamard matrices see Wolter (2007). Note that supplied BRR weights from a data provider will already incorporate this adjustment, and the {survey} package generates the Hadamard matrix if it is needed for calculating BRR weights, so an analyst will not need to provide the matrix. The math A weighted estimate for the full sample is calculated as \\(\\hat{\\theta}\\), and then a weighted estimate for each replicate is calculated as \\(\\hat{\\theta}_r\\) for \\(R\\) replicates. Using the generic notation above, \\(\\alpha=\\frac{1}{R}\\) and \\(\\alpha_r=1\\) for each \\(r\\). 
The standard error of the estimate is calculated as follows: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] Specifying replicate weights in R requires specifying the type of replicate weights, the main weight variable, the replicate weight variables, and other options. One of the key options is for the mean squared error (MSE). If mse=TRUE, variances are computed around the point estimate \\((\\hat{\\theta})\\), whereas if mse=FALSE, variances are computed around the mean of the replicates \\((\\bar{\\theta})\\) instead, which looks like this: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\bar{\\theta}\\right)^2}\\] where \\[\\bar{\\theta}=\\frac{1}{R}\\sum_{r=1}^R \\hat{\\theta}_r\\] The default for mse is the global option “survey.replicates.mse”, which is set to FALSE unless a user changes it. To determine if mse should be set to TRUE or FALSE, read the survey documentation. If there is no indication in the survey documentation, for BRR, we recommend setting mse to TRUE as this is the default in other software (e.g., SAS, SUDAAN). The syntax Replicate weights generally come in groups and are sequentially numbered, such as PWGTP1, PWGTP2, …, PWGTP80 for the person weights in the American Community Survey (ACS) (U.S. Census Bureau 2021) or BRRWT1, BRRWT2, …, BRRWT96 in the 2015 Residential Energy Consumption Survey (RECS) (U.S. Energy Information Administration 2017). This makes it easy to use some of the tidy selection functions (https://dplyr.tidyverse.org/reference/dplyr_tidy_select.html) in R. To specify a BRR design, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as BRR (type = \"BRR\"), and whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE). 
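The effect of the mse option described above can be sketched by hand in base R. This is only an illustration under stated assumptions: the full-sample estimate and the four replicate estimates are made-up numbers, and the object names (theta_hat, theta_r, and so on) are hypothetical rather than anything produced by {srvyr}.

```r
# Made-up values for illustration only (not from any survey)
theta_hat <- 10                      # full-sample estimate
theta_r <- c(9, 11, 10.5, 10.5)      # estimates from R = 4 replicates
R <- length(theta_r)

# mse = TRUE: deviations are taken around the full-sample estimate
se_mse_true <- sqrt(sum((theta_r - theta_hat)^2) / R)

# mse = FALSE: deviations are taken around the mean of the replicate estimates
theta_bar <- mean(theta_r)
se_mse_false <- sqrt(sum((theta_r - theta_bar)^2) / R)

c(mse_true = se_mse_true, mse_false = se_mse_false)
```

Because the mean of the replicate estimates minimizes the sum of squared deviations, the mse = TRUE version can never be smaller than the mse = FALSE version; the two agree only when the replicate mean equals the full-sample estimate. With that in mind, we return to the design syntax.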
For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated WT1, WT2, …, WT20, we can use the following syntax (both are equivalent): brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = all_of(str_c(&quot;WT&quot;, 1:20)), type = &quot;BRR&quot;, mse = TRUE) brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;BRR&quot;, mse = TRUE) If a dataset had WT for the main weight and had 20 BRR weights indicated REPWT1, REPWT2, …, REPWT20, the following syntax could be used (both are equivalent): brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = all_of(str_c(&quot;REPWT&quot;, 1:20)), type = &quot;BRR&quot;, mse = TRUE) brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = starts_with(&quot;REPWT&quot;), type = &quot;BRR&quot;, mse = TRUE) If the replicate weight variables are in the file consecutively, the following syntax can also be used: brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = REPWT1:REPWT20, type = &quot;BRR&quot;, mse = TRUE) Typically, each replicate weight sums to a value similar to the main weight, as both the replicate weights and the main weight are supposed to provide population estimates. Rarely, an alternative method will be used where the replicate weights have values of 0 or 2 in the case of BRR weights. This would be indicated in the documentation (see Chapter 3 for more information on how to understand the provided documentation). In this case, the replicate weights are not combined, and the option combined_weights = FALSE should be indicated, as the default value for this argument is TRUE. This specific syntax is shown below: brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = starts_with(&quot;REPWT&quot;), type = &quot;BRR&quot;, combined_weights = FALSE, mse = TRUE) Example The {survey} package includes a data example from Section 12.2 of Levy and Lemeshow (2013). 
In this fictional data, two out of five ambulance stations were sampled from each of three emergency service areas (ESAs), thus BRR weights are appropriate with 2 PSUs (stations) sampled in each stratum (ESA). In the code below, BRR weights are created as was done by Levy and Lemeshow (2013). scdbrr &lt;- scd %&gt;% as_tibble() %&gt;% mutate(wt = 5 / 2, rep1 = 2 * c(1, 0, 1, 0, 1, 0), rep2 = 2 * c(1, 0, 0, 1, 0, 1), rep3 = 2 * c(0, 1, 1, 0, 0, 1), rep4 = 2 * c(0, 1, 0, 1, 1, 0)) scdbrr ## # A tibble: 6 × 9 ## ESA ambulance arrests alive wt rep1 rep2 rep3 rep4 ## &lt;int&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 1 120 25 2.5 2 2 0 0 ## 2 1 2 78 24 2.5 0 0 2 2 ## 3 2 1 185 30 2.5 2 0 2 0 ## 4 2 2 228 49 2.5 0 2 0 2 ## 5 3 1 670 80 2.5 2 0 0 2 ## 6 3 2 530 70 2.5 0 2 2 0 To specify the BRR weights, the following syntax is used: scdbrr_des &lt;- scdbrr %&gt;% as_survey_rep(type = &quot;BRR&quot;, repweights = starts_with(&quot;rep&quot;), combined_weights = FALSE, weight = wt) scdbrr_des ## Call: Called via srvyr ## Balanced Repeated Replicates with 4 replicates. ## Sampling variables: ## - repweights: `rep1 + rep2 + rep3 + rep4` ## - weights: wt ## Data variables: ## - ESA (int), ambulance (int), arrests (dbl), alive (dbl), wt (dbl), ## rep1 (dbl), rep2 (dbl), rep3 (dbl), rep4 (dbl) summary(scdbrr_des) ## Call: Called via srvyr ## Balanced Repeated Replicates with 4 replicates. 
## Sampling variables: ## - repweights: `rep1 + rep2 + rep3 + rep4` ## - weights: wt ## Data variables: ## - ESA (int), ambulance (int), arrests (dbl), alive (dbl), wt (dbl), ## rep1 (dbl), rep2 (dbl), rep3 (dbl), rep4 (dbl) ## Variables: ## [1] &quot;ESA&quot; &quot;ambulance&quot; &quot;arrests&quot; &quot;alive&quot; &quot;wt&quot; ## [6] &quot;rep1&quot; &quot;rep2&quot; &quot;rep3&quot; &quot;rep4&quot; Note that combined_weights was specified as FALSE because these weights are simply specified as 0 and 2 and do not incorporate the overall weight. When printing the object, the type of replication is noted as Balanced Repeated Replicates, and the replicate weights and the weight variable are specified. Additionally, the summary lists the variables included. 10.4.2 Fay’s BRR method Fay’s BRR method for replicate weights is similar to the BRR method in that it uses a Hadamard matrix to construct replicate weights. However, rather than deleting PSUs for each replicate, with Fay’s BRR half of the PSUs have a replicate weight which is the main weight multiplied by \\(\\rho\\), and the other half have the main weight multiplied by \\((2-\\rho)\\) where \\(0 \\le \\rho &lt; 1\\). Note that when \\(\\rho=0\\), this is equivalent to the standard BRR weights, and as \\(\\rho\\) becomes closer to 1, this method is more similar to jackknife discussed in the next section. To obtain the value of \\(\\rho\\), it is necessary to read the survey documentation (see Chapter 3). The math The standard error estimate for \\(\\hat{\\theta}\\) is slightly different than the BRR, due to the addition of the multiplier of \\(\\rho\\). Using the generic notation above, \\(\\alpha=\\frac{1}{R \\left(1-\\rho\\right)^2}\\) and \\(\\alpha_r=1 \\text{ for all } r\\). The standard error is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R (1-\\rho)^2} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The syntax The syntax is very similar for BRR and Fay’s BRR. 
To specify a Fay’s BRR design, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as Fay’s BRR (type = \"Fay\"), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and Fay’s multiplier (rho). For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, and Fay’s multiplier is 0.3, use the following syntax: fay_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;Fay&quot;, mse = TRUE, rho = 0.3) Example The 2015 RECS (U.S. Energy Information Administration 2017) uses Fay’s BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 through BRRWT96, and the documentation specifies a Fay’s multiplier of 0.5. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We have already pulled in the 2015 RECS data from the {srvyrexploR} package that provides data for this book. To specify the design for the recs_2015 data, use the following syntax: recs_2015_des &lt;- recs_2015 %&gt;% as_survey_rep(weights = NWEIGHT, repweights = BRRWT1:BRRWT96, type = &quot;Fay&quot;, rho = 0.5, mse = TRUE, variables = c(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC)) recs_2015_des ## Call: Called via srvyr ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances. 
## Sampling variables: ## - repweights: `BRRWT1 + BRRWT2 + BRRWT3 + BRRWT4 + BRRWT5 + BRRWT6 + ## BRRWT7 + BRRWT8 + BRRWT9 + BRRWT10 + BRRWT11 + BRRWT12 + BRRWT13 + ## BRRWT14 + BRRWT15 + BRRWT16 + BRRWT17 + BRRWT18 + BRRWT19 + BRRWT20 ## + BRRWT21 + BRRWT22 + BRRWT23 + BRRWT24 + BRRWT25 + BRRWT26 + ## BRRWT27 + BRRWT28 + BRRWT29 + BRRWT30 + BRRWT31 + BRRWT32 + BRRWT33 ## + BRRWT34 + BRRWT35 + BRRWT36 + BRRWT37 + BRRWT38 + BRRWT39 + ## BRRWT40 + BRRWT41 + BRRWT42 + BRRWT43 + BRRWT44 + BRRWT45 + BRRWT46 ## + BRRWT47 + BRRWT48 + BRRWT49 + BRRWT50 + BRRWT51 + BRRWT52 + ## BRRWT53 + BRRWT54 + BRRWT55 + BRRWT56 + BRRWT57 + BRRWT58 + BRRWT59 ## + BRRWT60 + BRRWT61 + BRRWT62 + BRRWT63 + BRRWT64 + BRRWT65 + ## BRRWT66 + BRRWT67 + BRRWT68 + BRRWT69 + BRRWT70 + BRRWT71 + BRRWT72 ## + BRRWT73 + BRRWT74 + BRRWT75 + BRRWT76 + BRRWT77 + BRRWT78 + ## BRRWT79 + BRRWT80 + BRRWT81 + BRRWT82 + BRRWT83 + BRRWT84 + BRRWT85 ## + BRRWT86 + BRRWT87 + BRRWT88 + BRRWT89 + BRRWT90 + BRRWT91 + ## BRRWT92 + BRRWT93 + BRRWT94 + BRRWT95 + BRRWT96` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl) summary(recs_2015_des) ## Call: Called via srvyr ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances. 
## Sampling variables: ## - repweights: `BRRWT1 + BRRWT2 + BRRWT3 + BRRWT4 + BRRWT5 + BRRWT6 + ## BRRWT7 + BRRWT8 + BRRWT9 + BRRWT10 + BRRWT11 + BRRWT12 + BRRWT13 + ## BRRWT14 + BRRWT15 + BRRWT16 + BRRWT17 + BRRWT18 + BRRWT19 + BRRWT20 ## + BRRWT21 + BRRWT22 + BRRWT23 + BRRWT24 + BRRWT25 + BRRWT26 + ## BRRWT27 + BRRWT28 + BRRWT29 + BRRWT30 + BRRWT31 + BRRWT32 + BRRWT33 ## + BRRWT34 + BRRWT35 + BRRWT36 + BRRWT37 + BRRWT38 + BRRWT39 + ## BRRWT40 + BRRWT41 + BRRWT42 + BRRWT43 + BRRWT44 + BRRWT45 + BRRWT46 ## + BRRWT47 + BRRWT48 + BRRWT49 + BRRWT50 + BRRWT51 + BRRWT52 + ## BRRWT53 + BRRWT54 + BRRWT55 + BRRWT56 + BRRWT57 + BRRWT58 + BRRWT59 ## + BRRWT60 + BRRWT61 + BRRWT62 + BRRWT63 + BRRWT64 + BRRWT65 + ## BRRWT66 + BRRWT67 + BRRWT68 + BRRWT69 + BRRWT70 + BRRWT71 + BRRWT72 ## + BRRWT73 + BRRWT74 + BRRWT75 + BRRWT76 + BRRWT77 + BRRWT78 + ## BRRWT79 + BRRWT80 + BRRWT81 + BRRWT82 + BRRWT83 + BRRWT84 + BRRWT85 ## + BRRWT86 + BRRWT87 + BRRWT88 + BRRWT89 + BRRWT90 + BRRWT91 + ## BRRWT92 + BRRWT93 + BRRWT94 + BRRWT95 + BRRWT96` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl) ## Variables: ## [1] &quot;DOEID&quot; &quot;TOTALDOL&quot; &quot;TOTSQFT_EN&quot; &quot;REGIONC&quot; In specifying the design, the variables option was also used to include which variables might be used in analyses. This is optional but can make our object smaller and easier to work with. When printing the design object or looking at the summary, the replicate weight type is re-iterated as Fay's variance method (rho= 0.5) with 96 replicates and MSE variances, and the variables are included. No weight or probability summary is included in this output as we have seen in some other design objects. 10.4.3 Jackknife method There are three jackknife estimators implemented in {srvyr} - jackknife 1 (JK1), jackknife n (JKn), and jackknife 2 (JK2). 
The JK1 method can be used for unstratified designs, and replicates are created by removing one PSU at a time so the number of replicates is the same as the number of PSUs. If there is no clustering, then the PSU is the ultimate sampling unit (e.g., unit). The JKn method is used for stratified designs and requires two or more PSUs per stratum. In this case, each replicate is created by deleting one PSU from a single stratum, so the number of replicates is the number of total PSUs across all strata. The JK2 method is a special case of JKn when there are exactly 2 PSUs sampled per stratum. For variance estimation, scaling constants must also be specified. The math Using the generic notation above, \\(\\alpha=\\frac{R-1}{R}\\) and \\(\\alpha_r=1 \\text{ for all } r\\). For the JK1 method, the standard error estimate for \\(\\hat{\\theta}\\) is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\frac{R-1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The JKn method is a bit more complex, but the coefficients are generally provided with restricted and public-use files. For each replicate, one stratum has a PSU removed, and the weights are adjusted by \\(n_h/(n_h-1)\\) where \\(n_h\\) is the number of PSUs in stratum \\(h\\). The coefficients in other strata are set to 1. Denote the coefficient that results from this process for replicate \\(r\\) as \\(\\alpha_r\\), then the standard error estimate for \\(\\hat{\\theta}\\) is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\sum_{r=1}^R \\alpha_r \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The syntax To specify the jackknife method, we use the survey documentation to understand the type of jackknife (1, n, or 2) and the multiplier. 
In the syntax we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as jackknife 1 (type = \"JK1\"), n (type = \"JKN\"), or 2 (type = \"JK2\"), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and the multiplier (scale). For example, if the survey uses the jackknife 1 method with a multiplier of \\(\\alpha=(R-1)/R=19/20=0.95\\), and the dataset has WT0 for the main weight and 20 replicate weights indicated as WT1, WT2, …, WT20, use the following syntax: jk1_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;JK1&quot;, mse = TRUE, scale = 0.95) For a jackknife n method, we need to specify the multiplier for all replicates. In this case we use the rscales argument to specify each one. The documentation will provide details on what the multipliers (\\(\\alpha_r\\)) are, and they may be the same for all replicates. For example, consider a case where \\(\\alpha_r=0.1\\) for all replicates and the dataset had WT0 for the main weight and had 20 replicate weights indicated as WT1, WT2, …, WT20. We specify the type as type = \"JKN\", and the multiplier as rscales = rep(0.1, 20): jkn_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;JKN&quot;, mse = TRUE, rscales = rep(0.1, 20)) Example The 2020 RECS (U.S. Energy Information Administration 2023c) uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 through NWEIGHT60 with a scale of \\((R-1)/R=59/60\\). On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We have already read in the RECS data and created a dataset called recs_2020 above in the prerequisites. 
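The scale quoted above for the 2020 RECS follows directly from the number of replicates, and the JK1 standard error formula can be checked by hand. The sketch below is an illustration only: the replicate estimates are simulated stand-ins rather than RECS results, and the object names (jk1_scale, theta_r, se_jk1) are hypothetical.

```r
# Hypothetical values for illustration; the replicate estimates are simulated
R <- 60                         # number of jackknife replicates
jk1_scale <- (R - 1) / R        # the value passed to the scale argument

set.seed(1)
theta_hat <- 100                            # stand-in full-sample estimate
theta_r <- theta_hat + rnorm(R, sd = 0.5)   # stand-in replicate estimates

# JK1 standard error with deviations around the full-sample estimate,
# which corresponds to mse = TRUE
se_jk1 <- sqrt(jk1_scale * sum((theta_r - theta_hat)^2))
se_jk1
```

In practice the design object performs this calculation, so an analyst only needs to supply the scale from the survey documentation.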
To specify this design, use the following syntax: recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE, variables = c(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC) ) recs_des ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. ## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (chr) summary(recs_des) ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. 
## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (chr) ## Variables: ## [1] &quot;DOEID&quot; &quot;TOTALDOL&quot; &quot;TOTSQFT_EN&quot; &quot;REGIONC&quot; When printing the design object or looking at the summary, the replicate weight type is re-iterated as Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances, and the variables are included. No weight or probability summary is included. 10.4.4 Bootstrap method In bootstrap resampling, replicates are created by selecting random samples of the PSUs with replacement (SRSWR). If there are \\(M\\) PSUs in the sample, then each replicate will be created by selecting a random sample of \\(M\\) PSUs with replacement. Each replicate is created independently, and the weights for each replicate are adjusted to reflect the population, generally using the same method as how the analysis weight was adjusted. The math A weighted estimate for the full sample is calculated as \\(\\hat{\\theta}\\), and then a weighted estimate for each replicate is calculated as \\(\\hat{\\theta}_r\\) for \\(R\\) replicates. 
Then the standard error of the estimate is calculated as follows: \\[se(\\hat{\\theta})=\\sqrt{\\alpha \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] where \\(\\alpha\\) is the scaling constant. Note that the scaling constant (\\(\\alpha\\)) is provided in the survey documentation as there are many types of bootstrap methods which generate custom scaling constants. The syntax To specify a bootstrap method, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as bootstrap (type = \"bootstrap\"), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and the multiplier (scale). For example, if a dataset had WT0 for the main weight, 20 bootstrap weights indicated WT1, WT2, …, WT20, and a multiplier of \\(\\alpha=.02\\), use the following syntax: bs_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights= num_range(&quot;WT&quot;, 1:20), type=&quot;bootstrap&quot;, mse=TRUE, scale=.02) Example Returning to the api example, we are going to create a dataset with bootstrap weights to use as an example. 
In this example, we construct a one-cluster design with fifty replicate weights.26 apiclus1_slim &lt;- apiclus1 %&gt;% as_tibble() %&gt;% arrange(dnum) %&gt;% select(cds, dnum, fpc, pw) set.seed(662152) apibw &lt;- bootweights(psu = apiclus1_slim$dnum, strata = rep(1, nrow(apiclus1_slim)), fpc = apiclus1_slim$fpc, replicates = 50) bwmata &lt;- apibw$repweights$weights[apibw$repweights$index,] * apiclus1_slim$pw apiclus1_slim &lt;- bwmata %&gt;% as.data.frame() %&gt;% set_names(str_c(&quot;pw&quot;, 1:50)) %&gt;% cbind(apiclus1_slim) %&gt;% as_tibble() %&gt;% select(cds, dnum, fpc, pw, everything()) apiclus1_slim ## # A tibble: 183 × 54 ## cds dnum fpc pw pw1 pw2 pw3 pw4 pw5 pw6 pw7 ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 2 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 3 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 4 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 5 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 6 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 7 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 8 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 9 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 10 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## # ℹ 173 more rows ## # ℹ 43 more variables: pw8 &lt;dbl&gt;, pw9 &lt;dbl&gt;, pw10 &lt;dbl&gt;, pw11 &lt;dbl&gt;, ## # pw12 &lt;dbl&gt;, pw13 &lt;dbl&gt;, pw14 &lt;dbl&gt;, pw15 &lt;dbl&gt;, pw16 &lt;dbl&gt;, ## # pw17 &lt;dbl&gt;, pw18 &lt;dbl&gt;, pw19 &lt;dbl&gt;, pw20 &lt;dbl&gt;, pw21 &lt;dbl&gt;, ## # pw22 &lt;dbl&gt;, pw23 &lt;dbl&gt;, pw24 &lt;dbl&gt;, pw25 &lt;dbl&gt;, pw26 &lt;dbl&gt;, ## # pw27 &lt;dbl&gt;, pw28 &lt;dbl&gt;, pw29 &lt;dbl&gt;, pw30 &lt;dbl&gt;, pw31 &lt;dbl&gt;, ## # pw32 &lt;dbl&gt;, pw33 &lt;dbl&gt;, pw34 &lt;dbl&gt;, pw35 &lt;dbl&gt;, pw36 &lt;dbl&gt;, … The output of apiclus1_slim includes the same variables we have seen in other 
api examples (see Table 10.1), but now additionally includes bootstrap weights pw1, …, pw50. When creating the survey design object, we use the bootstrap weights as the replicate weights. Additionally, with replicate weights we need to include the scale (\\(\\alpha\\)). For this example we created, \\[\\alpha=\\frac{M}{(M-1)(R-1)}=\\frac{15}{(15-1)*(50-1)}=0.02186589\\] where \\(M\\) is the average number of PSUs per strata and \\(R\\) is the number of replicates. There is only 1 stratum and the number of clusters/PSUs is 15 so \\(M=15\\). api1_bs_des &lt;- apiclus1_slim %&gt;% as_survey_rep(weights = pw, repweights = pw1:pw50, type = &quot;bootstrap&quot;, scale = 0.02186589, mse = TRUE) api1_bs_des ## Call: Called via srvyr ## Survey bootstrap with 50 replicates and MSE variances. ## Sampling variables: ## - repweights: `pw1 + pw2 + pw3 + pw4 + pw5 + pw6 + pw7 + pw8 + pw9 + ## pw10 + pw11 + pw12 + pw13 + pw14 + pw15 + pw16 + pw17 + pw18 + pw19 ## + pw20 + pw21 + pw22 + pw23 + pw24 + pw25 + pw26 + pw27 + pw28 + ## pw29 + pw30 + pw31 + pw32 + pw33 + pw34 + pw35 + pw36 + pw37 + pw38 ## + pw39 + pw40 + pw41 + pw42 + pw43 + pw44 + pw45 + pw46 + pw47 + ## pw48 + pw49 + pw50` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), fpc (dbl), pw (dbl), pw1 (dbl), pw2 (dbl), ## pw3 (dbl), pw4 (dbl), pw5 (dbl), pw6 (dbl), pw7 (dbl), pw8 (dbl), ## pw9 (dbl), pw10 (dbl), pw11 (dbl), pw12 (dbl), pw13 (dbl), pw14 ## (dbl), pw15 (dbl), pw16 (dbl), pw17 (dbl), pw18 (dbl), pw19 (dbl), ## pw20 (dbl), pw21 (dbl), pw22 (dbl), pw23 (dbl), pw24 (dbl), pw25 ## (dbl), pw26 (dbl), pw27 (dbl), pw28 (dbl), pw29 (dbl), pw30 (dbl), ## pw31 (dbl), pw32 (dbl), pw33 (dbl), pw34 (dbl), pw35 (dbl), pw36 ## (dbl), pw37 (dbl), pw38 (dbl), pw39 (dbl), pw40 (dbl), pw41 (dbl), ## pw42 (dbl), pw43 (dbl), pw44 (dbl), pw45 (dbl), pw46 (dbl), pw47 ## (dbl), pw48 (dbl), pw49 (dbl), pw50 (dbl) summary(api1_bs_des) ## Call: Called via srvyr ## Survey bootstrap with 50 replicates and MSE variances. 
## Sampling variables: ## - repweights: `pw1 + pw2 + pw3 + pw4 + pw5 + pw6 + pw7 + pw8 + pw9 + ## pw10 + pw11 + pw12 + pw13 + pw14 + pw15 + pw16 + pw17 + pw18 + pw19 ## + pw20 + pw21 + pw22 + pw23 + pw24 + pw25 + pw26 + pw27 + pw28 + ## pw29 + pw30 + pw31 + pw32 + pw33 + pw34 + pw35 + pw36 + pw37 + pw38 ## + pw39 + pw40 + pw41 + pw42 + pw43 + pw44 + pw45 + pw46 + pw47 + ## pw48 + pw49 + pw50` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), fpc (dbl), pw (dbl), pw1 (dbl), pw2 (dbl), ## pw3 (dbl), pw4 (dbl), pw5 (dbl), pw6 (dbl), pw7 (dbl), pw8 (dbl), ## pw9 (dbl), pw10 (dbl), pw11 (dbl), pw12 (dbl), pw13 (dbl), pw14 ## (dbl), pw15 (dbl), pw16 (dbl), pw17 (dbl), pw18 (dbl), pw19 (dbl), ## pw20 (dbl), pw21 (dbl), pw22 (dbl), pw23 (dbl), pw24 (dbl), pw25 ## (dbl), pw26 (dbl), pw27 (dbl), pw28 (dbl), pw29 (dbl), pw30 (dbl), ## pw31 (dbl), pw32 (dbl), pw33 (dbl), pw34 (dbl), pw35 (dbl), pw36 ## (dbl), pw37 (dbl), pw38 (dbl), pw39 (dbl), pw40 (dbl), pw41 (dbl), ## pw42 (dbl), pw43 (dbl), pw44 (dbl), pw45 (dbl), pw46 (dbl), pw47 ## (dbl), pw48 (dbl), pw49 (dbl), pw50 (dbl) ## Variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;fpc&quot; &quot;pw&quot; &quot;pw1&quot; &quot;pw2&quot; &quot;pw3&quot; &quot;pw4&quot; &quot;pw5&quot; ## [10] &quot;pw6&quot; &quot;pw7&quot; &quot;pw8&quot; &quot;pw9&quot; &quot;pw10&quot; &quot;pw11&quot; &quot;pw12&quot; &quot;pw13&quot; &quot;pw14&quot; ## [19] &quot;pw15&quot; &quot;pw16&quot; &quot;pw17&quot; &quot;pw18&quot; &quot;pw19&quot; &quot;pw20&quot; &quot;pw21&quot; &quot;pw22&quot; &quot;pw23&quot; ## [28] &quot;pw24&quot; &quot;pw25&quot; &quot;pw26&quot; &quot;pw27&quot; &quot;pw28&quot; &quot;pw29&quot; &quot;pw30&quot; &quot;pw31&quot; &quot;pw32&quot; ## [37] &quot;pw33&quot; &quot;pw34&quot; &quot;pw35&quot; &quot;pw36&quot; &quot;pw37&quot; &quot;pw38&quot; &quot;pw39&quot; &quot;pw40&quot; &quot;pw41&quot; ## [46] &quot;pw42&quot; &quot;pw43&quot; &quot;pw44&quot; &quot;pw45&quot; &quot;pw46&quot; 
&quot;pw47&quot; &quot;pw48&quot; &quot;pw49&quot; &quot;pw50&quot; As with other replicate design objects, when printing the object or looking at the summary, the replicate weights are provided along with the data variables. 10.5 Exercises For this chapter, the exercises entail reading public documentation to determine how to specify the survey design. While reading the documentation, be on the lookout for description of the weights and the survey design variables or replicate weights. The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description (National Center for Health Statistics 2023). The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). You have imported the data and the variable containing the data is: nhis_adult_data. How would you specify the design using {srvyr} using either as_survey_design() or as_survey_rep()? The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R (Davern et al. 2021). You have imported the data and the variable containing the data is: gss_data. How would you specify the design in R using either as_survey_design() or as_survey_rep()? References Cochran, William G. 1977. Sampling Techniques. John Wiley &amp; Sons. 
Cox, Brenda G, David A Binder, B Nanjamma Chinnappa, Anders Christianson, Michael J Colledge, and Phillip S Kott. 2011. Business Survey Methods. John Wiley &amp; Sons. Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf. Deming, W Edwards. 1991. Sample Design in Business Research. Vol. 23. John Wiley &amp; Sons. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Fuller, Wayne A. 2011. Sampling Statistics. John Wiley &amp; Sons. Levy, Paul S, and Stanley Lemeshow. 2013. Sampling of Populations: Methods and Applications. John Wiley &amp; Sons. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf. Penn State. 2019. “STAT 506: Sampling Theory and Methods [Online Course].” https://online.stat.psu.edu/stat506/. Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 2003. Model Assisted Survey Sampling. Springer Science &amp; Business Media. Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. srvyrexploR: Data Supplement for Exploring Complex Survey Data Analysis in R. U.S. Census Bureau. 2021. “Understanding and Using the American Community Survey Public Use Microdata Sample Files: What Data Users Need to Know.” U.S. Government Printing Office; https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf. U.S. Energy Information Administration. 2017. 
“Residential Energy Consumption Survey (RECS): Using the 2015 microdata file to compute estimates and standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2015/pdf/microdata_v3.pdf. ———. 2023c. “2020 Residential Energy Consumption Survey: Using the microdata file to compute estimates and relative standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2020/pdf/microdata-guide.pdf. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Wolter, Kirk M. 2007. Introduction to Variance Estimation. Vol. 53. Springer. We provide the code here for you to replicate this example, but are not focusing on the creation of the weights as that is outside the scope of this book. We recommend you reference Wolter (2007) for more information on creating bootstrap weights.↩︎ "],["c11-missing-data.html", "Chapter 11 Missing data 11.1 Introduction 11.2 Missing data mechanisms 11.3 Assessing missing data 11.4 Analysis with missing data", " Chapter 11 Missing data Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(naniar) library(haven) library(gt) We will be using data from ANES and RECS. Here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 3 for more information). targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapter 10. 
recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 11.1 Introduction Missing data in surveys refers to situations where participants do not provide complete responses to survey questions. Respondents may not have seen a question by design. Or, they may not respond to a question for various other reasons, such as not wanting to answer a particular question, not understanding the question, or simply forgetting to answer. Missing data is important to consider and account for, as it can introduce bias and reduce the representativeness of the data. This chapter provides an overview of the types of missing data, how to assess missing data in surveys, and how to conduct analysis when missing data is present. Understanding this complex topic can help ensure accurate reporting of survey results and can provide insight into potential changes to the survey design for the future. 11.2 Missing data mechanisms There are two main categories that missing data typically fall into: missing by design or unintentional missing data. Missing by design is part of the survey plan and can be more easily incorporated into weights and analyses. Unintentional missing data on the other hand, can lead to bias in survey estimates if not correctly accounted for. Below we provide more information on the types of missing data. Missing by design/questionnaire skip logic: This type of missingness occurs when certain respondents are intentionally directed to skip specific questions based on their previous responses or characteristics. For example, in a survey about employment, if a respondent indicates that they are not employed, they may be directed to skip questions related to their job responsibilities. Additionally, some surveys randomize questions or modules so that not all participants respond to all questions. 
In these instances, respondents would have missing data for the modules not randomly assigned to them. Unintentional missing data: This type of missingness occurs when researchers do not intend for there to be missing data on a particular question, for example, if respondents did not finish the survey or refused to answer individual questions. There are three main types of unintentional missing data that each should be considered and handled differently (Mack, Su, and Westreich 2018; Schafer and Graham 2002): Missing completely at random (MCAR): The missing data is unrelated to both observed and unobserved data, and the probability of being missing is the same across all cases. For example, a respondent may miss a question because they had to leave the survey early due to an emergency. Missing at random (MAR): The missing data is related to observed data but not unobserved data, and the probability of being missing is the same within groups. For example, older respondents may choose not to answer specific questions that younger respondents do answer, and we know each respondent’s age. Missing not at random (MNAR): The missing data is related to unobserved data, and the probability of being missing varies for reasons we are not measuring. For example, respondents with depression may not answer a question about depression severity. 11.3 Assessing missing data Before beginning analysis, we should explore the data to determine if there is missing data and what types of missing data are present. Conducting this descriptive analysis can help with analysis and reporting of survey data (see Chapter 12), and can inform the survey design in future studies. For example, large amounts of unexpected missing data may indicate the questions were unclear or difficult to recall. There are several ways to explore missing data, which we walk through below. 
When assessing the missing data, we recommend using a data.frame object and not the survey object as most of the analysis is about patterns of records and weights are not necessary. 11.3.1 Summarize data A very rudimentary first exploration is to use the summary() function to summarize the data which will illuminate NA values in the data. Let’s look at a few analytic variables on the ANES 2020 data using summary(): anes_2020 %&gt;% select(V202051:EarlyVote2020) %&gt;% summary() ## V202051 Income7 Income ## Min. :-9.000 $125k or more:1468 Under $9,999 : 647 ## 1st Qu.:-1.000 Under $20k :1076 $50,000-59,999 : 485 ## Median :-1.000 $20k to &lt; 40k:1051 $100,000-109,999: 451 ## Mean :-0.726 $40k to &lt; 60k: 984 $250,000 or more: 405 ## 3rd Qu.:-1.000 $60k to &lt; 80k: 920 $80,000-89,999 : 383 ## Max. : 3.000 (Other) :1437 (Other) :4565 ## NA&#39;s : 517 NA&#39;s : 517 ## V201617x V201616 V201615 V201613 V201611 ## Min. :-9.0 Min. :-3 Min. :-3 Min. :-3 Min. :-3 ## 1st Qu.: 4.0 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 ## Median :11.0 Median :-3 Median :-3 Median :-3 Median :-3 ## Mean :10.4 Mean :-3 Mean :-3 Mean :-3 Mean :-3 ## 3rd Qu.:17.0 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 ## Max. :22.0 Max. :-3 Max. :-3 Max. :-3 Max. :-3 ## ## V201610 V201607 Gender V201600 ## Min. :-3 Min. :-3 Male :3375 Min. :-9.00 ## 1st Qu.:-3 1st Qu.:-3 Female:4027 1st Qu.: 1.00 ## Median :-3 Median :-3 NA&#39;s : 51 Median : 2.00 ## Mean :-3 Mean :-3 Mean : 1.47 ## 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.: 2.00 ## Max. :-3 Max. :-3 Max. : 2.00 ## ## RaceEth V201549x V201547z V201547e ## White :5420 Min. :-9.0 Min. :-3 Min. :-3 ## Black : 650 1st Qu.: 1.0 1st Qu.:-3 1st Qu.:-3 ## Hispanic : 662 Median : 1.0 Median :-3 Median :-3 ## Asian, NH/PI : 248 Mean : 1.5 Mean :-3 Mean :-3 ## AI/AN : 155 3rd Qu.: 2.0 3rd Qu.:-3 3rd Qu.:-3 ## Other/multiple race: 237 Max. : 6.0 Max. :-3 Max. :-3 ## NA&#39;s : 81 ## V201547d V201547c V201547b V201547a V201546 ## Min. :-3 Min. :-3 Min. :-3 Min. :-3 Min. 
:-9.00 ## 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.: 2.00 ## Median :-3 Median :-3 Median :-3 Median :-3 Median : 2.00 ## Mean :-3 Mean :-3 Mean :-3 Mean :-3 Mean : 1.84 ## 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.: 2.00 ## Max. :-3 Max. :-3 Max. :-3 Max. :-3 Max. : 2.00 ## ## Education V201510 AgeGroup Age ## Less than HS: 312 Min. :-9.00 18-29 : 871 Min. :18.0 ## High school :1160 1st Qu.: 3.00 30-39 :1241 1st Qu.:37.0 ## Post HS :2514 Median : 5.00 40-49 :1081 Median :53.0 ## Bachelor&#39;s :1877 Mean : 5.62 50-59 :1200 Mean :51.8 ## Graduate :1474 3rd Qu.: 6.00 60-69 :1436 3rd Qu.:66.0 ## NA&#39;s : 116 Max. :95.00 70 or older:1330 Max. :80.0 ## NA&#39;s : 294 NA&#39;s :294 ## V201507x TrustPeople V201237 ## Min. :-9.0 Always : 48 Min. :-9.00 ## 1st Qu.:35.0 Most of the time :3511 1st Qu.: 2.00 ## Median :51.0 About half the time:2020 Median : 3.00 ## Mean :49.4 Some of the time :1597 Mean : 2.78 ## 3rd Qu.:66.0 Never : 264 3rd Qu.: 3.00 ## Max. :80.0 NA&#39;s : 13 Max. : 5.00 ## ## TrustGovernment V201233 ## Always : 80 Min. :-9.00 ## Most of the time :1016 1st Qu.: 3.00 ## About half the time:2313 Median : 4.00 ## Some of the time :3313 Mean : 3.43 ## Never : 702 3rd Qu.: 4.00 ## NA&#39;s : 29 Max. : 5.00 ## ## PartyID V201231x V201230 ## Strong democrat :1796 Min. :-9.00 Min. :-9.000 ## Strong republican :1545 1st Qu.: 2.00 1st Qu.:-1.000 ## Independent-democrat : 881 Median : 4.00 Median :-1.000 ## Independent : 876 Mean : 3.83 Mean : 0.013 ## Not very strong democrat: 790 3rd Qu.: 6.00 3rd Qu.: 1.000 ## (Other) :1540 Max. : 7.00 Max. : 3.000 ## NA&#39;s : 25 ## V201229 V201228 VotedPres2016_selection ## Min. :-9.000 Min. :-9.00 Clinton:2911 ## 1st Qu.:-1.000 1st Qu.: 1.00 Trump :2466 ## Median : 1.000 Median : 2.00 Other : 390 ## Mean : 0.515 Mean : 1.99 NA&#39;s :1686 ## 3rd Qu.: 1.000 3rd Qu.: 3.00 ## Max. : 2.000 Max. : 5.00 ## ## V201103 VotedPres2016 V201102 V201101 ## Min. :-9.00 Yes :5810 Min. :-9.000 Min. 
:-9.000 ## 1st Qu.: 1.00 No :1622 1st Qu.:-1.000 1st Qu.:-1.000 ## Median : 1.00 NA&#39;s: 21 Median : 1.000 Median :-1.000 ## Mean : 1.04 Mean : 0.105 Mean : 0.085 ## 3rd Qu.: 2.00 3rd Qu.: 1.000 3rd Qu.: 1.000 ## Max. : 5.00 Max. : 2.000 Max. : 2.000 ## ## V201029 V201028 V201025x V201024 ## Min. :-9.000 Min. :-9.0 Min. :-4.00 Min. :-9.00 ## 1st Qu.:-1.000 1st Qu.:-1.0 1st Qu.: 3.00 1st Qu.:-1.00 ## Median :-1.000 Median :-1.0 Median : 3.00 Median :-1.00 ## Mean :-0.897 Mean :-0.9 Mean : 2.92 Mean :-0.86 ## 3rd Qu.:-1.000 3rd Qu.:-1.0 3rd Qu.: 3.00 3rd Qu.:-1.00 ## Max. :12.000 Max. : 2.0 Max. : 4.00 Max. : 4.00 ## ## EarlyVote2020 ## Yes : 375 ## No : 115 ## NA&#39;s:6963 ## ## ## ## We see that there are NA values in several of the derived variables (those not beginning with “V”) and negative values in the original variables (those beginning with “V”). We can also use the count() function to get an understanding of the different types of missing data on the original variables. For example, let’s look at the count of data for V202072, which corresponds to our VotedPres2020 variable. anes_2020 %&gt;% count(VotedPres2020,V202072) ## # A tibble: 7 × 3 ## VotedPres2020 V202072 n ## &lt;fct&gt; &lt;dbl+lbl&gt; &lt;int&gt; ## 1 Yes -1 [-1. Inapplicable] 361 ## 2 Yes 1 [1. Yes, voted for President] 5952 ## 3 No -1 [-1. Inapplicable] 10 ## 4 No 2 [2. No, didn&#39;t vote for President] 77 ## 5 &lt;NA&gt; -9 [-9. Refused] 2 ## 6 &lt;NA&gt; -6 [-6. No post-election interview] 4 ## 7 &lt;NA&gt; -1 [-1. Inapplicable] 1047 Here we can see that there are three types of missing data, and that the majority of them fall under the “Inapplicable” category. This is usually a term associated with data missing due to skip patterns and is considered to be missing data by design. Based on the documentation from ANES (DeBell 2010), we can see that this question was only asked to respondents who voted in the election. 
11.3.2 Visualization of missing data It can be challenging to look at tables for every variable, and it may instead be more efficient to view missing data in a graphical format to help narrow in on patterns or unique variables. The {naniar} package is very useful in exploring missing data visually. It provides quick graphics to explore the missingness patterns in the data. We can use the vis_miss() function available in both {visdat} and {naniar} packages to view the amount of missing data by variable (Tierney 2017; Tierney and Cook 2023). anes_2020_derived &lt;- anes_2020 %&gt;% select(!starts_with(&quot;V2&quot;), -CaseID, -InterviewMode, -Weight, -Stratum, -VarUnit) anes_2020_derived %&gt;% vis_miss(cluster = TRUE, show_perc = FALSE) + scale_fill_manual(values = book_colors[c(3,1)], labels = c(&quot;Present&quot;,&quot;Missing&quot;), name = &quot;&quot;) FIGURE 11.1: Visual depiction of missing data in the ANES 2020 data From this visualization, we can start to get a picture of what questions may be related to each other in terms of missing data. Even if we did not have the informative variable names, we would be able to deduce that VotedPres2020, VotedPres2020_selection, and EarlyVote2020 are likely related since their missing data patterns are similar. We can also look at VotedPres2016_selection and see that there is a lot of missing data in that variable. Most likely this is due to a skip pattern, and we can look at further graphics to see how it might be related to other variables. The {naniar} package has multiple visualization functions that can help dive deeper, such as the gg_miss_fct() function, which looks at missing data for all variables by levels of another variable. anes_2020_derived %&gt;% gg_miss_fct(VotedPres2016) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;% Miss&quot;, colors = book_colors[c(3, 2, 1)] ) + ylab(&quot;Variable&quot;) + xlab(&quot;Voted for President in 2016&quot;) ## Scale for fill is already present. 
## Adding another scale for fill, which will replace the existing scale. FIGURE 11.2: Missingness in variables for each level of VotedPres2016 in the ANES 2020 data In this case, we can see that if they did not vote for president in 2016 or did not answer that question, then they were not asked about who they voted for in 2016 (the percentage of missing data is 100%). Additionally, we can see with this graphic that there is more missing data across all questions if they did not provide an answer to VotedPres2016. There are other graphics that work well with numeric data. For example, in the RECS 2020 data we can plot two continuous variables and the missing data associated with them to see if there are any patterns to the missingness. To do this, we can use the bind_shadow() function from the {naniar} package. This creates a nabular (combination of “na” with “tabular”), which features the original columns followed by the same number of columns with a specific NA format. These NA columns indicate whether the value in the original data is missing or not. The example printed below shows that records with a value for HeatingBehavior are marked !NA in the shadow variable HeatingBehavior_NA, while records missing HeatingBehavior are marked NA in HeatingBehavior_NA. recs_2020_shadow &lt;- recs_2020 %&gt;% bind_shadow() ncol(recs_2020) ## [1] 118 ncol(recs_2020_shadow) ## [1] 236 recs_2020_shadow %&gt;% count(HeatingBehavior,HeatingBehavior_NA) ## # A tibble: 7 × 3 ## HeatingBehavior HeatingBehavior_NA n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Set one temp and leave it !NA 7806 ## 2 Manually adjust at night/no one home !NA 4654 ## 3 Programmable or smart thermostat automatical… !NA 3310 ## 4 Turn on or off as needed !NA 1491 ## 5 No control !NA 438 ## 6 Other !NA 46 ## 7 &lt;NA&gt; NA 751 We can then use these new variables to plot the missing data alongside the actual data. 
For example, let’s plot a histogram of the total energy cost grouped by those that are missing and not missing by heating behavior. recs_2020_shadow %&gt;% filter(TOTALDOL &lt; 5000) %&gt;% ggplot(aes(x=TOTALDOL,fill=HeatingBehavior_NA)) + geom_histogram() + scale_fill_manual(values = book_colors[c(3, 1)], labels = c(&quot;Present&quot;, &quot;Missing&quot;), name = &quot;Heating Behavior&quot;) + theme_minimal() + xlab(&quot;Total Energy Cost (Truncated at $5000)&quot;) + ylab(&quot;Number of Households&quot;) + labs(title = &quot;Histogram of Energy Cost by Heating Behavior Missing Data&quot;) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. FIGURE 11.3: Histogram of Energy Cost by Heating Behavior Missing Data This plot indicates that respondents who did not provide a response for the heating behavior question may have a different distribution of total energy cost compared to respondents who did provide a response. This view of the raw data and missingness could indicate some bias in the data. Researchers take these different bias aspects into account when calculating weights, and we need to make sure that the weights are incorporated when analyzing the data. There are many other visualizations that can be helpful in reviewing the data, and we recommend reviewing the {naniar} documentation for more information (Tierney and Cook 2023). 11.4 Analysis with missing data Once we understand the types of missingness, we can begin the analysis of the data. Different missingness types may be handled in different ways. In most publicly available datasets, researchers will have already calculated weights and imputed missing values if deemed necessary. For those interested in learning more about how to calculate weights and impute data for different missing data mechanisms, we recommend Kim and Shao (2021) and Valliant and Dever (2018). 
Even with weights and imputation, missing data will still most likely exist in the data and need to be accounted for in analysis. This section provides an overview on how to recode missing data in R, and how to account for skip patterns in analysis. 11.4.1 Recoding missing data Even within a variable, there can be different reasons for missing data. In publicly released data negative values are often present to provide different meaning for values. For example, in the ANES 2020 data they have the following negative values to represent different types of missing data: * -9: Refused * -8: Don’t Know * -7: No post-election data, deleted due to incomplete interview * -6: No post-election interview * -5: Interview breakoff (sufficient partial IW) * -4: Technical error * -3: Restricted * -2: Other missing reason (question specific) * -1: Inapplicable When we created the derived variables for use in this book, we coded all negative values as NA and proceeded to analyze the data. For most cases this is an appropriate approach as long as you filter the data appropriately to account for skip patterns (see next section). However, the {naniar} package does have the option to code special missing values. For example, if we wanted to have two NA values, one that indicated the question was missing by design (e.g., due to skip patterns) and one for the other missing categories we can use the nabular format to incorporate these with the recode_shadow() function. anes_2020_shadow&lt;-anes_2020 %&gt;% select(starts_with(&quot;V2&quot;)) %&gt;% mutate(across(everything(),~case_when(.x &lt; -1 ~ NA, TRUE~.x))) %&gt;% bind_shadow() %&gt;% recode_shadow(V201103 = .where(V201103==-1~&quot;skip&quot;)) anes_2020_shadow %&gt;% count(V201103,V201103_NA) ## # A tibble: 5 × 3 ## V201103 V201103_NA n ## &lt;dbl+lbl&gt; &lt;fct&gt; &lt;int&gt; ## 1 -1 [-1. Inapplicable] NA_skip 1643 ## 2 1 [1. Hillary Clinton] !NA 2911 ## 3 2 [2. Donald Trump] !NA 2466 ## 4 5 [5. 
Other {SPECIFY}] !NA 390 ## 5 NA NA 43 However, it is important to note that at the time of publication, there is no easy way to apply recode_shadow() to multiple variables at once (e.g., we cannot use the tidyverse feature of across()). The example code above only implements this for a single variable, so this would have to be done for all variables of interest manually or in a loop. 11.4.2 Accounting for skip patterns When questions are skipped by design in a survey, it is meaningful that the data is later missing. For example, the RECS survey asks people how they control the heat in their home in the winter (HeatingBehavior). This question is asked only of those who have heat in their home (SpaceHeatingUsed). If there is no heating equipment used, the value of HeatingBehavior is missing. We have several choices when analyzing this data: 1) only including those with a valid value of HeatingBehavior and specifying the universe as those with heat or 2) including those who do not have heat. It is important to specify what population an analysis generalizes to. Here is example code where we only include those with a valid value of HeatingBehavior (choice 1). Note that we use the design object (recs_des) and then filter to those that are not missing on HeatingBehavior. heat_cntl_1 &lt;- recs_des %&gt;% filter(!is.na(HeatingBehavior)) %&gt;% group_by(HeatingBehavior) %&gt;% summarize( p=survey_prop() ) heat_cntl_1 ## # A tibble: 6 × 3 ## HeatingBehavior p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Set one temp and leave it 0.430 4.69e-3 ## 2 Manually adjust at night/no one home 0.264 4.54e-3 ## 3 Programmable or smart thermostat automatically adjust… 0.168 3.12e-3 ## 4 Turn on or off as needed 0.102 2.89e-3 ## 5 No control 0.0333 1.70e-3 ## 6 Other 0.00208 3.59e-4 Here is example code where we include those that do not have heat (choice 2). 
To help understand what we are looking at we have included the output to show both variables SpaceHeatingUsed and HeatingBehavior. heat_cntl_2 &lt;- recs_des %&gt;% group_by(interact(SpaceHeatingUsed, HeatingBehavior)) %&gt;% summarize( p=survey_prop() ) heat_cntl_2 ## # A tibble: 7 × 4 ## SpaceHeatingUsed HeatingBehavior p p_se ## &lt;lgl&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE &lt;NA&gt; 0.0469 2.07e-3 ## 2 TRUE Set one temp and leave it 0.410 4.60e-3 ## 3 TRUE Manually adjust at night/no one home 0.251 4.36e-3 ## 4 TRUE Programmable or smart thermostat aut… 0.160 2.95e-3 ## 5 TRUE Turn on or off as needed 0.0976 2.79e-3 ## 6 TRUE No control 0.0317 1.62e-3 ## 7 TRUE Other 0.00198 3.41e-4 If we ran the first analysis, we would say that 16.8% of households with heat use a programmable or smart thermostat for the heating of their home. While if we used the results from the second analysis, we could say that 16% of households use a programmable or smart thermostat for the heating of their home. The distinction of the two statements is bolded for emphasis. Skip patterns often change the universe that we are talking about and need to be carefully examined. Filtering to the correct universe is important when handling these types of missing data. The nabular we created above can also help with this. If we have NA_skip values in the shadow, we can make sure that we filter out all of these values and only include relevant missing. To do this with survey data we could first create the nabular, then create the design object on that data, and then use the shadow variables to assist with filtering the data. Let’s use the nabular we created above for ANES 2020 (anes_2020_shadow) to create the design object. 
anes_adjwgt_shadow &lt;- anes_2020_shadow %&gt;% mutate(V200010b = V200010b/sum(V200010b)*targetpop) anes_des_shadow &lt;- anes_adjwgt_shadow %&gt;% as_survey_design( weights = V200010b, strata = V200010d, ids = V200010c, nest = TRUE ) Then we can use this design object to look at the percent of the population that voted for each candidate in 2016 (V201103). First, let’s look at the percentages without removing any cases: pres16_select1&lt;-anes_des_shadow %&gt;% group_by(V201103) %&gt;% summarize( All_Missing=survey_prop() ) pres16_select1 ## # A tibble: 5 × 3 ## V201103 All_Missing All_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 -1 [-1. Inapplicable] 0.324 0.00933 ## 2 1 [1. Hillary Clinton] 0.330 0.00728 ## 3 2 [2. Donald Trump] 0.299 0.00728 ## 4 5 [5. Other {SPECIFY}] 0.0409 0.00230 ## 5 NA 0.00627 0.00121 Next, we will look at the percentages removing only those that were missing due to skip patterns (i.e., they did not receive this question). pres16_select2&lt;-anes_des_shadow %&gt;% filter(V201103_NA!=&quot;NA_skip&quot;) %&gt;% group_by(V201103) %&gt;% summarize( No_Skip_Missing=survey_prop() ) pres16_select2 ## # A tibble: 4 × 3 ## V201103 No_Skip_Missing No_Skip_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 [1. Hillary Clinton] 0.488 0.00870 ## 2 2 [2. Donald Trump] 0.443 0.00856 ## 3 5 [5. Other {SPECIFY}] 0.0606 0.00330 ## 4 NA 0.00928 0.00178 Finally, we will look at the percentages removing all missing values both due to skip patterns and due to those who refused to answer the question. pres16_select3&lt;-anes_des_shadow %&gt;% filter(V201103_NA==&quot;!NA&quot;) %&gt;% group_by(V201103) %&gt;% summarize( No_Missing=survey_prop() ) pres16_select3 ## # A tibble: 3 × 3 ## V201103 No_Missing No_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 [1. Hillary Clinton] 0.492 0.00875 ## 2 2 [2. Donald Trump] 0.447 0.00861 ## 3 5 [5. 
Other {SPECIFY}] 0.0611 0.00332 
TABLE 11.1: Percentage of Votes by Candidate for Different Missing Data Inclusions Candidate Including All Missing Data Removing Skip Patterns Only Removing All Missing Data % s.e. (%) % s.e. (%) % s.e. (%) Did not Vote for President in 2016 32.4% 0.9% NA NA NA NA Hillary Clinton 33.0% 0.7% 48.8% 0.9% 49.2% 0.9% Donald Trump 29.9% 0.7% 44.3% 0.9% 44.7% 0.9% Other Candidate 4.1% 0.2% 6.1% 0.3% 6.1% 0.3% Missing 0.6% 0.1% 0.9% 0.2% NA NA As Table 11.1 shows, the results can vary greatly depending on which types of missing data are removed. If we remove only the skip patterns, the margin between Clinton and Trump is 4.5 percentage points, but if we include all data, even those who did not vote in 2016, the margin is 3.1 percentage points. How we handle the different types of missing values is important for interpretation of the data. References DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Kim, Jae Kwang, and Jun Shao. 2021. Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Mack, Christina, Zhaohui Su, and Daniel Westreich. 2018. “Types of Missing Data.” In Managing Missing Data in Patient Registries: Addendum to Registries for Evaluating Patient Outcomes: A User’s Guide, Third Edition [Internet]. 
Rockville (MD): Agency for Healthcare Research and Quality (US); https://www.ncbi.nlm.nih.gov/books/NBK493614/. Schafer, Joseph L, and John W Graham. 2002. “Missing Data: Our View of the State of the Art.” Psychological Methods 7: 147–77. https://doi.org/10.1037//1082-989X.7.2.147. Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” JOSS 2 (16): 355. https://doi.org/10.21105/joss.00355. Tierney, Nicholas, and Dianne Cook. 2023. “Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.” Journal of Statistical Software 105 (7): 1–31. https://doi.org/10.18637/jss.v105.i07. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. "],["c12-recommendations.html", "Chapter 12 Successful survey analysis recommendations 12.1 Introduction 12.2 Follow survey analysis process 12.3 Begin with descriptive analysis 12.4 Check variable types 12.5 Improve debugging skills 12.6 Think critically about conclusions", " Chapter 12 Successful survey analysis recommendations Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) To illustrate the importance of data visualization, we will discuss Anscombe’s Quartet. The dataset can be replicated by running the code below: anscombe_tidy &lt;- anscombe %&gt;% mutate(observation = row_number()) %&gt;% pivot_longer(-observation, names_to = &quot;key&quot;, values_to = &quot;value&quot;) %&gt;% separate(key, c(&quot;variable&quot;, &quot;set&quot;), 1, convert = TRUE) %&gt;% mutate(set = c(&quot;I&quot;, &quot;II&quot;, &quot;III&quot;, &quot;IV&quot;)[set]) %&gt;% pivot_wider(names_from = variable, values_from = value) We create an example survey dataset to explain potential pitfalls and how to overcome them in survey analysis. 
To recreate the dataset, run the code below: example_srvy &lt;- tribble( ~id, ~region, ~q_d1, ~q_d2_1, ~gender, ~weight, 1L, 1L, 1L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1740, 2L, 1L, 1L, &quot;Not at all interested&quot;, &quot;female&quot;, 1428, 3L, 2L, NA, &quot;Somewhat interested&quot;, &quot;female&quot;, 496, 4L, 2L, 1L, &quot;Not at all interested&quot;, &quot;female&quot;, 550, 5L, 3L, 1L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1762, 6L, 4L, NA, &quot;Very interested&quot;, &quot;female&quot;, 1004, 7L, 4L, NA, &quot;Somewhat interested&quot;, &quot;female&quot;, 522, 8L, 3L, 2L, &quot;Not at all interested&quot;, &quot;female&quot;, 1099, 9L, 4L, 2L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1295, 10L, 2L, 2L, &quot;Somewhat interested&quot;, &quot;male&quot;, 983 ) example_des &lt;- example_srvy %&gt;% as_survey_design(weights = weight) 12.1 Introduction The previous chapters in this book aimed to provide the technical skills and knowledge required for running survey analyses. This chapter builds upon the previously mentioned best practices to present a curated set of recommendations for running a successful survey analysis. We hope this list equips you with practical insights that assist in producing meaningful and reliable results. 12.2 Follow survey analysis process As we first introduced in Chapter 4 (Section 4.3), there are four main steps to successfully analyze survey data: Create a tbl_svy object (a survey object) using: as_survey_design() or as_survey_rep() Subset data (if needed) using filter() (to create subpopulations) Specify domains of analysis using group_by() Within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more The order of these steps matters in survey analysis. For example, if we need to subset the data, we must use filter() on our data after creating the survey design. 
If we do this before the survey design is created, we may not be correctly accounting for the study design, resulting in incorrect findings. Additionally, correctly identifying the survey design is one of the most important steps in survey analysis. Knowing the type of sample design (e.g., clustered, stratified) will help ensure the underlying error structure is correctly calculated and weights are correctly used. Reviewing the documentation (see Chapter 3) will help us understand what variables to use from the data. Learning about complex design factors such as clustering, stratification, and weighting is foundational to complex survey analysis, and we recommend that all analysts review Chapter 10 before creating their first design object. Making sure to use the survey analysis functions from the {srvyr} and {survey} packages is also important in survey analysis. For example, using mean() and survey_mean() on the same data will result in different findings and outputs. Each of the survey functions from {srvyr} and {survey} impacts standard errors and variance, and we cannot treat complex surveys as unweighted simple random samples if we want to produce unbiased estimates (Freedman Ellis and Schneider 2023; Lumley 2010). 12.3 Begin with descriptive analysis When receiving a fresh batch of data, it’s tempting to jump right into running models to find significant results. However, a successful data analyst begins by exploring the dataset. This involves running descriptive analysis on the dataset as a whole, as well as individual variables and combinations of variables. As described in Chapter 5, descriptive analyses should always precede statistical analysis to prevent avoidable (and potentially embarrassing) mistakes. 12.3.1 Table review Even before applying weights, consider running cross-tabulations on the raw data. Crosstabs can help us see if any patterns stand out that may be alarming or something worth further investigating. 
For example, let’s explore the example survey dataset introduced in the Prerequisites box, example_srvy. We run the code below on the unweighted data to inspect the gender variable: example_srvy %&gt;% group_by(gender) %&gt;% summarise(n = n()) ## # A tibble: 2 × 2 ## gender n ## &lt;chr&gt; &lt;int&gt; ## 1 female 9 ## 2 male 1 The data shows that males comprise 1 out of 10, or 10%, of the sample. Generally, we assume something close to a 50/50 split between male and female respondents in a population. The sizeable female proportion could indicate either a unique sample or a potential error in the data. If we review the survey documentation and see this was a deliberate part of the design, we can continue our analysis using the appropriate methods. If this was not an intentional choice by the researchers, the results alert us that something may be incorrect in the data or our code, and we can verify if there’s an issue by comparing the results with the weighted means. 12.3.2 Graphical review Tables provide a quick check of our assumptions, but there is no substitute for graphs and plots to visualize the distribution of data. We might miss outliers or nuances if we scan only summary statistics. For example, Anscombe’s Quartet demonstrates the importance of visualization in analysis. Let’s say we have a dataset with x and y variables in an object called anscombe_tidy. Let’s take a look at how the dataset is structured: head(anscombe_tidy) ## # A tibble: 6 × 4 ## observation set x y ## &lt;int&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 I 10 8.04 ## 2 1 II 10 9.14 ## 3 1 III 10 7.46 ## 4 1 IV 8 6.58 ## 5 2 I 8 6.95 ## 6 2 II 8 8.14 We can begin by checking one set of variables. For Set I, the x variable has an average of 9 with a standard deviation of 3.3; for y, we have an average of 7.5 with a standard deviation of 2.03. The two variables have a correlation of 0.82. 
anscombe_tidy %&gt;% filter(set == &quot;I&quot;) %&gt;% summarize( x_mean = mean(x), x_sd = sd(x), y_mean = mean(y), y_sd = sd(y), correlation = cor(x, y) ) ## # A tibble: 1 × 5 ## x_mean x_sd y_mean y_sd correlation ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 9 3.32 7.50 2.03 0.816 These are useful statistics. We can note that the data doesn’t have high variability, and the two variables are strongly correlated. Now, let’s check all the sets (I-IV) in the Anscombe data. Notice anything interesting? anscombe_tidy %&gt;% group_by(set) %&gt;% summarize( x_mean = mean(x), x_sd = sd(x, na.rm = TRUE), y_mean = mean(y), y_sd = sd(y, na.rm = TRUE), correlation = cor(x, y) ) ## # A tibble: 4 × 6 ## set x_mean x_sd y_mean y_sd correlation ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 I 9 3.32 7.50 2.03 0.816 ## 2 II 9 3.32 7.50 2.03 0.816 ## 3 III 9 3.32 7.5 2.03 0.816 ## 4 IV 9 3.32 7.50 2.03 0.817 The summary results for these four sets are nearly identical! Based on this, we might assume that each distribution is similar. Let’s look at a data visualization to see if our assumption is correct. ggplot(anscombe_tidy, aes(x, y)) + geom_point() + facet_wrap( ~ set) + geom_smooth(method = &quot;lm&quot;, se = FALSE, alpha = 0.5) + theme_minimal() ## `geom_smooth()` using formula = &#39;y ~ x&#39; Although each of the four sets has the same summary statistics and regression line, when reviewing the plots, it becomes apparent that the distributions of the data are not the same at all. Each set of points results in different shapes and distributions. Imagine sharing each set (I-IV) and the corresponding plot with a different colleague. The interpretations and descriptions of the data would be very different even though the statistics are similar. Plotting data can also ensure that we are using the correct analysis method on the data, so understanding the underlying distributions is an important first step. 
With survey data, we may not always have continuous data that we can plot like Anscombe’s Quartet. However, if the dataset does contain continuous data or other types of data that would benefit from a visual representation, we recommend taking the time to graph distributions and correlations. 12.4 Check variable types When we pull the data from surveys into R, the data may be listed as character, factor, numeric, or logical/Boolean. The tidyverse functions that read in data (e.g., read_csv(), read_excel()) default to have all strings load as character variables. This is important when dealing with survey data, as many strings may be better suited for factors than character variables. For example, let’s revisit the example_srvy data. Taking a glimpse() of the data gives us insight into what it contains: example_srvy %&gt;% glimpse() ## Rows: 10 ## Columns: 6 ## $ id &lt;int&gt; 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ## $ region &lt;int&gt; 1, 1, 2, 2, 3, 4, 4, 3, 4, 2 ## $ q_d1 &lt;int&gt; 1, 1, NA, 1, 1, NA, NA, 2, 2, 2 ## $ q_d2_1 &lt;chr&gt; &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;, &quot;Somewh… ## $ gender &lt;chr&gt; &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;fema… ## $ weight &lt;dbl&gt; 1740, 1428, 496, 550, 1762, 1004, 522, 1099, 1295, 983 The output shows that q_d2_1 is a character variable, but the values of that variable show three options (Very interested / Somewhat interested / Not at all interested). In this case, we will most likely want to change q_d2_1 to be a factor variable and order the factor levels to indicate that this is an ordinal variable. 
Here is some code on how we might approach this task using the {forcats} package (Wickham 2023a): example_srvy_fct &lt;- example_srvy %&gt;% mutate(q_d2_1_fct = factor( q_d2_1, levels = c(&quot;Very interested&quot;, &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;) )) example_srvy_fct %&gt;% glimpse() ## Rows: 10 ## Columns: 7 ## $ id &lt;int&gt; 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ## $ region &lt;int&gt; 1, 1, 2, 2, 3, 4, 4, 3, 4, 2 ## $ q_d1 &lt;int&gt; 1, 1, NA, 1, 1, NA, NA, 2, 2, 2 ## $ q_d2_1 &lt;chr&gt; &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;, &quot;So… ## $ gender &lt;chr&gt; &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;… ## $ weight &lt;dbl&gt; 1740, 1428, 496, 550, 1762, 1004, 522, 1099, 1295, … ## $ q_d2_1_fct &lt;fct&gt; Somewhat interested, Not at all interested, Somewha… example_srvy_fct %&gt;% count(q_d2_1_fct, q_d2_1) ## # A tibble: 3 × 3 ## q_d2_1_fct q_d2_1 n ## &lt;fct&gt; &lt;chr&gt; &lt;int&gt; ## 1 Very interested Very interested 1 ## 2 Somewhat interested Somewhat interested 6 ## 3 Not at all interested Not at all interested 3 This example data also includes a column called region, which is imported as a number (&lt;int&gt;). This is a good hint to use the questionnaire and codebook along with the data to find out if the values actually reflect a number or are perhaps a coded categorical variable (see Chapter 3 for more details). R will calculate the mean even if it is not appropriate, leading to the common mistake of applying an average to categorical values instead of a proportion function. 
For example, for ease of coding, we may use the across() function to calculate the mean across all numeric variables: example_des %&gt;% select(-weight) %&gt;% summarize(across(where(is.numeric), ~ survey_mean(.x, na.rm = TRUE))) ## # A tibble: 1 × 6 ## id id_se region region_se q_d1 q_d1_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 5.24 1.12 2.49 0.428 1.38 0.196 In this example, if we do not adjust region to be a factor variable type, we might accidentally report an average region of 2.49 in our findings, which is meaningless. Checking that our variables are appropriate will avoid this pitfall and ensure the measures and models are suitable for the variable type. 12.5 Improve debugging skills It is common for analysts working in R to come across warning or error messages, and learning how to debug these messages (i.e., find and fix issues) ensures we can proceed with our work and avoid potential mistakes. We’ve discussed a few examples in this book. For example, if we calculate an average with survey_mean() and get NA instead of a number, it may be because our column has missing values. example_des %&gt;% summarize(mean = survey_mean(q_d1)) ## # A tibble: 1 × 2 ## mean mean_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 NA NaN Including na.rm = TRUE resolves the issue: example_des %&gt;% summarize(mean = survey_mean(q_d1, na.rm = TRUE)) ## # A tibble: 1 × 2 ## mean mean_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1.38 0.196 Another common error message in survey analysis looks something like the following: example_des %&gt;% svyttest(q_d1~gender) ## Error in UseMethod(&quot;svymean&quot;, design): no applicable method for &#39;svymean&#39; applied to an object of class &quot;formula&quot; In this case, we need to remember that with functions from the {survey} package like svyttest(), the design object is not the first argument, and we have to use the dot (.) notation (see Chapter 6). 
Adding in the named argument of design=. will fix this error. example_des %&gt;% svyttest(q_d1 ~ gender, design = .) ## ## Design-based t-test ## ## data: q_d1 ~ gender ## t = 3.5, df = 5, p-value = 0.02 ## alternative hypothesis: true difference in mean is not equal to 0 ## 95 percent confidence interval: ## 0.1878 1.2041 ## sample estimates: ## difference in mean ## 0.696 Often, debugging involves interpreting the message from R. For example, if our code results in this error: Error in `contrasts&lt;-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels We can see that the error has to do with a function requiring a factor with two or more levels and that it has been applied to something else. This ties back to our section on using appropriate variable types. We can check the variable of interest to examine whether it’s the correct type. The internet also offers many resources for debugging. Searching for a specific error message can often lead to a solution. In addition, we can post on community forums like Posit Community for direct help from others. 12.6 Think critically about conclusions Once we have our findings, we need to think critically about them. As mentioned in Chapter 2, many aspects of the study design can impact our interpretation of the results, for example, the number and types of response options provided to the respondent or who was asked the question (both thinking about the full sample and any skip patterns). Knowing the overall study design can help us accurately think through what the findings may mean and identify any issues with our analyses. Additionally, we should make sure that our survey design object is correctly defined (see Chapter 10), carefully consider how we are managing missing data (see Chapter 11), and follow statistical analysis procedures such as avoiding the model overfitting that results from using too many variables in our formulas. 
These considerations allow us to conduct our analyses and review findings for statistically significant results. It’s important to note that statistically significant results are not necessarily meaningful or important. With a large enough sample, even very small differences can be statistically significant. Therefore, we want to look at our results in context, such as comparing them with results from other studies or analyzing them in conjunction with confidence intervals and other measures. Communicating the results (see Chapter 8) in an unbiased manner is also a critical step in any analysis project. If we present results without error measures or only present results that support our initial hypotheses, we are not thinking critically and may incorrectly represent the data. As survey data analysts, we often interpret the survey data for the public. We must ensure that we are the best stewards of the data and work to bring light to meaningful and interesting findings that the public will want and need to know about. References Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Wickham, Hadley. 2023a. forcats: Tools for Working with Categorical Variables (Factors). "],["c13-ncvs-vignette.html", "Chapter 13 National Crime Victimization Survey Vignette 13.1 Introduction 13.2 Data structure 13.3 Survey notation 13.4 Data file preparation 13.5 Survey design objects 13.6 Calculating estimates 13.7 Statistical testing 13.8 Exercises", " Chapter 13 National Crime Victimization Survey Vignette Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(gt) We will use data from the United States National Crime Victimization Survey (NCVS). These data are available in the {srvyrexploR} package as ncvs_2021_incident, ncvs_2021_household, and ncvs_2021_person. 
13.1 Introduction The NCVS is a household survey sponsored by the Bureau of Justice Statistics (BJS), which collects data on criminal victimization, including characteristics of the crimes, offenders, and victims. Crime types include both household and personal crimes, as well as violent and non-violent crimes. The target population of this survey is all people in the United States age 12 and older living in housing units and noninstitutional group quarters. The NCVS has been ongoing since 1992. An earlier survey, the National Crime Survey, was run from 1972 to 1991 (Bureau of Justice Statistics 2017). The survey is administered using a rotating panel. When an address enters the sample, the residents of that address are interviewed every six months for a total of seven interviews. If the initial residents move away from the address during the period, the new residents are included in the survey, as people are not followed when they move. NCVS data is publicly available and distributed by the Inter-university Consortium for Political and Social Research (ICPSR), with data going back to 1992. The vignette in this book will include data from 2021 (United States. Bureau of Justice Statistics 2022). The NCVS data structure is complicated, and the User’s Guide contains examples for analysis in SAS, SUDAAN, SPSS, and Stata, but not R (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). This vignette will adapt those examples for R. 13.2 Data structure The data from ICPSR is distributed with five files, each listed below with its unique identifiers: Address Record - YEARQ, IDHH Household Record - YEARQ, IDHH Person Record - YEARQ, IDHH, IDPER Incident Record - YEARQ, IDHH, IDPER 2021 Collection Year Incident - YEARQ, IDHH, IDPER We will focus on the household, person, and incident files. From these files, we selected a subset of columns for examples to use in this vignette. 
We have included data in the {srvyrexploR} package with a subset of columns, but you can download the complete files at ICPSR (United States. Bureau of Justice Statistics 2022). 13.3 Survey notation The NCVS User Guide (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015) uses the following notation: \\(i\\) represents NCVS households, identified on the household-level file with the household identification number IDHH. \\(j\\) represents NCVS individual respondents within households \\(i\\), identified on the person-level file with the person identification number IDPER. \\(k\\) represents reporting periods (i.e., YEARQ) for households \\(i\\) and individual respondent \\(j\\). \\(l\\) represents victimization records for respondent \\(j\\) in household \\(i\\) and reporting period \\(k\\). Each record on the NCVS incident-level file is associated with a victimization record \\(l\\). \\(D\\) represents one or more domain characteristics of interest in the calculation of NCVS estimates. For victimization totals and proportions, domains can be defined on the basis of crime types (e.g., violent crimes, property crimes), characteristics of victims (e.g., age, sex, household income), or characteristics of the victimizations (e.g., victimizations reported to police, victimizations committed with a weapon present). Domains could also be a combination of all of these types of characteristics. For example, in the calculation of victimization rates, domains are defined on the basis of the characteristics of the victims. \\(A_a\\) represents the level \\(a\\) of covariate \\(A\\). Covariate \\(A\\) is defined in the calculation of victimization proportions and represents the characteristic for which the analyst wants to obtain the distribution of victimizations in domain \\(D\\). \\(C\\) represents the personal or property crime for which we want to obtain a victimization rate. 
In this vignette, we will discuss four estimates: Victimization totals estimate the number of criminal victimizations with a given characteristic. As demonstrated below, these can be calculated from any of the data files. The estimated victimization total, \\(\\hat{t}_D\\), for domain \\(D\\) is estimated as \\[ \\hat{t}_D = \\sum_{ijkl \\in D} v_{ijkl}\\] where \\(v_{ijkl}\\) is the series-adjusted victimization weight for household \\(i\\), respondent \\(j\\), reporting period \\(k\\), and victimization \\(l\\), that is WGTVICCY. Victimization proportions estimate characteristics among victimizations or victims. Victimization proportions are calculated using the incident data file. The estimated victimization proportion for domain \\(D\\) across level \\(a\\) of covariate \\(A\\), \\(\\hat{p}_{A_a,D}\\) is \\[ \\hat{p}_{A_a,D} =\\frac{\\sum_{ijkl \\in A_a, D} v_{ijkl}}{\\sum_{ijkl \\in D} v_{ijkl}}.\\] The numerator is the number of incidents with a particular characteristic in a domain, and the denominator is the number of incidents in a domain. Victimization rates are estimates of the number of victimizations per 1,000 persons or households in the population. Victimization rates are calculated using the household or person-level data files. The estimated victimization rate for crime \\(C\\) in domain \\(D\\) is \\[\\hat{VR}_{C,D}= \\frac{\\sum_{ijkl \\in C,D} v_{ijkl}}{\\sum_{ijk \\in D} w_{ijk}}\\times 1000\\] where \\(w_{ijk}\\) is the person weight (WGTPERCY) or household weight (WGTHHCY) for personal and household crimes, respectively. The numerator is the number of incidents in a domain, and the denominator is the number of persons or households in a domain. Notice that the weights in the numerator and denominator are different - this is important, and in the syntax and examples below, we will discuss how to make an estimate that involves two weights. 
Prevalence rates are estimates of the percentage of the population (persons or households) who are victims of a crime. These are estimated using the household or person-level data files. The estimated prevalence rate for crime \\(C\\) in domain \\(D\\) is \\[ \\hat{PR}_{C, D}= \\frac{\\sum_{ijk \\in {C,D}} I_{ij}w_{ijk}}{\\sum_{ijk \\in D} w_{ijk}} \\times 100\\] where \\(I_{ij}\\) is an indicator that a person or household in domain \\(D\\) was a victim of crime \\(C\\) at any time in the year. The numerator is the number of victims in domain \\(D\\) for crime \\(C\\), and the denominator is the number of people or households in the population. 13.4 Data file preparation Some work is necessary to prepare the files before analysis. The design variables indicating pseudostratum (V2117) and half-sample code (V2118) are only included on the household file, so they must be added to the person and incident files for any analysis. For victimization rates, we need to know the victimization status for both victims and non-victims. Therefore, the incident file must be summarized and merged onto the household or person files for household-level and person-level crimes, respectively. We begin this vignette by discussing how to create these incident summary files. This is following Section 2.2 of the NCVS User’s Guide (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). 13.4.1 Preparing files for estimation of victimization rates Each record on the incident file represents one victimization, which is not the same as one incident. Some victimizations have several instances that make it difficult for the victim to differentiate the details of these incidents, labeled as “series crimes”. Appendix A of the User’s Guide indicates how to calculate the series weight in other statistical languages. Here, we adapt that code for R. 
Essentially, if a victimization is a series crime, its series weight is the number of actual victimizations, top-coded at 10; that is, even if the crime occurred more than 10 times, it is counted as 10 times to reduce the influence of extreme outliers. If an incident is a series crime, but the number of occurrences is unknown, the series weight is set to 6. A description of the variables used to create indicators of series and the associated weights is included in Table 13.1. TABLE 13.1: Codebook for incident variables - related to series weight Variable Description Value Label V4016 How many times incident occur last 6 mos 1-996 Number of times 997 Don’t know V4017 How many incidents 1 1-5 incidents (not a “series”) 2 6 or more incidents 8 Residue (invalid data) V4018 Incidents similar in detail 1 Similar 2 Different (not in a “series”) 8 Residue (invalid data) V4019 Enough detail to distinguish incidents 1 Yes (not a “series”) 2 No (is a “series”) 8 Residue (invalid data) WGTVICCY Adjusted victimization weight Numeric We want to create four variables to indicate if an incident is a series crime. First, we create a variable called series using V4017, V4018, and V4019 where an incident is considered a series crime if there are 6 or more incidents (V4017), the incidents are similar in detail (V4018), and there is not enough detail to distinguish the incidents (V4019). Next, we top-code the number of incidents (V4016) by creating a variable n10v4016, which is set to 10 if V4016 &gt; 10. Finally, we create the series weight using our new top-coded variable and the existing weight. 
inc_series &lt;- ncvs_2021_incident %&gt;% mutate( series = case_when(V4017 %in% c(1, 8) ~ 1, V4018 %in% c(2, 8) ~ 1, V4019 %in% c(1, 8) ~ 1, TRUE ~ 2 ), n10v4016 = case_when(V4016 %in% c(997, 998) ~ NA_real_, V4016 &gt; 10 ~ 10, TRUE ~ V4016), serieswgt = case_when(series == 2 &amp; is.na(n10v4016) ~ 6, series == 2 ~ n10v4016, TRUE ~ 1), NEWWGT = WGTVICCY * serieswgt ) The next step in preparing the files for estimation is to create indicators on the victimization file for characteristics of interest. Almost all BJS publications limit the analysis to records where the victimization occurred in the United States, where V4022 is not equal to 1, and we will do this for all estimates as well. A brief codebook of variables for this task is located in Table 13.2. TABLE 13.2: Codebook for incident variables - crime type indicators and characteristics Variable Description Value Label V4022 In what city/town/village 1 Outside U.S. 2 Not inside a city/town/village 3 Same city/town/village as present residence 4 Different city/town/village as present residence 5 Don’t know 6 Don’t know if 2, 4, or 5 V4049 Did offender have weapon 1 Yes 2 No 3 Don’t know V4050 What was weapon 1 At least one good entry 3 Indicates “Yes-Type Weapon-NA” 7 Indicates “Gun Type Unknown” 8 No good entry V4051 Hand gun 0 No 1 Yes V4052 Other gun 0 No 1 Yes V4053 Knife 0 No 1 Yes V4399 Reported to police 1 Yes 2 No 3 Don’t know V4529 Type of crime code 01 Completed rape 02 Attempted rape 03 Sexual attack with serious assault 04 Sexual attack with minor assault 05 Completed robbery with injury from serious assault 06 Completed robbery with injury from minor assault 07 Completed robbery without injury from minor assault 08 Attempted robbery with injury from serious assault 09 Attempted robbery with injury from minor assault 10 Attempted robbery without injury 11 Completed aggravated assault with injury 12 Attempted aggravated assault with weapon 13 Threatened assault with weapon 14 Simple assault 
completed with injury 15 Sexual assault without injury 16 Unwanted sexual contact without force 17 Assault without weapon without injury 18 Verbal threat of rape 19 Verbal threat of sexual assault 20 Verbal threat of assault 21 Completed purse snatching 22 Attempted purse snatching 23 Pocket picking (completed only) 31 Completed burglary, forcible entry 32 Completed burglary, unlawful entry without force 33 Attempted forcible entry 40 Completed motor vehicle theft 41 Attempted motor vehicle theft 54 Completed theft less than $10 55 Completed theft $10 to $49 56 Completed theft $50 to $249 57 Completed theft $250 or greater 58 Completed theft value NA 59 Attempted theft Using these variables, we will create the following indicators: Property crime V4529 &gt;= 31 Variable: Property Violent crime V4529 &lt;= 20 Variable: Violent Property crime reported to the police V4529 &gt;= 31 and V4399=1 Variable: Property_ReportPolice Violent crime reported to the police V4529 &lt;= 20 and V4399=1 Variable: Violent_ReportPolice Aggravated assault without a weapon V4529 in 11:13 and V4049=2 Variable: AAST_NoWeap Aggravated assault with a firearm V4529 in 11:13 and V4049=1 and (V4051=1 or V4052=1 or V4050=7) Variable: AAST_Firearm Aggravated assault with a knife or sharp object V4529 in 11:13 and V4049=1 and (V4053=1 or V4054=1) Variable: AAST_Knife Aggravated assault with another type of weapon V4529 in 11:13 and V4049=1 and V4050=1 and not firearm or knife Variable: AAST_Other inc_ind &lt;- inc_series %&gt;% filter(V4022 != 1) %&gt;% mutate( WeapCat = case_when( is.na(V4049) ~ NA_character_, V4049 == 2 ~ &quot;NoWeap&quot;, V4049 == 3 ~ &quot;UnkWeapUse&quot;, V4050 == 3 ~ &quot;Other&quot;, V4051 == 1 | V4052 == 1 | V4050 == 7 ~ &quot;Firearm&quot;, V4053 == 1 | V4054 == 1 ~ &quot;Knife&quot;, TRUE ~ &quot;Other&quot; ), V4529_num = parse_number(as.character(V4529)), ReportPolice = V4399 == 1, Property = V4529_num &gt;= 31, Violent = V4529_num &lt;= 20, Property_ReportPolice = 
Property &amp; ReportPolice, Violent_ReportPolice = Violent &amp; ReportPolice, AAST = V4529_num %in% 11:13, AAST_NoWeap = AAST &amp; WeapCat == &quot;NoWeap&quot;, AAST_Firearm = AAST &amp; WeapCat == &quot;Firearm&quot;, AAST_Knife = AAST &amp; WeapCat == &quot;Knife&quot;, AAST_Other = AAST &amp; WeapCat == &quot;Other&quot; ) This is a good point to pause to look at the output of crosswalks between an original variable and a derived one to check that the logic was programmed correctly and that everything ends up in the expected category. inc_series %&gt;% count(V4022) ## # A tibble: 6 × 2 ## V4022 n ## &lt;fct&gt; &lt;int&gt; ## 1 1 34 ## 2 2 65 ## 3 3 7697 ## 4 4 1143 ## 5 5 39 ## 6 8 4 inc_ind %&gt;% count(V4022) ## # A tibble: 5 × 2 ## V4022 n ## &lt;fct&gt; &lt;int&gt; ## 1 2 65 ## 2 3 7697 ## 3 4 1143 ## 4 5 39 ## 5 8 4 inc_ind %&gt;% count(WeapCat, V4049, V4050, V4051, V4052, V4052, V4053, V4054) ## # A tibble: 13 × 8 ## WeapCat V4049 V4050 V4051 V4052 V4053 V4054 n ## &lt;chr&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Firearm 1 1 0 1 0 0 15 ## 2 Firearm 1 1 0 1 1 1 1 ## 3 Firearm 1 1 1 0 0 0 125 ## 4 Firearm 1 1 1 0 1 0 2 ## 5 Firearm 1 1 1 1 0 0 3 ## 6 Firearm 1 7 0 0 0 0 3 ## 7 Knife 1 1 0 0 0 1 14 ## 8 Knife 1 1 0 0 1 0 71 ## 9 NoWeap 2 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 1794 ## 10 Other 1 1 0 0 0 0 147 ## 11 Other 1 3 0 0 0 0 26 ## 12 UnkWeapUse 3 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 519 ## 13 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 6228 inc_ind %&gt;% count(V4529, Property, Violent, AAST) %&gt;% print(n = 40) ## # A tibble: 34 × 5 ## V4529 Property Violent AAST n ## &lt;fct&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;int&gt; ## 1 1 FALSE TRUE FALSE 45 ## 2 2 FALSE TRUE FALSE 20 ## 3 3 FALSE TRUE FALSE 11 ## 4 4 FALSE TRUE FALSE 3 ## 5 5 FALSE TRUE FALSE 24 ## 6 6 FALSE TRUE FALSE 26 ## 7 7 FALSE TRUE FALSE 59 ## 8 8 FALSE TRUE 
FALSE 5 ## 9 9 FALSE TRUE FALSE 7 ## 10 10 FALSE TRUE FALSE 57 ## 11 11 FALSE TRUE TRUE 97 ## 12 12 FALSE TRUE TRUE 91 ## 13 13 FALSE TRUE TRUE 163 ## 14 14 FALSE TRUE FALSE 165 ## 15 15 FALSE TRUE FALSE 24 ## 16 16 FALSE TRUE FALSE 12 ## 17 17 FALSE TRUE FALSE 357 ## 18 18 FALSE TRUE FALSE 14 ## 19 19 FALSE TRUE FALSE 3 ## 20 20 FALSE TRUE FALSE 607 ## 21 21 FALSE FALSE FALSE 2 ## 22 22 FALSE FALSE FALSE 2 ## 23 23 FALSE FALSE FALSE 19 ## 24 31 TRUE FALSE FALSE 248 ## 25 32 TRUE FALSE FALSE 634 ## 26 33 TRUE FALSE FALSE 188 ## 27 40 TRUE FALSE FALSE 256 ## 28 41 TRUE FALSE FALSE 97 ## 29 54 TRUE FALSE FALSE 407 ## 30 55 TRUE FALSE FALSE 1006 ## 31 56 TRUE FALSE FALSE 1686 ## 32 57 TRUE FALSE FALSE 1420 ## 33 58 TRUE FALSE FALSE 798 ## 34 59 TRUE FALSE FALSE 395 inc_ind %&gt;% count(ReportPolice, V4399) ## # A tibble: 4 × 3 ## ReportPolice V4399 n ## &lt;lgl&gt; &lt;fct&gt; &lt;int&gt; ## 1 FALSE 2 5670 ## 2 FALSE 3 103 ## 3 FALSE 8 12 ## 4 TRUE 1 3163 inc_ind %&gt;% count(AAST, WeapCat, AAST_NoWeap, AAST_Firearm, AAST_Knife, AAST_Other) ## # A tibble: 11 × 7 ## AAST WeapCat AAST_NoWeap AAST_Firearm AAST_Knife AAST_Other n ## &lt;lgl&gt; &lt;chr&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;int&gt; ## 1 FALSE Firearm FALSE FALSE FALSE FALSE 34 ## 2 FALSE Knife FALSE FALSE FALSE FALSE 23 ## 3 FALSE NoWeap FALSE FALSE FALSE FALSE 1769 ## 4 FALSE Other FALSE FALSE FALSE FALSE 27 ## 5 FALSE UnkWeapUse FALSE FALSE FALSE FALSE 516 ## 6 FALSE &lt;NA&gt; FALSE FALSE FALSE FALSE 6228 ## 7 TRUE Firearm FALSE TRUE FALSE FALSE 115 ## 8 TRUE Knife FALSE FALSE TRUE FALSE 62 ## 9 TRUE NoWeap TRUE FALSE FALSE FALSE 25 ## 10 TRUE Other FALSE FALSE FALSE TRUE 146 ## 11 TRUE UnkWeapUse FALSE FALSE FALSE FALSE 3 After creating indicators of victimization types and characteristics, the file is summarized, and crimes are summed across persons or households by YEARQ. 
Property crimes (i.e., crimes committed against households, such as household burglary or motor vehicle theft) are summed across households, and personal crimes (i.e., crimes committed against an individual, such as assault, robbery, and personal theft) are summed across persons. The indicators are summed using the serieswgt, and the variable WGTVICCY needs to be retained for later analysis. inc_hh_sums &lt;- inc_ind %&gt;% filter(V4529_num &gt; 23) %&gt;% # restrict to household crimes group_by(YEARQ, IDHH) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(starts_with(&quot;Property&quot;), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) inc_pers_sums &lt;- inc_ind %&gt;% filter(V4529_num &lt;= 23) %&gt;% # restrict to person crimes group_by(YEARQ, IDHH, IDPER) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(c(starts_with(&quot;Violent&quot;), starts_with(&quot;AAST&quot;)), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) Now, we merge the victimization summary files into the appropriate files. For any record on the household or person file that is not on the victimization file, the victimization counts are set to 0 after merging. In this step, we will also create the victimization adjustment factor. See 2.2.4 in the User’s Guide for details of why this adjustment is created (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus (2015)). It is calculated as follows: \\[ A_{ijk}=\\frac{v_{ijk}}{w_{ijk}}\\] where \\(w_{ijk}\\) is the person weight (WGTPERCY) for personal crimes or the household weight (WGTHHCY) for household crimes, and \\(v_{ijk}\\) is the victimization weight (WGTVICCY) for household \\(i\\), respondent \\(j\\), in reporting period \\(k\\). The adjustment factor is set to 0 if no incidents are reported. 
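As a minimal sketch of the adjustment factor, consider a toy person file with made-up weights. The data frame `toy` and its values are hypothetical and are not NCVS data; base R is used so the snippet runs without any packages. A respondent with no reported incidents has a missing victimization weight after merging, so the factor is set to 0:

```r
# Hypothetical toy data: three respondents, one with no reported incidents
toy <- data.frame(
  WGTPERCY = c(1000, 1200, 900),  # person weight, w
  WGTVICCY = c(1100, NA, 900)     # victimization weight, v (NA = no incidents)
)

# A = v / w, set to 0 when no incidents were reported
toy$ADJINC_WT <- ifelse(is.na(toy$WGTVICCY), 0, toy$WGTVICCY / toy$WGTPERCY)
toy$ADJINC_WT
```

The same logic is applied to the full household and person files in the code that follows.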
# Set up a list of 0s for each crime type/characteristic to replace NAs hh_z_list &lt;- rep(0, ncol(inc_hh_sums) - 3) %&gt;% as.list() %&gt;% setNames(names(inc_hh_sums)[-(1:3)]) pers_z_list &lt;- rep(0, ncol(inc_pers_sums) - 4) %&gt;% as.list() %&gt;% setNames(names(inc_pers_sums)[-(1:4)]) hh_vsum &lt;- ncvs_2021_household %&gt;% full_join(inc_hh_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) %&gt;% replace_na(hh_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTHHCY)) pers_vsum &lt;- ncvs_2021_person %&gt;% full_join(inc_pers_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% replace_na(pers_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY)) 13.4.2 Derived demographic variables A final step in preparing the household and person files is creating any derived variables, such as income or age categories, for subgroup analysis. We can do this step before or after merging the victimization counts. 13.4.2.1 Household variables For the household file, we create categories for tenure (rental status), urbanicity, income, place size, and region. A codebook of the household variables is located in Table 13.3. 
TABLE 13.3: Codebook for household variables Variable Description Value Label V2015 Tenure 1 Owned or being bought 2 Rented for cash 3 No cash rent SC214A Household Income 01 Less than $5,000 02 $5,000 to $7,499 03 $7,500 to $9,999 04 $10,000 to $12,499 05 $12,500 to $14,999 06 $15,000 to $17,499 07 $17,500 to $19,999 08 $20,000 to $24,999 09 $25,000 to $29,999 10 $30,000 to $34,999 11 $35,000 to $39,999 12 $40,000 to $49,999 13 $50,000 to $74,999 15 $75,000 to $99,999 16 $100,000-$149,999 17 $150,000-$199,999 18 $200,000 or more V2126B Place Size Code 00 Not in a place 13 Under 10,000 16 10,000-49,999 17 50,000-99,999 18 100,000-249,999 19 250,000-499,999 20 500,000-999,999 21 1,000,000-2,499,999 22 2,500,000-4,999,999 23 5,000,000 or more V2127B Region 1 Northeast 2 Midwest 3 South 4 West V2143 Urbanicity 1 Urban 2 Suburban 3 Rural hh_vsum_der &lt;- hh_vsum %&gt;% mutate( Tenure = factor(case_when(V2015 == 1 ~ &quot;Owned&quot;, !is.na(V2015) ~ &quot;Rented&quot;), levels = c(&quot;Owned&quot;, &quot;Rented&quot;)), Urbanicity = factor(case_when(V2143 == 1 ~ &quot;Urban&quot;, V2143 == 2 ~ &quot;Suburban&quot;, V2143 == 3 ~ &quot;Rural&quot;), levels = c(&quot;Urban&quot;, &quot;Suburban&quot;, &quot;Rural&quot;)), SC214A_num = as.numeric(as.character(SC214A)), Income = case_when(SC214A_num &lt;= 8 ~ &quot;Less than $25,000&quot;, SC214A_num &lt;= 12 ~ &quot;$25,000-49,999&quot;, SC214A_num &lt;= 15 ~ &quot;$50,000-99,999&quot;, SC214A_num &lt;= 17 ~ &quot;$100,000-199,999&quot;, SC214A_num &lt;= 18 ~ &quot;$200,000 or more&quot;), Income = fct_reorder(Income, SC214A_num, .na_rm = FALSE), PlaceSize = case_match(as.numeric(as.character(V2126B)), 0 ~ &quot;Not in a place&quot;, 13 ~ &quot;Under 10,000&quot;, 16 ~ &quot;10,000-49,999&quot;, 17 ~ &quot;50,000-99,999&quot;, 18 ~ &quot;100,000-249,999&quot;, 19 ~ &quot;250,000-499,999&quot;, 20 ~ &quot;500,000-999,999&quot;, c(21, 22, 23) ~ &quot;1,000,000 or more&quot;), PlaceSize = fct_reorder(PlaceSize, 
as.numeric(V2126B)), Region = case_match(as.numeric(V2127B), 1 ~ &quot;Northeast&quot;, 2 ~ &quot;Midwest&quot;, 3 ~ &quot;South&quot;, 4 ~ &quot;West&quot;), Region = fct_reorder(Region, as.numeric(V2127B)) ) As before, we want to check to make sure the recoded variables we create match the existing data as expected. hh_vsum_der %&gt;% count(Tenure, V2015) ## # A tibble: 4 × 3 ## Tenure V2015 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Owned 1 101944 ## 2 Rented 2 46269 ## 3 Rented 3 1925 ## 4 &lt;NA&gt; &lt;NA&gt; 106322 hh_vsum_der %&gt;% count(Urbanicity, V2143) ## # A tibble: 3 × 3 ## Urbanicity V2143 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Urban 1 26878 ## 2 Suburban 2 173491 ## 3 Rural 3 56091 hh_vsum_der %&gt;% count(Income, SC214A) ## # A tibble: 18 × 3 ## Income SC214A n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Less than $25,000 1 7841 ## 2 Less than $25,000 2 2626 ## 3 Less than $25,000 3 3949 ## 4 Less than $25,000 4 5546 ## 5 Less than $25,000 5 5445 ## 6 Less than $25,000 6 4821 ## 7 Less than $25,000 7 5038 ## 8 Less than $25,000 8 11887 ## 9 $25,000-49,999 9 11550 ## 10 $25,000-49,999 10 13689 ## 11 $25,000-49,999 11 13655 ## 12 $25,000-49,999 12 23282 ## 13 $50,000-99,999 13 44601 ## 14 $50,000-99,999 15 33353 ## 15 $100,000-199,999 16 34287 ## 16 $100,000-199,999 17 15317 ## 17 $200,000 or more 18 16892 ## 18 &lt;NA&gt; &lt;NA&gt; 2681 hh_vsum_der %&gt;% count(PlaceSize, V2126B) ## # A tibble: 10 × 3 ## PlaceSize V2126B n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Not in a place 0 69484 ## 2 Under 10,000 13 39873 ## 3 10,000-49,999 16 53002 ## 4 50,000-99,999 17 27205 ## 5 100,000-249,999 18 24461 ## 6 250,000-499,999 19 13111 ## 7 500,000-999,999 20 15194 ## 8 1,000,000 or more 21 6167 ## 9 1,000,000 or more 22 3857 ## 10 1,000,000 or more 23 4106 hh_vsum_der %&gt;% count(Region, V2127B) ## # A tibble: 4 × 3 ## Region V2127B n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Northeast 1 41585 ## 2 Midwest 2 74666 ## 3 South 3 87783 ## 4 
West 4 52426 13.4.2.2 Person variables For the person file, we create categories for sex, race/Hispanic origin, age, and marital status. A codebook of the person variables is located in Table 13.4. We also merge the household demographics and the design variables (V2117 and V2118) onto the person file. TABLE 13.4: Codebook for person variables Variable Description Value Label V3014 Age 12 through 90 V3015 Current Marital Status 1 Married 2 Widowed 3 Divorced 4 Separated 5 Never married V3018 Sex 1 Male 2 Female V3023A Race 01 White only 02 Black only 03 American Indian, Alaska native only 04 Asian only 05 Hawaiian/Pacific Islander only 06 White-Black 07 White-American Indian 08 White-Asian 09 White-Hawaiian 10 Black-American Indian 11 Black-Asian 12 Black-Hawaiian/Pacific Islander 13 American Indian-Asian 14 Asian-Hawaiian/Pacific Islander 15 White-Black-American Indian 16 White-Black-Asian 17 White-American Indian-Asian 18 White-Asian-Hawaiian 19 2 or 3 races 20 4 or 5 races V3024 Hispanic Origin 1 Yes 2 No # Set label for usage later NHOPI &lt;- &quot;Native Hawaiian or Other Pacific Islander&quot; pers_vsum_der &lt;- pers_vsum %&gt;% mutate( Sex = factor(case_when(V3018 == 1 ~ &quot;Male&quot;, V3018 == 2 ~ &quot;Female&quot;)), RaceHispOrigin = factor(case_when(V3024 == 1 ~ &quot;Hispanic&quot;, V3023A == 1 ~ &quot;White&quot;, V3023A == 2 ~ &quot;Black&quot;, V3023A == 4 ~ &quot;Asian&quot;, V3023A == 5 ~ NHOPI, TRUE ~ &quot;Other&quot;), levels = c(&quot;White&quot;, &quot;Black&quot;, &quot;Hispanic&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;)), V3014_num = as.numeric(as.character(V3014)), AgeGroup = case_when(V3014_num &lt;= 17 ~ &quot;12-17&quot;, V3014_num &lt;= 24 ~ &quot;18-24&quot;, V3014_num &lt;= 34 ~ &quot;25-34&quot;, V3014_num &lt;= 49 ~ &quot;35-49&quot;, V3014_num &lt;= 64 ~ &quot;50-64&quot;, V3014_num &lt;= 90 ~ &quot;65 or older&quot;), AgeGroup = fct_reorder(AgeGroup, V3014_num), MaritalStatus = 
factor(case_when(V3015 == 1 ~ &quot;Married&quot;, V3015 == 2 ~ &quot;Widowed&quot;, V3015 == 3 ~ &quot;Divorced&quot;, V3015 == 4 ~ &quot;Separated&quot;, V3015 == 5 ~ &quot;Never married&quot;), levels = c(&quot;Never married&quot;, &quot;Married&quot;, &quot;Widowed&quot;,&quot;Divorced&quot;, &quot;Separated&quot;)) ) %&gt;% left_join(hh_vsum_der %&gt;% select(YEARQ, IDHH, V2117, V2118, Tenure:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) As before, we want to check to make sure the recoded variables we create match the existing data as expected. pers_vsum_der %&gt;% count(Sex, V3018) ## # A tibble: 2 × 3 ## Sex V3018 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Female 2 150956 ## 2 Male 1 140922 pers_vsum_der %&gt;% count(RaceHispOrigin, V3024) ## # A tibble: 11 × 3 ## RaceHispOrigin V3024 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 White 2 197292 ## 2 White 8 883 ## 3 Black 2 29947 ## 4 Black 8 120 ## 5 Hispanic 1 41450 ## 6 Asian 2 16015 ## 7 Asian 8 61 ## 8 Native Hawaiian or Other Pacific Islander 2 891 ## 9 Native Hawaiian or Other Pacific Islander 8 9 ## 10 Other 2 5161 ## 11 Other 8 49 pers_vsum_der %&gt;% filter(RaceHispOrigin != &quot;Hispanic&quot; | is.na(RaceHispOrigin)) %&gt;% count(RaceHispOrigin, V3023A) ## # A tibble: 20 × 3 ## RaceHispOrigin V3023A n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 White 1 198175 ## 2 Black 2 30067 ## 3 Asian 4 16076 ## 4 Native Hawaiian or Other Pacific Islander 5 900 ## 5 Other 3 1319 ## 6 Other 6 1217 ## 7 Other 7 1025 ## 8 Other 8 837 ## 9 Other 9 184 ## 10 Other 10 178 ## 11 Other 11 87 ## 12 Other 12 27 ## 13 Other 13 13 ## 14 Other 14 53 ## 15 Other 15 136 ## 16 Other 16 45 ## 17 Other 17 11 ## 18 Other 18 33 ## 19 Other 19 22 ## 20 Other 20 23 pers_vsum_der %&gt;% group_by(AgeGroup) %&gt;% summarize(minAge = min(V3014), maxAge = max(V3014), .groups = &quot;drop&quot;) ## # A tibble: 6 × 3 ## AgeGroup minAge maxAge ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 12-17 12 17 ## 2 18-24 18 24 ## 3 25-34 
25 34 ## 4 35-49 35 49 ## 5 50-64 50 64 ## 6 65 or older 65 90 pers_vsum_der %&gt;% count(MaritalStatus, V3015) ## # A tibble: 6 × 3 ## MaritalStatus V3015 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Never married 5 90425 ## 2 Married 1 148131 ## 3 Widowed 2 17668 ## 4 Divorced 3 28596 ## 5 Separated 4 4524 ## 6 &lt;NA&gt; 8 2534 We then create tibbles that contain only the variables we need, which simplifies later analyses. hh_vsum_slim &lt;- hh_vsum_der %&gt;% select(YEARQ:V2118, WGTVICCY:ADJINC_WT, Tenure, Urbanicity, Income, PlaceSize, Region) pers_vsum_slim &lt;- pers_vsum_der %&gt;% select(YEARQ:WGTPERCY, WGTVICCY:ADJINC_WT, Sex:Region) To calculate estimates about types of crime, such as what percentage of violent crimes are reported to the police, we must use the incident file. The incident file is not guaranteed to have every pseudostratum and half-sample code, so dummy records are created to append before estimation. Finally, we merge demographic variables onto the incident tibble. dummy_records &lt;- hh_vsum_slim %&gt;% distinct(V2117, V2118) %&gt;% mutate(Dummy = 1, WGTVICCY = 1, NEWWGT = 1) inc_analysis &lt;- inc_ind %&gt;% mutate(Dummy = 0) %&gt;% left_join(select(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% bind_rows(dummy_records) %&gt;% select(YEARQ:IDPER, WGTVICCY, NEWWGT, V4529, WeapCat, ReportPolice, Property:Region) The tibbles hh_vsum_slim, pers_vsum_slim, and inc_analysis can now be used to create design objects and calculate crime rate estimates. 13.5 Survey design objects All of the data preparation above is necessary before survey analysis. At this point, we can create the design objects and finally begin analysis. We create three design objects because the appropriate one depends on the type of estimate we are creating. For the incident data, the analysis weight is NEWWGT, which we constructed previously. 
The household and person-level data use WGTHHCY and WGTPERCY, respectively. For all analyses, V2117 is the strata variable, and V2118 is the cluster/PSU variable for analysis. inc_des &lt;- inc_analysis %&gt;% as_survey( weight = NEWWGT, strata = V2117, ids = V2118, nest = TRUE ) hh_des &lt;- hh_vsum_slim %&gt;% as_survey( weight = WGTHHCY, strata = V2117, ids = V2118, nest = TRUE ) pers_des &lt;- pers_vsum_slim %&gt;% as_survey( weight = WGTPERCY, strata = V2117, ids = V2118, nest = TRUE ) 13.6 Calculating estimates Now that we have prepared our data and created the design objects, we can calculate our estimates. As a reminder, those are: Victimization totals estimate the number of criminal victimizations with a given characteristic. Victimization proportions estimate characteristics among victimizations or victims. Victimization rates are estimates of the number of victimizations per 1,000 persons or households in the population. Prevalence rates are estimates of the percentage of the population (persons or households) who are victims of a crime. 13.6.1 Estimation 1: Victimization totals There are two ways to calculate victimization totals. Using the incident design object (inc_des) is the most straightforward method, but the person (pers_des) and household (hh_des) design objects can be used as well if the adjustment factor (ADJINC_WT) is incorporated. In the example below, the total number of property and violent victimizations is first calculated using the incident file and then using the household and person design objects. 
The incident file is smaller, and thus, estimation is faster using that file, but the estimates will be the same as illustrated below: vt1 &lt;- inc_des %&gt;% summarize(Property_Vzn = survey_total(Property, na.rm = TRUE), Violent_Vzn = survey_total(Violent, na.rm = TRUE)) vt2a &lt;- hh_des %&gt;% summarize(Property_Vzn = survey_total(Property * ADJINC_WT, na.rm = TRUE)) vt2b &lt;- pers_des %&gt;% summarize(Violent_Vzn = survey_total(Violent * ADJINC_WT, na.rm = TRUE)) vt1 ## # A tibble: 1 × 4 ## Property_Vzn Property_Vzn_se Violent_Vzn Violent_Vzn_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 11682056. 263844. 4598306. 198115. vt2a ## # A tibble: 1 × 2 ## Property_Vzn Property_Vzn_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 11682056. 263844. vt2b ## # A tibble: 1 × 2 ## Violent_Vzn Violent_Vzn_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 4598306. 198115. The number of victimizations estimated using the incident file is equivalent to the person and household file method. There are 11,682,056 property incidents and 4,598,306 violent incidents in a six-month period. 13.6.2 Estimation 2: Victimization proportions Victimization proportions are proportions describing features of a victimization. The key here is that these are questions among victimizations, not among the population. These types of estimates can only be calculated using the incident design object (inc_des). 
For example, we could be interested in the percentage of property victimizations reported to the police as shown in the following code with an estimate, the standard error, and 95% confidence interval: prop1 &lt;- inc_des %&gt;% filter(Property) %&gt;% summarize(Pct = survey_mean(ReportPolice, na.rm = TRUE, proportion=TRUE, vartype=c(&quot;se&quot;, &quot;ci&quot;)) * 100) prop1 ## # A tibble: 1 × 4 ## Pct Pct_se Pct_low Pct_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 30.8 0.798 29.2 32.4 Or, the percentage of violent victimizations that are in urban areas: prop2 &lt;- inc_des %&gt;% filter(Violent) %&gt;% summarize(Pct = survey_mean(Urbanicity==&quot;Urban&quot;, na.rm = TRUE) * 100) prop2 ## # A tibble: 1 × 2 ## Pct Pct_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 18.1 1.49 In 2021, we estimate that 30.8% of property crimes were reported to the police and 18.1% of violent crimes occurred in urban areas. 13.6.3 Estimation 3: Victimization rates Victimization rates measure the number of victimizations per population. They are not an estimate of the proportion of households or persons who are victimized, which is a prevalence rate described in section 13.6.4. Victimization rates are estimated using the household (hh_des) or person (pers_des) design objects depending on the type of crime, and the adjustment factor (ADJINC_WT) must be incorporated. We return to the example of property and violent victimizations used in the example for victimization totals (section 13.6.1). In the following example, the property victimization totals are calculated as above, as well as the property victimization rate (using survey_mean()) and the population size using survey_total(). As mentioned in the introduction, victimization rates use the incident weight in the numerator and the person or household weight in the denominator. This is accomplished by calculating the rates with the weight adjustment (ADJINC_WT) multiplied by the estimate of interest. 
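To see why multiplying by the adjustment factor produces a rate, here is a small arithmetic sketch in base R. All numbers are made up for illustration, and the vectors `w`, `v`, and `n_vzn` are hypothetical stand-ins for WGTHHCY, WGTVICCY, and a series-weighted victimization count. Weighting each household by its household weight while multiplying its count by ADJINC_WT places the victimization weight in the numerator and leaves the household weight in the denominator:

```r
# Hypothetical toy data: four households
w <- c(1000, 1500, 800, 1200)   # household weight (stand-in for WGTHHCY)
v <- c(1100, NA, 800, NA)       # victimization weight (stand-in for WGTVICCY); NA = no incidents
n_vzn <- c(2, 0, 1, 0)          # series-weighted property victimizations

adj <- ifelse(is.na(v), 0, v / w)   # adjustment factor (ADJINC_WT)

# Weighted mean of (count * adjustment * 1000), the point estimate a
# weighted mean with household weights would give
rate_mean <- sum(w * n_vzn * adj * 1000) / sum(w)

# Same rate computed directly: victimization-weighted total per 1,000 households
rate_direct <- sum(v * n_vzn, na.rm = TRUE) / sum(w) * 1000

c(rate_mean, rate_direct)
```

The two quantities agree because the household weight cancels in the numerator. This sketch covers only the point estimate, not the variance estimation that the design object provides.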
Let’s look at an example of property victimization. vr_prop &lt;- hh_des %&gt;% summarize( Property_Vzn = survey_total(Property * ADJINC_WT, na.rm = TRUE), Property_Rate = survey_mean(Property * ADJINC_WT * 1000, na.rm = TRUE), PopSize = survey_total(1, vartype = NULL) ) vr_prop ## # A tibble: 1 × 5 ## Property_Vzn Property_Vzn_se Property_Rate Property_Rate_se PopSize ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 11682056. 263844. 90.3 1.95 129319232. In the output above, we see that the estimated property victimization rate in 2021 was 90.3 per 1,000 households. This matches the rate calculated manually as the number of victimizations per 1,000 households, as demonstrated in the next chunk: vr_prop %&gt;% select(-ends_with(&quot;se&quot;)) %&gt;% mutate(Property_Rate_manual=Property_Vzn/PopSize*1000) ## # A tibble: 1 × 4 ## Property_Vzn Property_Rate PopSize Property_Rate_manual ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 11682056. 90.3 129319232. 90.3 Victimization rates can also be calculated for particular characteristics of the victimization. The following example calculates the rate of aggravated assault with no weapon, with a firearm, with a knife, and with another weapon. pers_des %&gt;% summarize(across( starts_with(&quot;AAST_&quot;), ~ survey_mean(. * ADJINC_WT * 1000, na.rm = TRUE) )) ## # A tibble: 1 × 8 ## AAST_NoWeap AAST_NoWeap_se AAST_Firearm AAST_Firearm_se AAST_Knife ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.249 0.0595 0.860 0.101 0.455 ## # ℹ 3 more variables: AAST_Knife_se &lt;dbl&gt;, AAST_Other &lt;dbl&gt;, ## # AAST_Other_se &lt;dbl&gt; We often want to calculate victimization rates by several characteristics. For example, we may want to calculate the violent victimization rate and aggravated assault rate by sex, race/Hispanic origin, age group, marital status, and household income. This requires a separate group_by() statement for each categorization. 
Thus, we make a function to do this and then use map() from the {purrr} package (part of the tidyverse) to loop through the variables (Wickham and Henry 2023). This function takes a demographic variable as its input (byvar) and calculates the violent and aggravated assault victimization rates for each level. It then creates columns with the variable name, the level of each variable, and a numeric version of the level (LevelNum) for sorting later. The function is run across multiple variables using map(), and the results are stacked into a single output using bind_rows(). pers_est_by &lt;- function(byvar) { pers_des %&gt;% rename(Level := {{byvar}}) %&gt;% filter(!is.na(Level)) %&gt;% group_by(Level) %&gt;% summarize( Violent = survey_mean(Violent * ADJINC_WT * 1000, na.rm = TRUE), AAST = survey_mean(AAST * ADJINC_WT * 1000, na.rm = TRUE) ) %&gt;% mutate( Variable = byvar, LevelNum = as.numeric(Level), Level = as.character(Level) ) %&gt;% select(Variable, Level, LevelNum, everything()) } pers_est_df &lt;- c(&quot;Sex&quot;, &quot;RaceHispOrigin&quot;, &quot;AgeGroup&quot;, &quot;MaritalStatus&quot;, &quot;Income&quot;) %&gt;% map(pers_est_by) %&gt;% bind_rows() The output from all the estimates is cleaned to create better labels, such as going from “RaceHispOrigin” to “Race/Hispanic origin”. Finally, the {gt} package is used to make a publishable table (Table 13.5). Using the functions from the {gt} package, column labels and footnotes are added, and estimates are presented to the first decimal place (Iannone et al. 2023). 
vr_gt&lt;-pers_est_df %&gt;% mutate( Variable = case_when( Variable == &quot;RaceHispOrigin&quot; ~ &quot;Race/Hispanic origin&quot;, Variable == &quot;MaritalStatus&quot; ~ &quot;Marital status&quot;, Variable == &quot;AgeGroup&quot; ~ &quot;Age&quot;, TRUE ~ Variable ) ) %&gt;% select(-LevelNum) %&gt;% group_by(Variable) %&gt;% gt(rowname_col = &quot;Level&quot;) %&gt;% tab_spanner( label = &quot;Violent crime&quot;, id = &quot;viol_span&quot;, columns = c(&quot;Violent&quot;, &quot;Violent_se&quot;) ) %&gt;% tab_spanner(label = &quot;Aggravated assault&quot;, columns = c(&quot;AAST&quot;, &quot;AAST_se&quot;)) %&gt;% cols_label( Violent = &quot;Rate&quot;, Violent_se = &quot;SE&quot;, AAST = &quot;Rate&quot;, AAST_se = &quot;SE&quot;, ) %&gt;% fmt_number( columns = c(&quot;Violent&quot;, &quot;Violent_se&quot;, &quot;AAST&quot;, &quot;AAST_se&quot;), decimals = 1 ) %&gt;% tab_footnote( footnote = &quot;Includes rape or sexual assault, robbery, aggravated assault, and simple assault.&quot;, locations = cells_column_spanners(spanners = &quot;viol_span&quot;) ) %&gt;% tab_footnote( footnote = &quot;Excludes persons of Hispanic origin&quot;, locations = cells_stub(rows = Level %in% c(&quot;White&quot;, &quot;Black&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;))) %&gt;% tab_footnote( footnote = &quot;Includes persons who identified as Native Hawaiian or Other Pacific Islander only.&quot;, locations = cells_stub(rows = Level == NHOPI) ) %&gt;% tab_footnote( footnote = &quot;Includes persons who identified as American Indian or Alaska Native only or as two or more races.&quot;, locations = cells_stub(rows = Level == &quot;Other&quot;) ) %&gt;% tab_source_note( source_note = &quot;Note: Rates per 1,000 persons age 12 or older.&quot;) %&gt;% tab_source_note(source_note = &quot;Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2021.&quot;) %&gt;% tab_stubhead(label = &quot;Victim demographic&quot;) %&gt;% tab_caption(&quot;Rate and standard 
error of violent victimization, by type of crime and demographic characteristics, 2021&quot;) vr_gt
TABLE 13.5: Rate and standard error of violent victimization, by type of crime and demographic characteristics, 2021
Columns: Victim demographic, Violent crime [1] Rate, Violent crime [1] SE, Aggravated assault Rate, Aggravated assault SE
Sex
Female 15.5 0.9 2.3 0.2
Male 17.5 1.1 3.2 0.3
Race/Hispanic origin
White [2] 16.1 0.9 2.7 0.3
Black [2] 18.5 2.2 3.7 0.7
Hispanic 15.9 1.7 2.3 0.4
Asian [2] 8.6 1.3 1.9 0.6
Native Hawaiian or Other Pacific Islander [2,3] 36.1 34.4 0.0 0.0
Other [2,4] 45.4 13.0 6.2 2.0
Age
12-17 13.2 2.2 2.5 0.8
18-24 23.1 2.1 3.9 0.9
25-34 22.0 2.1 4.0 0.6
35-49 19.4 1.6 3.6 0.5
50-64 16.9 1.9 2.0 0.3
65 or older 6.4 1.1 1.1 0.3
Marital status
Never married 22.2 1.4 4.0 0.4
Married 9.5 0.9 1.5 0.2
Widowed 10.7 3.5 0.9 0.2
Divorced 27.4 2.9 4.0 0.7
Separated 36.8 6.7 8.8 3.1
Income
Less than $25,000 29.6 2.5 5.1 0.7
$25,000-49,999 16.9 1.5 3.0 0.4
$50,000-99,999 14.6 1.1 1.9 0.3
$100,000-199,999 12.2 1.3 2.5 0.4
$200,000 or more 9.7 1.4 1.7 0.6
Note: Rates per 1,000 persons age 12 or older.
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2021.
[1] Includes rape or sexual assault, robbery, aggravated assault, and simple assault.
[2] Excludes persons of Hispanic origin.
[3] Includes persons who identified as Native Hawaiian or Other Pacific Islander only.
[4] Includes persons who identified as American Indian or Alaska Native only or as two or more races.
13.6.4 Estimation 4: Prevalence rates

Prevalence rates differ from victimization rates in that the numerator is the number of people or households victimized rather than the number of victimizations. To calculate prevalence rates, we must summarize the data again, creating an indicator for whether a person or household was a victim of a particular crime at any point in the year. Below is an example that first calculates the indicator and then the prevalence rate of violent crime and aggravated assault.

# Flag each person-year with at least one violent crime or aggravated assault
pers_prev_des &lt;- pers_vsum_slim %&gt;%
  mutate(Year = floor(YEARQ)) %&gt;%
  mutate(
    Violent_Ind = sum(Violent) &gt; 0,
    AAST_Ind = sum(AAST) &gt; 0,
    .by = c(&quot;Year&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)
  ) %&gt;%
  as_survey(
    weights = WGTPERCY,
    strata = V2117,
    ids = V2118,
    nest = TRUE
  )

# Multiply by 100 so the means are percentages
pers_prev_ests &lt;- pers_prev_des %&gt;%
  summarize(
    Violent_Prev = survey_mean(Violent_Ind * 100),
    AAST_Prev = survey_mean(AAST_Ind * 100)
  )

pers_prev_ests
## # A tibble: 1 × 4
##   Violent_Prev Violent_Prev_se AAST_Prev AAST_Prev_se
##          &lt;dbl&gt;           &lt;dbl&gt;     &lt;dbl&gt;        &lt;dbl&gt;
## 1        0.980          0.0349     0.215       0.0143

In the example above, the indicator is multiplied by 100 to return a percentage rather than a proportion. In 2021, we estimate that 0.98% of people aged 12 and older were victims of violent crime in the United States, and 0.22% were victims of aggravated assault.

13.7 Statistical testing

For any of the types of estimates discussed, we can also perform statistical testing. For example, we could test whether property victimization rates differ between properties that are owned versus rented. First, we calculate the point estimates.
prop_tenure &lt;- hh_des %&gt;%
  group_by(Tenure) %&gt;%
  summarize(
    Property_Rate = survey_mean(Property * ADJINC_WT * 1000,
                                na.rm = TRUE, vartype = &quot;ci&quot;)
  )

prop_tenure
## # A tibble: 3 × 4
##   Tenure Property_Rate Property_Rate_low Property_Rate_upp
##   &lt;fct&gt;          &lt;dbl&gt;             &lt;dbl&gt;             &lt;dbl&gt;
## 1 Owned           68.2              64.3              72.1
## 2 Rented         130.              123.              137.
## 3 &lt;NA&gt;           NaN               NaN               NaN

The property victimization rate for rented households is 129.8 per 1,000 households, while the rate for owned households is 68.2. These rates appear very different, especially given the non-overlapping confidence intervals. However, survey data are inherently non-independent, so statistical testing cannot be done by comparing confidence intervals. To conduct the statistical test, we first create the variable to compare, which incorporates the adjusted incident weight (ADJINC_WT), and then run the test as discussed in Chapter 6.

prop_tenure_test &lt;- hh_des %&gt;%
  mutate(Prop_Adj = Property * ADJINC_WT * 1000) %&gt;%
  svyttest(
    formula = Prop_Adj ~ Tenure,
    design = .,
    na.rm = TRUE
  ) %&gt;%
  broom::tidy()

prop_tenure_test
## # A tibble: 1 × 8
##   estimate statistic  p.value parameter conf.low conf.high method
##      &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt; &lt;chr&gt;
## 1     61.6      16.0 8.91e-36       169     54.0      69.2 Design-based…
## # ℹ 1 more variable: alternative &lt;chr&gt;

prop_tenure_test %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 13.6: T-test output for estimates of property victimization rates between properties that are owned versus rented, NCVS 2021

estimate  statistic  p.value  parameter  conf.low  conf.high  method               alternative
61.62     16.04      &lt;0.0001  169.00     54.03     69.21      Design-based t-test  two.sided

The output of the statistical test shows the same difference of 61.6 between the property victimization rates of renters and owners, and the test is highly significant, with a p-value of &lt;0.0001.

13.8 Exercises

1. What proportion of completed motor vehicle thefts are not reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).
2. How many violent crimes occur in each region?
3. What is the property victimization rate among each income level?
4. What is the difference in the violent victimization rate between males and females? Is the difference statistically significant?

References

Bureau of Justice Statistics. 2017. “National Crime Victimization Survey, 2016: Technical Documentation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvstd16.pdf.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables.
Shook-Sa, Bonnie, G. Lance Couzens, and Marcus Berzofsky. 2015. “Users’ Guide to the National Crime Victimization Survey (NCVS) Direct Variance Estimation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf; Bureau of Justice Statistics.
United States. Bureau of Justice Statistics. 2022. “National Crime Victimization Survey, [United States], 2021.” https://www.icpsr.umich.edu/web/NACJD/studies/38429; Inter-university Consortium for Political and Social Research [distributor]. https://doi.org/10.3886/ICPSR38429.v1.
Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools.
BJS publishes victimization rates per 1,000, which are also presented in these examples↩︎ "],["c14-ambarom-vignette.html", "Chapter 14 AmericasBarometer Vignette 14.1 Introduction 14.2 Data structure 14.3 Preparing files 14.4 Survey design objects 14.5 Calculating estimates 14.6 Mapping survey data 14.7 Exercises", " Chapter 14 AmericasBarometer Vignette

Prerequisites

For this chapter, load the following packages:

library(tidyverse)
library(survey)
library(srvyr)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(gt)
library(ggpattern)

In this vignette, we use a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the LAPOP website. We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To read all files into R while ignoring the Stata labels, we recommend running code like the following, which uses the read_stata() function from the {haven} package to import the data (Wickham, Miller, and Smith 2023):

library(haven) # read_stata(), zap_labels(), zap_label()
library(here)  # here() builds paths relative to the project root

# List the Stata files in the raw data folder
stata_files &lt;- list.files(here(&quot;RawData&quot;, &quot;LAPOP_2021&quot;), &quot;*.dta&quot;)

# Read one Stata file and drop its value and variable labels
read_stata_unlabeled &lt;- function(file) {
  read_stata(file) %&gt;%
    zap_labels() %&gt;%
    zap_label()
}

# Read every file, stack the rows, and keep the variables used in this chapter
ambarom_in &lt;- here(&quot;RawData&quot;, &quot;LAPOP_2021&quot;, stata_files) %&gt;%
  map_df(read_stata_unlabeled) %&gt;%
  select(pais, strata, upm, weight1500, core_a_core_b,
         q2, q1tb, covid2at, a4, idio2, idio2cov, it1, jc13,
         m1, mil10a, mil10e, ccch1, ccch3, ccus1, ccus3,
         edr, ocup4a, q14, q11n, q12c, q12bn,
         starts_with(&quot;covidedu1&quot;), gi0n, r15, r18n, r18)

The code above reads all .dta files and combines them into one tibble.

14.1 Introduction

The AmericasBarometer surveys, conducted by the LAPOP Lab (LAPOP 2023b), are public opinion surveys of the Americas focused on democracy. The study was launched in 2004/2005 with 11 countries. Though the set of participating countries has grown and fluctuated over time, AmericasBarometer maintains a consistent methodology across many countries.
In 2021, the study included 22 countries ranging from Canada in the north to Chile and Argentina in the south (LAPOP 2023a).

Historically, surveys were administered through in-person household interviews, but the COVID-19 pandemic changed the study significantly. Now, random-digit dialing (RDD) of mobile phones is used in all countries except the United States and Canada (LAPOP 2021c). In Canada, LAPOP collaborated with the Environics Institute to collect data from a panel of Canadians using a web survey (LAPOP 2021a). In the United States, YouGov conducted the survey on behalf of LAPOP by conducting a web survey among its panelists (LAPOP 2021b).

The survey includes a core set of questions for all countries, but not every question is asked in each country. Additionally, some questions are posed to only half of the respondents in a country, with different sections randomized to respondents (LAPOP 2021d).

14.2 Data structure

Each country and year has its own file available in Stata format (.dta). In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the prerequisites box. Review the core questionnaire to understand the common variables across the countries (LAPOP 2021d).

14.3 Preparing files

Many of the variables are coded as numeric and do not have intuitive variable names, so the next step is to create derived variables and wrangle the data for analysis.
Using the core questionnaire as a codebook, we reference the factor descriptions to create derived variables with informative names:

ambarom &lt;- ambarom_in %&gt;%
  mutate(
    Country = factor(
      case_match(pais,
        1 ~ &quot;Mexico&quot;,
        2 ~ &quot;Guatemala&quot;,
        3 ~ &quot;El Salvador&quot;,
        4 ~ &quot;Honduras&quot;,
        5 ~ &quot;Nicaragua&quot;,
        6 ~ &quot;Costa Rica&quot;,
        7 ~ &quot;Panama&quot;,
        8 ~ &quot;Colombia&quot;,
        9 ~ &quot;Ecuador&quot;,
        10 ~ &quot;Bolivia&quot;,
        11 ~ &quot;Peru&quot;,
        12 ~ &quot;Paraguay&quot;,
        13 ~ &quot;Chile&quot;,
        14 ~ &quot;Uruguay&quot;,
        15 ~ &quot;Brazil&quot;,
        17 ~ &quot;Argentina&quot;,
        21 ~ &quot;Dominican Republic&quot;,
        22 ~ &quot;Haiti&quot;,
        23 ~ &quot;Jamaica&quot;,
        24 ~ &quot;Guyana&quot;,
        40 ~ &quot;United States&quot;,
        41 ~ &quot;Canada&quot;
      )
    ),
    CovidWorry = fct_reorder(
      case_match(covid2at,
        1 ~ &quot;Very worried&quot;,
        2 ~ &quot;Somewhat worried&quot;,
        3 ~ &quot;A little worried&quot;,
        4 ~ &quot;Not worried at all&quot;
      ),
      covid2at,
      .na_rm = FALSE
    )
  ) %&gt;%
  rename(
    Educ_NotInSchool = covidedu1_1,
    Educ_NormalSchool = covidedu1_2,
    Educ_VirtualSchool = covidedu1_3,
    Educ_Hybrid = covidedu1_4,
    Educ_NoSchool = covidedu1_5,
    BroadbandInternet = r18n,
    Internet = r18
  )

At this point, it is a good time to check the cross-tabs between the original and newly derived variables. These tables help us confirm that we have correctly matched the numeric data from the original dataset to the renamed factor data in the new dataset. For instance, let’s check the original variable pais and the derived variable Country. We can consult the questionnaire or codebook to confirm that Argentina is coded as 17, Bolivia as 10, etc. Similarly, for CovidWorry and covid2at, we can verify that Very worried is coded as 1, and so on for the other variables.
ambarom %&gt;%
  count(Country, pais) %&gt;%
  print(n = 22)
## # A tibble: 22 × 3
##    Country             pais     n
##    &lt;fct&gt;              &lt;dbl&gt; &lt;int&gt;
##  1 Argentina             17  3011
##  2 Bolivia               10  3002
##  3 Brazil                15  3016
##  4 Canada                41  2201
##  5 Chile                 13  2954
##  6 Colombia               8  2993
##  7 Costa Rica             6  2977
##  8 Dominican Republic    21  3000
##  9 Ecuador                9  3005
## 10 El Salvador            3  3245
## 11 Guatemala              2  3000
## 12 Guyana                24  3011
## 13 Haiti                 22  3088
## 14 Honduras               4  2999
## 15 Jamaica               23  3121
## 16 Mexico                 1  2998
## 17 Nicaragua              5  2997
## 18 Panama                 7  3183
## 19 Paraguay              12  3004
## 20 Peru                  11  3038
## 21 United States         40  1500
## 22 Uruguay               14  3009

ambarom %&gt;%
  count(CovidWorry, covid2at)
## # A tibble: 5 × 3
##   CovidWorry         covid2at     n
##   &lt;fct&gt;                 &lt;dbl&gt; &lt;int&gt;
## 1 Very worried              1 24327
## 2 Somewhat worried          2 13233
## 3 A little worried          3 11478
## 4 Not worried at all        4  8628
## 5 &lt;NA&gt;                     NA  6686

14.4 Survey design objects

The technical report is the best reference for understanding how to specify the sampling design in R (LAPOP 2021c). The data include two weights: wt and weight1500. The first weight variable is specific to each country, sums to the sample size, and is calibrated to reflect each country’s demographics. The second weight variable sums to 1500 for each country and is recommended for multi-country analyses. Although not explicitly stated in the documentation, the Stata syntax example (svyset upm [pw=weight1500], strata(strata)) indicates that the variable upm is a clustering variable and strata is the strata variable. Therefore, the design object is created in R as follows:

ambarom_des &lt;- ambarom %&gt;%
  as_survey_design(
    ids = upm,
    strata = strata,
    weights = weight1500
  )

Note that these weight variables support comparisons between countries but not pooled estimates that combine several countries. The reason is that the weights do not account for the different sizes of countries.
For example, Canada has about 10% of the population of the United States, but an estimate that uses records from both countries would weigh them equally. 14.5 Calculating estimates When calculating estimates from the data, we use the survey design object ambarom_des and then apply the survey_mean() function. The next sections walk through a few examples. 14.5.1 Example: Worried about COVID This survey was administered between March and August of 2021, with the specific timing varying by country28. Given the state of the pandemic at that time, several questions about COVID were included. The first question about COVID asked: How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months? Very worried Somewhat worried A little worried Not worried at all If we are interested in those who are very worried or somewhat worried, we can create a new variable (CovidWorry_bin) that groups levels of the original question using the fct_collapse() function from the {forcats} package (Wickham 2023a). We then use the survey_count() function to understand how responses are distributed across each category of the original variable (CovidWorry) and the new variable (CovidWorry_bin). covid_worry_collapse &lt;- ambarom_des %&gt;% mutate(CovidWorry_bin = fct_collapse( CovidWorry, WorriedHi = c(&quot;Very worried&quot;, &quot;Somewhat worried&quot;), WorriedLo = c(&quot;A little worried&quot;, &quot;Not worried at all&quot;) )) covid_worry_collapse %&gt;% survey_count(CovidWorry_bin, CovidWorry) ## # A tibble: 5 × 4 ## CovidWorry_bin CovidWorry n n_se ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 WorriedHi Very worried 12369. 83.6 ## 2 WorriedHi Somewhat worried 6378. 63.4 ## 3 WorriedLo A little worried 5896. 62.6 ## 4 WorriedLo Not worried at all 4840. 59.7 ## 5 &lt;NA&gt; &lt;NA&gt; 3518. 
42.2 With this new variable, we can now use survey_mean() to calculate the percentage of people in each country who are either very or somewhat worried about COVID. There are missing data, as indicated in the survey_count() output above, so we need to use na.rm = TRUE in the survey_mean() function to handle the missing values. covid_worry_country_ests &lt;- covid_worry_collapse %&gt;% group_by(Country) %&gt;% summarize(p = survey_mean(CovidWorry_bin == &quot;WorriedHi&quot;, na.rm = TRUE) * 100) covid_worry_country_ests ## # A tibble: 22 × 3 ## Country p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Argentina 65.8 1.08 ## 2 Bolivia 71.6 0.960 ## 3 Brazil 83.5 0.962 ## 4 Canada 48.9 1.34 ## 5 Chile 81.8 0.828 ## 6 Colombia 67.9 1.12 ## 7 Costa Rica 72.6 0.952 ## 8 Dominican Republic 50.1 1.13 ## 9 Ecuador 71.7 0.967 ## 10 El Salvador 52.5 1.02 ## # ℹ 12 more rows To view the results for all countries, we can use the {gt} package to create Table 14.1 (Iannone et al. 2023). covid_worry_country_ests_gt &lt;- covid_worry_country_ests %&gt;% gt(rowname_col = &quot;Country&quot;) %&gt;% cols_label(p = &quot;Percent&quot;, p_se = &quot;SE&quot;) %&gt;% fmt_number(decimals = 1) %&gt;% tab_source_note(&quot;AmericasBarometer Surveys, 2021&quot;) covid_worry_country_ests_gt #ismfkpkdnv table { font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji'; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } #ismfkpkdnv thead, #ismfkpkdnv tbody, #ismfkpkdnv tfoot, #ismfkpkdnv tr, #ismfkpkdnv td, #ismfkpkdnv th { border-style: none; } #ismfkpkdnv p { margin: 0; padding: 0; } #ismfkpkdnv .gt_table { display: table; border-collapse: collapse; line-height: normal; margin-left: auto; margin-right: auto; color: #333333; font-size: 16px; font-weight: normal; font-style: normal; background-color: #FFFFFF; width: auto; border-top-style: solid; border-top-width: 2px; 
border-top-color: #A8A8A8; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #A8A8A8; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; } #ismfkpkdnv .gt_caption { padding-top: 4px; padding-bottom: 4px; } #ismfkpkdnv .gt_title { color: #333333; font-size: 125%; font-weight: initial; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; border-bottom-color: #FFFFFF; border-bottom-width: 0; } #ismfkpkdnv .gt_subtitle { color: #333333; font-size: 85%; font-weight: initial; padding-top: 3px; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; border-top-color: #FFFFFF; border-top-width: 0; } #ismfkpkdnv .gt_heading { background-color: #FFFFFF; text-align: center; border-bottom-color: #FFFFFF; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; } #ismfkpkdnv .gt_bottom_border { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #ismfkpkdnv .gt_col_headings { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; } #ismfkpkdnv .gt_col_heading { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 6px; padding-left: 5px; padding-right: 5px; overflow-x: hidden; } #ismfkpkdnv .gt_column_spanner_outer { color: #333333; background-color: #FFFFFF; font-size: 
100%; font-weight: normal; text-transform: inherit; padding-top: 0; padding-bottom: 0; padding-left: 4px; padding-right: 4px; }

TABLE 14.1: Percentage worried about the possibility that they or someone in their household will get sick from coronavirus in the next 3 months

                      Percent    SE
Argentina                65.8   1.1
Bolivia                  71.6   1.0
Brazil                   83.5   1.0
Canada                   48.9   1.3
Chile                    81.8   0.8
Colombia                 67.9   1.1
Costa Rica               72.6   1.0
Dominican Republic       50.1   1.1
Ecuador                  71.7   1.0
El Salvador              52.5   1.0
Guatemala                69.3   1.0
Guyana                   60.0   1.6
Haiti                    54.4   1.8
Honduras                 64.6   1.1
Jamaica                  28.4   0.9
Mexico                   63.6   1.0
Nicaragua                80.0   1.0
Panama                   70.2   1.0
Paraguay                 61.5   1.1
Peru                     77.1   2.5
United States            46.6   1.7
Uruguay                  60.9   1.1
AmericasBarometer Surveys, 2021

14.5.2 Example: Education affected by COVID

Respondents were also asked a question about how the pandemic affected education. This question was asked of households with children under the age of 13, and respondents could select more than one option, as follows:

Did any of these children have their school education affected due to the pandemic?
  - No, because they are not yet school age or because they do not attend school for another reason
  - No, their classes continued normally
  - Yes, they went to virtual or remote classes
  - Yes, they switched to a combination of virtual and in-person classes
  - Yes, they cut all ties with the school

Working with multiple-choice questions can be both challenging and interesting. Let’s walk through how to analyze this question. If we are interested in the impact on education, we should focus on the data of those whose children are attending school. This means we need to exclude those who selected the first response option: “No, because they are not yet school age or because they do not attend school for another reason.” To do this, we use the Educ_NotInSchool variable in the dataset, which has values of 0 and 1. A value of 1 indicates that the respondent chose the first response option (none of the children are in school), and a value of 0 means that at least one of their children is in school. By filtering the data to those with a value of 0, we restrict the analysis to respondents with at least one child attending school.
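The filtering step described above can be sketched on a small, made-up data frame. The object names toy and toy_in_school and the values in them are hypothetical and are not part of the survey data; only the Educ_NotInSchool variable name comes from the text.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: one row per respondent (not the real survey data)
toy <- tibble(
  id = 1:5,
  Educ_NotInSchool = c(0, 1, 0, 0, 1)
)

# Keep only respondents with at least one child in school (value of 0)
toy_in_school <- toy %>%
  filter(Educ_NotInSchool == 0)

toy_in_school
```

The chapter applies this same condition to both the raw data (ambarom) and the survey design object (ambarom_des).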
Now, let’s review the data for those who selected one of the next three response options:

  - No, their classes continued normally: Educ_NormalSchool
  - Yes, they went to virtual or remote classes: Educ_VirtualSchool
  - Yes, they switched to a combination of virtual and in-person classes: Educ_Hybrid

The unweighted cross-tab for these responses is included below. It shows that all eight possible combinations of these three responses occur, although virtual classes alone (7,554 respondents) is by far the most common.

ambarom %>%
  filter(Educ_NotInSchool == 0) %>%
  count(Educ_NormalSchool, Educ_VirtualSchool, Educ_Hybrid)
## # A tibble: 8 × 4
##   Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid     n
##               <dbl>              <dbl>       <dbl> <int>
## 1                 0                  0           0   861
## 2                 0                  0           1  1192
## 3                 0                  1           0  7554
## 4                 0                  1           1   280
## 5                 1                  0           0   833
## 6                 1                  0           1    18
## 7                 1                  1           0    72
## 8                 1                  1           1     7

In reviewing the survey question, we might be interested in knowing the answers to the following:

  - What percentage of households indicated that school continued as normal with no virtual or hybrid option?
  - What percentage of households indicated that the education medium was changed to either virtual or hybrid?
  - What percentage of households indicated that they cut ties with their school?

To find the answers, we create indicators for the first two questions, make national estimates for all three questions, and then construct a summary table for easy viewing. First, we create and inspect the indicators and their distributions using survey_count().
ambarom_des_educ <- ambarom_des %>%
  # Restrict to households with at least one child in school
  filter(Educ_NotInSchool == 0) %>%
  mutate(
    # TRUE only when classes continued normally with no virtual or hybrid component
    Educ_OnlyNormal = (Educ_NormalSchool == 1 &
      Educ_VirtualSchool == 0 &
      Educ_Hybrid == 0),
    # TRUE when classes moved to virtual, hybrid, or both
    Educ_MediumChange = (Educ_VirtualSchool == 1 |
      Educ_Hybrid == 1)
  )

ambarom_des_educ %>%
  survey_count(
    Educ_OnlyNormal,
    Educ_NormalSchool,
    Educ_VirtualSchool,
    Educ_Hybrid
  )
## # A tibble: 8 × 6
##   Educ_OnlyNormal Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid
##   <lgl>                       <dbl>              <dbl>       <dbl>
## 1 FALSE                           0                  0           0
## 2 FALSE                           0                  0           1
## 3 FALSE                           0                  1           0
## 4 FALSE                           0                  1           1
## 5 FALSE                           1                  0           1
## 6 FALSE                           1                  1           0
## 7 FALSE                           1                  1           1
## 8 TRUE                            1                  0           0
## # ℹ 2 more variables: n <dbl>, n_se <dbl>

ambarom_des_educ %>%
  survey_count(Educ_MediumChange, Educ_VirtualSchool, Educ_Hybrid)
## # A tibble: 4 × 5
##   Educ_MediumChange Educ_VirtualSchool Educ_Hybrid     n  n_se
##   <lgl>                          <dbl>       <dbl> <dbl> <dbl>
## 1 FALSE                              0           0  880. 26.1
## 2 TRUE                               0           1  561. 19.2
## 3 TRUE                               1           0 3812. 49.4
## 4 TRUE                               1           1  136.  9.86

Next, we group the data by country and calculate the population estimates for our three questions.
covid_educ_ests <- ambarom_des_educ %>%
  group_by(Country) %>%
  summarize(
    p_onlynormal = survey_mean(Educ_OnlyNormal, na.rm = TRUE) * 100,
    p_mediumchange = survey_mean(Educ_MediumChange, na.rm = TRUE) * 100,
    p_noschool = survey_mean(Educ_NoSchool, na.rm = TRUE) * 100
  )

covid_educ_ests
## # A tibble: 16 × 7
##    Country p_onlynormal p_onlynormal_se p_mediumchange p_mediumchange_se
##    <fct>          <dbl>           <dbl>          <dbl>             <dbl>
##  1 Argent…        5.39            1.14           87.1              1.72
##  2 Brazil         4.28            1.17           81.5              2.33
##  3 Chile          0.715           0.267          96.2              0.962
##  4 Colomb…        2.84            0.727          90.3              1.40
##  5 Domini…        3.75            0.793          87.4              1.45
##  6 Ecuador        5.18            0.963          87.5              1.39
##  7 El Sal…        2.92            0.680          85.8              1.53
##  8 Guatem…        3.00            0.727          82.2              1.73
##  9 Guyana         3.34            0.702          85.3              1.67
## 10 Haiti         81.1             2.25            7.25             1.48
## 11 Hondur…        3.68            0.882          80.7              1.72
## 12 Jamaica        5.42            0.950          88.1              1.43
## 13 Panama         7.20            1.18           89.4              1.42
## 14 Paragu…        4.66            0.939          90.7              1.37
## 15 Peru           2.04            0.604          91.8              1.20
## 16 Uruguay        8.60            1.40           84.3              2.02
## # ℹ 2 more variables: p_noschool <dbl>, p_noschool_se <dbl>

Finally, to view the results for all countries, we can use the {gt} package to construct Table 14.2.
covid_educ_ests_gt <- covid_educ_ests %>%
  gt(rowname_col = "Country") %>%
  cols_label(
    p_onlynormal = "%",
    p_onlynormal_se = "SE",
    p_mediumchange = "%",
    p_mediumchange_se = "SE",
    p_noschool = "%",
    p_noschool_se = "SE"
  ) %>%
  tab_spanner(
    label = "Normal school only",
    columns = c("p_onlynormal", "p_onlynormal_se")
  ) %>%
  tab_spanner(
    label = "Medium change",
    columns = c("p_mediumchange", "p_mediumchange_se")
  ) %>%
  tab_spanner(
    label = "Cut ties with school",
    columns = c("p_noschool", "p_noschool_se")
  ) %>%
  fmt_number(decimals = 1) %>%
  tab_source_note("AmericasBarometer Surveys, 2021")

covid_educ_ests_gt

TABLE 14.2: Impact on education in households with children under the age of 13 who had children that would generally attend school

                      Normal school only    Medium change    Cut ties with school
                          %      SE           %      SE           %      SE
Argentina               5.4     1.1        87.1     1.7         9.9     1.6
Brazil                  4.3     1.2        81.5     2.3        22.1     2.5
Chile                   0.7     0.3        96.2     1.0         4.0     1.0
Colombia                2.8     0.7        90.3     1.4         7.5     1.3
Dominican Republic      3.8     0.8        87.4     1.5        10.5     1.4
Ecuador                 5.2     1.0        87.5     1.4         7.9     1.1
El Salvador             2.9     0.7        85.8     1.5        11.8     1.4
Guatemala               3.0     0.7        82.2     1.7        17.7     1.8
Guyana                  3.3     0.7        85.3     1.7        13.0     1.6
Haiti                  81.1     2.3         7.2     1.5        11.7     1.8
Honduras                3.7     0.9        80.7     1.7        16.9     1.6
Jamaica                 5.4     0.9        88.1     1.4         7.5     1.2
Panama                  7.2     1.2        89.4     1.4         3.8     0.9
Paraguay                4.7     0.9        90.7     1.4         6.4     1.2
Peru                    2.0     0.6        91.8     1.2         6.8     1.1
Uruguay                 8.6     1.4        84.3     2.0         8.0     1.6
AmericasBarometer Surveys, 2021

In the countries that were
asked this question, most households experienced a change in their child’s education medium: more than 80% in every country except Haiti. In Haiti, only 7.2% of households with children in school switched to virtual or hybrid learning, while 81.1% reported that classes continued normally.

14.6 Mapping survey data

While the table effectively presents the data, a map could also be insightful. To generate maps of the countries, we can use the package {rnaturalearth} and subset North and South America with the ne_countries() function (Massicotte and South 2023). The function returns an sf (simple features) object with many columns (Pebesma and Bivand 2023), of which the most important here are sovereignt (sovereignty), geounit (country or territory), and geometry (the shape). To illustrate the difference between sovereignty and country/territory, the United States, Puerto Rico, and the US Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure 14.1 using geom_sf() from the {ggplot2} package, which plots sf objects (Wickham 2016).

country_shape <- ne_countries(
  scale = "medium",
  returnclass = "sf",
  continent = c("North America", "South America")
)

country_shape %>%
  ggplot() +
  geom_sf()

FIGURE 14.1: Map of North and South America

The map in Figure 14.1 appears very wide due to the Aleutian Islands in Alaska extending into the Eastern Hemisphere. We can crop the shapefile to include only the Western Hemisphere, which removes some of the trailing islands of Alaska, using st_crop() from the {sf} package.

country_shape_crop <- country_shape %>%
  st_crop(c(
    xmin = -180,
    xmax = 0,
    ymin = -90,
    ymax = 90
  ))

Now that we have the necessary shapefiles, our next step is to match our survey data to the map. Countries can be named differently (e.g., “U.S”, “U.S.A”, “United States”). To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data.
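To see why the names must agree, here is a minimal sketch using made-up values. The objects toy_ests, toy_map, and toy_joined and the pct column are hypothetical and are not part of the survey or map data. A join on mismatched names does not fail; it leaves the unmatched country without an estimate:

```r
library(dplyr)
library(tibble)

# Hypothetical toy estimates keyed by survey-style country names
toy_ests <- tibble(
  Country = c("Mexico", "United States"),
  pct = c(10, 20) # made-up values for illustration only
)

# Hypothetical toy map table keyed by map-style country names
toy_map <- tibble(
  geounit = c("Mexico", "United States of America")
)

# Join the estimates onto the map names; names with no match get NA
toy_joined <- toy_map %>%
  left_join(toy_ests, by = c("geounit" = "Country"))

toy_joined
```

Because the mismatch produces a missing value rather than an error, it is easy to overlook unless the names are checked explicitly.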
To find mismatched names, we can use the anti_join() function to identify the countries in the survey data that aren’t in the map data. For example, as shown below, the United States is referred to as “United States” in the survey data but “United States of America” in the map data. Table 14.3 shows the countries in the survey data but not the map data, and Table 14.4 shows the countries in the map data but not the survey data.

survey_country_list <- ambarom %>% distinct(Country)

survey_country_list_gt <- survey_country_list %>%
  anti_join(country_shape_crop, by = c("Country" = "geounit")) %>%
  gt()

survey_country_list_gt

TABLE 14.3: Countries in the survey data but not the map data

Country
United States

map_country_list_gt <- country_shape_crop %>%
  as_tibble() %>%
  select(geounit, sovereignt) %>%
  anti_join(survey_country_list, by = c("geounit" = "Country")) %>%
  arrange(geounit) %>%
  gt()

map_country_list_gt

TABLE 14.4: Countries in the map data but not the survey data
geounit sovereignt Anguilla United Kingdom Antigua and Barbuda Antigua and Barbuda Aruba Netherlands Barbados Barbados Belize Belize Bermuda United
Kingdom British Virgin Islands United Kingdom Cayman Islands United Kingdom Cuba Cuba Curaçao Netherlands Dominica Dominica Falkland Islands United Kingdom Greenland Denmark Grenada Grenada Montserrat United Kingdom Puerto Rico United States of America Saint Barthelemy France Saint Kitts and Nevis Saint Kitts and Nevis Saint Lucia Saint Lucia Saint Martin France Saint Pierre and Miquelon France Saint Vincent and the Grenadines Saint Vincent and the Grenadines Sint Maarten Netherlands Suriname Suriname The Bahamas The Bahamas Trinidad and Tobago Trinidad and Tobago Turks and Caicos Islands United Kingdom United States Virgin Islands United States of America United States of America United States of America Venezuela Venezuela There are several ways to fix the mismatched names for a successful join. The simplest solution is to rename the data in the shape object before merging. Since only one country name in the survey data differs from the map data, we rename the map data accordingly. country_shape_upd &lt;- country_shape_crop %&gt;% mutate(geounit = if_else(geounit == &quot;United States of America&quot;, &quot;United States&quot;, geounit)) Now that the country names match, we can merge the survey and map data and then plot the data. We begin with the map file and merge it with the survey estimates generated in Section 14.5 (covid_worry_country_ests and covid_educ_ests). We use the {sf} function of full_join(), which joins the rows in the map data and the survey estimates based on the columns geounit and Country. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an NA for the missing value (Pebesma and Bivand 2023). 
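Before running the join on the real data, the full-join behavior described above can be checked on a toy example. This is a minimal sketch, not the book's data: toy_map, toy_ests, and the values in them are hypothetical stand-ins, and plain tibbles are used in place of {sf} objects.

```r
library(dplyr)

# Hypothetical stand-ins for the map data and the survey estimates
toy_map <- tibble(geounit = c("Brazil", "Cuba", "United States"))
toy_ests <- tibble(
  Country = c("Brazil", "United States"),
  p = c(0.61, 0.42)
)

# Rows of the map data that have no match in the survey estimates
anti_join(toy_map, toy_ests, by = c("geounit" = "Country"))

# A full join keeps every row from both tables;
# Cuba has no estimate, so its value of p is NA
toy_joined <- full_join(toy_map, toy_ests, by = c("geounit" = "Country"))
toy_joined
```

Running anti_join() first is one way to see which names fail to match before merging.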
covid_sf &lt;- country_shape_upd %&gt;% full_join(covid_worry_country_ests, by = c(&quot;geounit&quot; = &quot;Country&quot;)) %&gt;% full_join(covid_educ_ests, by = c(&quot;geounit&quot; = &quot;Country&quot;)) After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID (Figure 14.2) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure 14.3). We also add a cross-hatching pattern to the countries without any data using the geom_sf_pattern() function from the {ggpattern} package (FC, Davis, and ggplot2 authors 2022). ggplot() + geom_sf(data = covid_sf, aes(fill = p, geometry = geometry), color = &quot;darkgray&quot;) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;Percent&quot;, labels = scales::comma, colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;), na.value = NA ) + geom_sf_pattern( data = filter(covid_sf, is.na(p)), pattern = &quot;crosshatch&quot;, pattern_fill = &quot;lightgray&quot;, pattern_color = &quot;lightgray&quot;, fill = NA, color = &quot;darkgray&quot; ) + theme_minimal() FIGURE 14.2: Percent of households worried someone in their household will get COVID-19 in the next 3 months by country ggplot() + geom_sf( data = covid_sf, aes(fill = p_mediumchange, geometry = geometry), color = &quot;darkgray&quot; ) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;Percent&quot;, labels = scales::comma, colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;), na.value = NA ) + geom_sf_pattern( data = filter(covid_sf, is.na(p_mediumchange)), pattern = &quot;crosshatch&quot;, pattern_fill = &quot;lightgray&quot;, pattern_color = &quot;lightgray&quot;, fill = NA, color = &quot;darkgray&quot; ) + theme_minimal() FIGURE 14.3: Percent of households who had at least one child participate in virtual or hybrid learning In Figure 14.3, we observe missing data (represented by 
the crosshatch pattern) for Canada, Mexico, and the United States. The questionnaires indicate that these three countries did not include the education question in the survey. To focus on countries with available data, we can remove North America from the map and show only Central and South America. We do this below by restricting the shape files to Latin America and the Caribbean, as depicted in Figure 14.4. covid_c_s &lt;- covid_sf %&gt;% filter(region_wb == &quot;Latin America &amp; Caribbean&quot;) ggplot() + geom_sf( data = covid_c_s, aes(fill = p_mediumchange, geometry = geometry), color = &quot;darkgray&quot; ) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;Percent&quot;, labels = scales::comma, colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;), na.value = NA ) + geom_sf_pattern( data = filter(covid_c_s, is.na(p_mediumchange)), pattern = &quot;crosshatch&quot;, pattern_fill = &quot;lightgray&quot;, pattern_color = &quot;lightgray&quot;, fill = NA, color = &quot;darkgray&quot; ) + theme_minimal() FIGURE 14.4: Percent of households who had at least one child participate in virtual or hybrid learning, Central and South America In Figure 14.4, we can see that most countries with available data have similar percentages (reflected in their similar shades). However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning. 14.7 Exercises Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: if you come across countries with 0% internet usage, you may want to filter by something first. Create a faceted map showing both broadband internet and any internet usage. References FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. ggpattern: Ggplot2 Pattern Geoms. 
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. LAPOP. 2021a. “AmericasBarometer 2021 - Canada: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021b. “AmericasBarometer 2021 - U.S.: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABUSA2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021c. “AmericasBarometer 2021: Technical Information.” Vanderbilt University; https://www.vanderbilt.edu/lapop/ab2021/AB2021-Technical-Report-v1.0-FINAL-eng-030722.pdf. ———. 2021d. “Core Questionnaire.” https://www.vanderbilt.edu/lapop/ab2021/AB2021-Core-Questionnaire-v17.5-Eng-210514-W-v2.pdf. ———. 2023a. “About the AmericasBarometer.” https://www.vanderbilt.edu/lapop/about-americasbarometer.php. ———. 2023b. “The AmericasBarometer by the LAPOP Lab.” www.vanderbilt.edu/lapop. Massicotte, Philippe, and Andy South. 2023. rnaturalearth: World Map Data from Natural Earth. https://docs.ropensci.org/rnaturalearth/ Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016. Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org. ———. 2023a. forcats: Tools for Working with Categorical Variables (Factors). Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export SPSS, Stata and SAS Files. 
See Table 2 in LAPOP (2021c) for dates by country↩︎ "],["importing-survey-data-into-r.html", "A Importing survey data into R A.1 Importing delimiter-separated files into R A.2 Loading Excel files into R A.3 Importing Stata, SAS, and SPSS files into R A.4 Importing data from APIs into R A.5 Accessing databases in R A.6 Importing data from other formats", " A Importing survey data into R To analyze a survey, we need to import the survey data into R. This process is often referred to as importing, loading, or reading in data. Survey files come in different formats depending on the software used to create them. One of the many advantages of R is the flexibility in handling various data formats, regardless of their file extensions. Here are examples of common public-use survey file formats we may encounter: Delimiter-separated text files Excel spreadsheets in .xls or .xlsx format R native .rda files Stata datasets in .dta format SAS datasets in .sas7bdat format SPSS datasets in .sav format Application Programming Interfaces (APIs), often in JSON format Data stored in databases This appendix guides analysts through the process of importing these various types of survey data into R. A.1 Importing delimiter-separated files into R Delimiter-separated files use specific characters, known as delimiters, to separate values within the file. For example, CSV (Comma-Separated Values) files use commas as delimiters, while TSV (Tab-Separated Values) files use tabs. These file formats are widely used because of their simplicity and compatibility with various software applications. The {readr} package, part of the tidyverse ecosystem, offers efficient ways to import delimiter-separated files into R (Wickham, Hester, and Bryan 2023). It provides several advantages, including automatic data type detection and flexible handling of missing values, depending on one’s survey research needs. 
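The two advantages named above, type detection and missing-value handling, can be seen in a small sketch. The inline dataset and the use of -99 as a missing-value code are assumptions made for illustration; they are not taken from any survey in this book.

```r
library(readr)

# Hypothetical inline data; I() tells readr to treat the string as
# literal data rather than as a file path
toy_csv <- I("id,age,weight\n1,34,1.02\n2,-99,0.87\n3,51,NA\n")

toy <- read_csv(
  toy_csv,
  na = c("", "NA", "-99"),   # also treat the code -99 as missing
  show_col_types = FALSE     # suppress the column-type message
)

toy        # -99 and NA are both read in as missing
spec(toy)  # the column types that readr guessed
```

Listing survey-specific missing codes in na at import time avoids having to recode them later.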
The {readr} package includes functions for: read_csv(): This function is specifically designed to read CSV files. read_tsv(): Use this function for Tab-Separated Values (TSV) files. read_delim(): This function can handle a broader range of delimiter-separated files, including CSV and TSV. Specify the delimiter using the delim argument. read_fwf(): This function is useful for importing Fixed-Width Files, where columns have predetermined widths, and values are aligned in specific positions. read_table(): Use this function when dealing with whitespace-separated files, such as those with one or more spaces as delimiters. read_log(): This function can read and parse web log files. The syntax for read_csv() is: read_csv( file, col_names = TRUE, col_types = NULL, col_select = NULL, id = NULL, locale = default_locale(), na = c(&quot;&quot;, &quot;NA&quot;), comment = &quot;&quot;, trim_ws = TRUE, skip = 0, n_max = Inf, guess_max = min(1000, n_max), name_repair = &quot;unique&quot;, num_threads = readr_threads(), progress = show_progress(), show_col_types = should_show_types(), skip_empty_rows = TRUE, lazy = should_read_lazy() ) The arguments are: file: the path to the CSV file to import col_names: a value of TRUE will use the first row of the file as column names and exclude it from the data frame. A value of FALSE will create automated column names. Alternatively, we can provide a vector of column names. col_types: by default, {readr} will infer the column variable types. We can also provide a column specification using list() or cols(); for example, use col_types = cols(.default = \"c\") to read all the columns as characters. Alternatively, we can use a string to specify the variable types for each column. col_select: the columns to include in the results id: a column for storing the file path. This is useful for keeping track of the input file when importing multiple CSVs at a time. 
locale: the location-specific defaults for the file na: a character vector of values to interpret as missing comment: a character vector of values to interpret as comments trim_ws: a value of TRUE will trim leading and trailing white space skip: number of lines to skip before importing the data n_max: maximum number of lines to read guess_max: maximum number of lines used for guessing column types name_repair: how to check and repair column names. By default, the column names are made unique. num_threads: the number of processing threads to use for initial parsing and lazy reading of data progress: a value of TRUE displays a progress bar show_col_types: a value of TRUE displays the column types skip_empty_rows: a value of TRUE will ignore blank rows lazy: a value of TRUE will read values lazily The other functions share a similar syntax to read_csv(). To find more details, run ? followed by the function name. For example, run ?read_delim in the Console for additional information. In the example below, we use {readr} to load a CSV file named ‘anes_timeseries_2020_csv_20220210.csv’ into an R object called anes_csv. The read_csv() function imports the file and stores the data in the anes_csv object. We can then use this object for further analysis. library(readr) anes_csv &lt;- read_csv(&quot;data/anes_timeseries_2020_csv_20220210.csv&quot;) A.2 Loading Excel files into R Excel, a widely used spreadsheet software program created by Microsoft, produces files that are common in survey research. We can load Excel spreadsheets into the R environment using the {readxl} package. The package supports both the legacy .xls files and the modern .xlsx format. To load Excel data into R, we can use the read_excel() function from the {readxl} package. This function offers a range of customizable options for the import process. 
Let’s explore the syntax: read_excel( path, sheet = NULL, range = NULL, col_names = TRUE, col_types = NULL, na = &quot;&quot;, trim_ws = TRUE, skip = 0, n_max = Inf, guess_max = min(1000, n_max), progress = readxl_progress(), .name_repair = &quot;unique&quot; ) The arguments are: path: the path to the Excel file to import sheet: the name or index of the sheet (sometimes called a tab) within the Excel file range: the range of cells to import (for example, “P15:T87”) col_names: indicates whether the first row of the dataset contains column names col_types: specifies the data types of columns na: defines the strings to interpret as missing values (for example, &quot;NA&quot;) trim_ws: controls whether leading and trailing whitespace should be trimmed skip and n_max: enable skipping rows and limiting the number of rows imported guess_max: sets the maximum number of rows used for data type guessing progress: specifies whether to display a progress bar for large imports .name_repair: determines how column names are repaired if they are not valid In the code example below, we load an Excel spreadsheet named ‘anes_timeseries_2020_csv_20220210.xlsx’ into R. The resulting data is saved as a tibble in the anes_excel object, ready for further analysis. library(readxl) anes_excel &lt;- read_excel(path = &quot;data/anes_timeseries_2020_csv_20220210.xlsx&quot;) A.3 Importing Stata, SAS, and SPSS files into R The {haven} package, also from the tidyverse ecosystem, imports various proprietary data formats: Stata .dta files, SPSS .sav files, and SAS .sas7bdat and .sas7bcat files (Wickham, Miller, and Smith 2023). One of the notable strengths of the {haven} package is its ability to handle multiple proprietary formats within a unified framework. It offers dedicated functions for each supported proprietary format, making it straightforward to import data regardless of the program. Here, we introduce read_dta() for Stata files, read_sav() for SPSS files, and read_sas() for SAS files. 
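To see the unified framework in action without needing any external files, one option is to write a small data frame to temporary Stata and SPSS files and read each back. This is only a sketch: the toy data frame and the temporary paths are hypothetical, and {haven}'s write_dta() and write_sav() are used here purely to create files to read.

```r
library(haven)

# Hypothetical toy data
toy <- data.frame(id = 1:3, weight = c(1.2, 0.8, 1.0))

# Write the same data to temporary Stata and SPSS files
dta_path <- tempfile(fileext = ".dta")
sav_path <- tempfile(fileext = ".sav")
write_dta(toy, dta_path)
write_sav(toy, sav_path)

# The matching read functions share the same basic interface
toy_dta <- read_dta(dta_path)
toy_sav <- read_sav(sav_path)

toy_dta
toy_sav
```

Both calls return tibbles, so downstream code does not need to know which program produced the file.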
A.3.1 Syntax Let’s explore the syntax for importing Stata .dta files using haven::read_dta(): read_dta( file, encoding = NULL, col_select = NULL, skip = 0, n_max = Inf, .name_repair = &quot;unique&quot; ) The arguments are: file: the path to the proprietary data file to import encoding: specifies the character encoding of the data file col_select: select specific columns for import skip and n_max: control the number of rows skipped and the maximum number of rows imported .name_repair: determines how column names are repaired if they are not valid The syntax for read_sav() is similar to read_dta(): read_sav( file, encoding = NULL, user_na = FALSE, col_select = NULL, skip = 0, n_max = Inf, .name_repair = &quot;unique&quot; ) The arguments are: file: the path to the proprietary data file to import encoding: specifies the character encoding of the data file col_select: select specific columns for import user_na: a value of TRUE will read variables with user-defined missing values into labelled_spss() objects skip and n_max: control the number of rows skipped and the maximum number of rows imported .name_repair: determines how column names are repaired if they are not valid The syntax for importing SAS files with read_sas() is as follows: read_sas( data_file, catalog_file = NULL, encoding = NULL, catalog_encoding = encoding, col_select = NULL, skip = 0L, n_max = Inf, .name_repair = &quot;unique&quot; ) The arguments are: data_file: the path to the proprietary data file to import catalog_file: the path to the catalog file to import encoding: specifies the character encoding of the data file catalog_encoding: specifies the character encoding of the catalog file col_select: select specific columns for import skip and n_max: control the number of rows skipped and the maximum number of rows imported .name_repair: determines how column names are repaired if they are not valid In the code examples below, we demonstrate how to load Stata, SPSS, and SAS files 
into R using the respective {haven} functions. The resulting data is stored in anes_dta, anes_sav, and anes_sas objects as tibbles, ready for use in R. Stata: library(haven) anes_dta &lt;- read_dta(system.file(&quot;extdata&quot;, &quot;anes_2020_stata_example.dta&quot;, package=&quot;srvyrexploR&quot;)) SPSS: library(haven) anes_sav &lt;- read_sav(file = &quot;data/anes_timeseries_2020_spss_20220210.sav&quot;) SAS: library(haven) anes_sas &lt;- read_sas(file = &quot;data/anes_timeseries_2020_sas_20220210.sas7bdat&quot;) A.3.2 Working with labeled data Stata, SPSS, and SAS files often contain labeled variables and values. These labels provide descriptive information about categorical data, making it easier to understand and analyze. When importing data from Stata, SPSS, or SAS, preserving these labels is essential for maintaining data fidelity. Consider a variable like ‘Education Level’ with coded values (e.g., 1, 2, 3). Without labels, these codes can be cryptic. However, with labels (‘High School Graduate,’ ‘Bachelor’s Degree,’ ‘Master’s Degree’), the data becomes more informative and easier to work with. With the {haven} package, we have the capability to import and work with labeled data from Stata, SPSS, and SAS files. The package uses a special class of data called haven_labelled to store labeled variables. When a dataset label is defined in Stata, it is stored in the ‘label’ attribute of the tibble when imported, ensuring that the information is not lost. We can use functions like select(), glimpse(), and is.labelled() to inspect the imported data and verify if variables are labeled. Take a look at the ANES Stata file. Notice that categorical variables are marked with a type of &lt;dbl+lbl&gt;. This notation indicates that these variables are labeled. 
library(dplyr) anes_dta %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;dbl+lbl&gt; 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;dbl+lbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1… We can confirm this label status using the haven::is.labelled() function. haven::is.labelled(anes_dta$V200002) ## [1] TRUE To explore the labels further, we can use the attributes() function. This function provides insights into both the variable labels ($label) and the associated value labels ($labels). attributes(anes_dta$V200002) ## $label ## [1] &quot;Mode of interview: pre-election interview&quot; ## ## $format.stata ## [1] &quot;%10.0g&quot; ## ## $class ## [1] &quot;haven_labelled&quot; &quot;vctrs_vctr&quot; &quot;double&quot; ## ## $labels ## 1. Video 2. Telephone 3. Web ## 1 2 3 When we import a labeled dataset using {haven}, it results in a tibble containing both the data and label information. However, this is meant to be an intermediary data structure and not intended to be the final data format for analysis. Instead, we should convert it into a regular R data frame before continuing our data workflow. There are two primary methods to achieve this conversion: (1) convert to factors or (2) remove the labels. Option 1: Convert the vector into a factor Factors are native R data types for working with categorical data. They consist of integer values that correspond to character values, known as levels. Below is a dummy example of factors. Printing factors shows the four different levels in the data: strongly agree, agree, disagree, and strongly disagree. 
response &lt;- c(&quot;strongly agree&quot;, &quot;agree&quot;, &quot;agree&quot;, &quot;disagree&quot;) response_levels &lt;- c(&quot;strongly agree&quot;, &quot;agree&quot;, &quot;disagree&quot;, &quot;strongly disagree&quot;) factors &lt;- factor(response, levels = response_levels) factors ## [1] strongly agree agree agree disagree ## Levels: strongly agree agree disagree strongly disagree Factors are integer vectors, though they may look like character strings. We can confirm by looking at the vector’s structure: glimpse(factors) ## Factor w/ 4 levels &quot;strongly agree&quot;,..: 1 2 2 3 R’s factors differ from Stata, SPSS, or SAS’ labeled vectors. However, we can convert labeled variables into factors using the as_factor() function. anes_dta %&gt;% transmute(V200002 = as_factor(V200002)) ## # A tibble: 7,453 × 1 ## V200002 ## &lt;fct&gt; ## 1 3. Web ## 2 3. Web ## 3 3. Web ## 4 3. Web ## 5 3. Web ## 6 3. Web ## 7 3. Web ## 8 3. Web ## 9 3. Web ## 10 3. Web ## # ℹ 7,443 more rows The as_factor() function can be applied to all columns in a data frame or individual ones. Below, we convert all &lt;dbl+lbl&gt; columns into factors. anes_dta_factor &lt;- anes_dta %&gt;% as_factor() anes_dta_factor %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;fct&gt; 3. Web, 3. Web, 3. Web, 3. Web, 3. Web, 3. Web, 3. We… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;fct&gt; 2. Somewhat interested, 3. Not much interested, 2. So… Option 2: Strip the labels The second option is to remove the labels altogether, converting the labeled data into a regular R data frame. 
To remove, or ‘zap’ the labels from our tibble, we can use the {haven} package’s zap_label() and zap_labels() functions. This approach removes the labels but retains the data values in their original form. The ANES Stata file columns contain variable labels. Using purrr’s map(), we can review the labels using attr(). In the example below, we list the first two variables and their labels. For instance, the label for V200002 is “Mode of interview: pre-election interview”. purrr::map(anes_dta, ~attr(.x, &quot;label&quot;)) %&gt;% head(2) ## $V200001 ## [1] &quot;2020 Case ID&quot; ## ## $V200002 ## [1] &quot;Mode of interview: pre-election interview&quot; Use zap_label() to remove the variable labels but retain the value labels. Notice that the variable label for V200001 now returns NULL. For V200002, attr() matches attribute names partially by default, so asking for label returns the remaining labels attribute, which holds the value labels. zap_label(anes_dta) %&gt;% purrr::map(~attr(.x, &quot;label&quot;)) %&gt;% head(2) ## $V200001 ## NULL ## ## $V200002 ## 1. Video 2. Telephone 3. Web ## 1 2 3 To remove the value labels, use zap_labels(). Notice the previous &lt;dbl+lbl&gt; columns are now &lt;dbl&gt;. zap_labels(anes_dta) %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;dbl&gt; 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;dbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1,… While it is important to convert labeled datasets into regular R data frames for working in R, the labels themselves often contain valuable information that provides context and meaning to the survey variables. To aid with interpretability and documentation, consider creating a data dictionary from the labeled dataset. 
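Before turning to data dictionaries, the difference between the two zap functions and as_factor() can be compared on a single small vector. This is a sketch under stated assumptions: interview_mode is a hypothetical labelled vector built with haven::labelled(), loosely mimicking the ANES interview-mode variable, and is not the ANES data itself.

```r
library(haven)

# Hypothetical labelled vector with a variable label and value labels
interview_mode <- labelled(
  c(3, 1, 2, 3),
  labels = c("Video" = 1, "Telephone" = 2, "Web" = 3),
  label = "Mode of interview"
)

# zap_label() drops the variable label only.
# exact = TRUE stops attr() from partially matching "label" to "labels".
no_var_label <- zap_label(interview_mode)
attr(no_var_label, "label", exact = TRUE)   # NULL
attr(no_var_label, "labels", exact = TRUE)  # value labels still present

# zap_labels() drops the value labels, leaving a plain numeric vector
no_value_labels <- zap_labels(interview_mode)
is.labelled(no_value_labels)

# as_factor() keeps the label text as factor levels instead
mode_factor <- as_factor(interview_mode)
levels(mode_factor)
```

Using exact = TRUE is optional, but it makes it unambiguous which attribute is being inspected.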
A data dictionary is a reference document that provides detailed information about the variables and values of a survey. The {labelled} package offers a convenient function, generate_dictionary(), that creates data dictionaries directly from a labeled dataset (Larmarange 2023). This function extracts variable labels, value labels, and other metadata and organizes them into a structured document that we can browse and reference throughout our analysis. Let’s create a data dictionary from the ANES Stata dataset as an example: library(labelled) dictionary &lt;- generate_dictionary(anes_dta) Once we’ve generated the data dictionary, we can take a look at the V200002 variable and see the label, column type, number of missing entries, and associated values. dictionary %&gt;% filter(variable == &quot;V200002&quot;) ## pos variable label col_type missing ## 2 V200002 Mode of interview: pre-electi~ dbl+lbl 0 ## ## ## values ## [1] 1. Video ## [2] 2. Telephone ## [3] 3. Web A.3.3 Labeled missing data values In survey data analysis, dealing with missing values is a crucial aspect of data preparation. Stata, SPSS, and SAS files each have their own methods for handling missing values. Stata has “extended” missing values, .A through .Z. SAS has “special” missing values, .A through .Z and ._. SPSS has per-column “user” missing values. Each column can declare up to three distinct values or a range of values (plus one distinct value) that should be treated as missing. To represent the SAS and Stata missing values, {haven} uses a concept known as ‘tagged’ missing values, which extend R’s regular NA. A ‘tagged’ missing value is essentially an NA with an additional single-character label. These values behave identically to regular NA in standard R operations while preserving the informative tag associated with the missing value. Here is an example from the NORC at the University of Chicago’s 2018 General Social Survey. 
head(gss_dta$HEALTH) #&gt; &lt;labelled&lt;double&gt;[6]&gt;: condition of health #&gt; [1] 2 1 NA(i) NA(i) 1 2 #&gt; #&gt; Labels: #&gt; value label #&gt; 1 excellent #&gt; 2 good #&gt; 3 fair #&gt; 4 poor #&gt; NA(d) DK #&gt; NA(i) IAP #&gt; NA(n) NA In contrast, SPSS uses a different approach called ‘user-defined values’ to denote missing values. Each column in an SPSS dataset can have up to three distinct values designated as missing or a specified range of missing values. To model these additional user-defined missing values, {haven} provides the labelled_spss() subclass of labelled(). When you import SPSS data using {haven} with user_na = TRUE, it ensures that user-defined missing values are correctly handled. You can work with this data in R while preserving the unique missing value conventions from SPSS. Here is what the GSS SPSS data looks like when loaded with {haven}. head(gss_sps$HEALTH) #&gt; &lt;labelled_spss&lt;double&gt;[6]&gt;: Condition of health #&gt; [1] 2 1 0 0 1 2 #&gt; Missing values: 0, 8, 9 #&gt; #&gt; Labels: #&gt; value label #&gt; 0 IAP #&gt; 1 EXCELLENT #&gt; 2 GOOD #&gt; 3 FAIR #&gt; 4 POOR #&gt; 8 DK #&gt; 9 NA A.4 Importing data from APIs into R In addition to working with data saved as files, we may also need to retrieve data through Application Programming Interfaces (APIs). APIs provide a structured way to access data hosted on external servers and import it directly into R for analysis. To access this data, you need to understand how to construct API requests. Each API has unique endpoints, parameters, and authentication requirements. Pay attention to: Endpoints: These are URLs that point to specific data or services. Parameters: Information you pass to the API to customize your request (e.g., date ranges, filters). Authentication: APIs may require API keys or tokens for access. Rate Limits: APIs may have usage limits, so be aware of any rate limits or quotas. Typically, we begin by making a GET request to an API endpoint. 
The {httr2} package allows us to generate and process HTTP requests (Wickham 2023b). We can make the GET request by pointing to the URL that contains the data we would like. library(httr2) api_url &lt;- &quot;https://api.example.com/survey-data&quot; response &lt;- req_perform(request(api_url)) Once we make the request, we will obtain the data as the response. The data often comes in JSON format. We can extract and parse the data using the {jsonlite} package, allowing us to work with it in R (Ooms 2014). The fromJSON() function, shown below, converts JSON data to an R object. library(jsonlite) survey_data &lt;- fromJSON(resp_body_string(response)) Note that these are dummy examples. Please review the documentation to understand how to make requests from your specific API. R offers several packages that simplify API access by providing ready-to-use functions for popular APIs. These packages are called “wrappers”, as they “wrap” the API to make it easier to use. For example, the {tidycensus} package used in this book simplifies access to U.S. Census data, allowing us to retrieve data with R commands instead of writing complex API requests (Walker and Herman 2024). If we are interested in the population (B01003_001) of each census tract in North Carolina from the 2020 ACS, we would use the get_acs() function and the code below. Behind the scenes, get_acs() is making a GET request from the Census API and the {tidycensus} functions are converting the response into an R-friendly format. library(tidycensus) census_data &lt;- get_acs( geography = &quot;tract&quot;, variables = &quot;B01003_001&quot;, year = 2020, state = &quot;NC&quot; ) In Chapter 4, we used the {censusapi} package to get data from the Census data API for the Current Population Survey. To discover if there’s an R package that directly interfaces with a specific survey or data source, search for “[survey] R wrapper” or “[data source] R package” online. 
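The pieces of an API request described in this section (endpoint, parameters, authentication, and rate limits) can be sketched with {httr2} as follows. This is only an illustration under stated assumptions: the endpoint, the query parameters year and state, and the token MY_API_TOKEN are made up, so we build the request and inspect its URL without sending it, and we parse a small made-up JSON payload instead of a real response.

```r
library(httr2)
library(jsonlite)

# Hypothetical endpoint, for illustration only
api_url <- 'https://api.example.com/survey-data'

# Build the request step by step; nothing is sent to the server yet
req <- request(api_url) |>
  req_url_query(year = 2020, state = 'NC') |> # parameters (assumed names)
  req_auth_bearer_token('MY_API_TOKEN') |>    # authentication (placeholder token)
  req_throttle(rate = 30 / 60) |>             # stay under an assumed limit of 30 requests per minute
  req_retry(max_tries = 3)                    # retry transient failures a few times

# Inspect the URL that would be requested
req$url

# With a real endpoint, we would perform the request and parse the body:
# response <- req_perform(req)
# survey_data <- fromJSON(resp_body_string(response))

# Parsing works the same way on any JSON text; here is a made-up payload
json_text <- toJSON(data.frame(id = 1:2, mode = c('Web', 'Telephone')))
survey_data <- fromJSON(json_text)
survey_data
```

Building the request separately from performing it lets us check the URL and headers before contacting the server, which is helpful when an API enforces quotas.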
A.5 Accessing databases in R Databases provide a secure and organized solution as the volume and complexity of data grow. We can access, manage, and update data stored in databases in a systematic way. Because of how the data are organized, teams can draw from the same source and obtain any metadata that would be helpful for analysis. There are various ways of working with databases in RStudio. We can connect to different databases through the Connections Pane in the top right of the IDE. We can also use packages like {DBI} and {odbc} to access database tables in R files. Here is an example script connecting to a database: con &lt;- DBI::dbConnect(odbc::odbc(), Driver = &quot;[your driver&#39;s name]&quot;, Server = &quot;[your server&#39;s path]&quot;, UID = rstudioapi::askForPassword(&quot;Database user&quot;), PWD = rstudioapi::askForPassword(&quot;Database password&quot;), Database = &quot;[your database&#39;s name]&quot;, Warehouse = &quot;[your warehouse&#39;s name]&quot;, Schema = &quot;[your schema&#39;s name]&quot; ) The {dbplyr} and {dplyr} packages allow us to make queries and run data analysis entirely using {dplyr} syntax. All of the code can be written in R so we do not have to switch between R and SQL to explore the data. Here is some sample code: q1 &lt;- tbl(con, &quot;bank&quot;) %&gt;% group_by(month_idx, year, month) %&gt;% summarise( subscribe = sum(ifelse(term_deposit == &quot;yes&quot;, 1, 0)), total = n()) show_query(q1) Be sure to check the documentation to configure a database connection. A.6 Importing data from other formats R also offers dedicated packages such as {googlesheets4} for Google Sheets or {qualtRics} for Qualtrics. With less common or proprietary file formats, the broader data science community can often provide guidance. Online resources like Stack Overflow and dedicated forums like Posit Community are valuable sources of information for importing data into R. References Larmarange, Joseph. 2023. 
labelled: Manipulating Labelled Data. https://larmarange.github.io/labelled/. Ooms, Jeroen. 2014. “The Jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805. Walker, Kyle, and Matt Herman. 2024. tidycensus: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames. https://walker-data.com/tidycensus/. Wickham, Hadley. 2023b. httr2: Perform HTTP Requests and Process the Responses. Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. readr: Read Rectangular Text Data. Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export SPSS, Stata and SAS Files. "],["anes-cb.html", "B ANES derived variable codebook B.1 ADMIN B.2 WEIGHTS B.3 PRE-ELECTION SURVEY QUESTIONNAIRE B.4 POST-ELECTION SURVEY QUESTIONNAIRE", " B ANES derived variable codebook The full codebook with the original variables is available at American National Election Studies (2022). This is a codebook for the ANES data used in this book (anes_2020) from the {srvyrexploR} package. 
B.1 ADMIN V200001 Description: 2020 Case ID Variable class: numeric CaseID Description: 2020 Case ID Variable class: numeric V200002 Description: Mode of interview: pre-election interview Variable class: haven_labelled, vctrs_vctr, double V200002 Label n Unweighted Freq 1 Video 274 0.037 2 Telephone 115 0.015 3 Web 7064 0.948 Total 7453 1.000 InterviewMode Description: Mode of interview: pre-election interview Variable class: factor InterviewMode n Unweighted Freq Video 274 0.037 Telephone 115 0.015 Web 7064 0.948 Total 7453 1.000 B.2 WEIGHTS V200010b Description: Full sample post-election weight Variable class: numeric N Missing Minimum Median Maximum 0 0.0083 0.6863 6.651 Weight Description: Full sample post-election weight Variable class: numeric N Missing Minimum Median Maximum 0 0.0083 0.6863 6.651 V200010c Description: Full sample variance unit Variable class: numeric N Missing Minimum Median Maximum 0 1 2 3 VarUnit Description: Full sample variance unit Variable class: factor VarUnit n Unweighted Freq 1 3689 0.495 2 3750 0.503 3 14 0.002 Total 7453 1.000 V200010d Description: Full sample variance stratum Variable class: numeric N Missing Minimum Median Maximum 0 1 24 50 Stratum Description: Full sample variance stratum Variable class: factor Stratum n Unweighted Freq 1 167 0.022 2 148 0.020 3 158 0.021 4 151 0.020 5 147 0.020 6 172 0.023 7 163 0.022 8 159 0.021 9 160 0.021 10 159 0.021 11 137 0.018 12 179 0.024 13 148 0.020 14 160 0.021 15 159 0.021 16 148 0.020 17 158 0.021 18 156 0.021 19 154 0.021 20 144 0.019 21 170 0.023 22 146 0.020 23 165 0.022 24 147 0.020 25 169 0.023 26 165 0.022 27 172 0.023 28 133 0.018 29 157 0.021 30 167 0.022 31 154 0.021 32 143 0.019 33 143 0.019 34 124 0.017 35 138 0.019 36 130 0.017 37 136 0.018 38 145 0.019 39 140 0.019 40 125 0.017 41 158 0.021 42 146 0.020 43 130 0.017 44 126 0.017 45 126 0.017 46 135 0.018 47 133 0.018 48 140 0.019 49 133 0.018 50 130 0.017 Total 7453 1.000 B.3 PRE-ELECTION SURVEY QUESTIONNAIRE V201006 
Description: PRE: How interested in following campaigns Question: Some people don’t pay much attention to political campaigns. How about you? Would you say that you have been very much interested, somewhat interested or not much interested in the political campaigns so far this year? Variable class: haven_labelled, vctrs_vctr, double V201006 Label n Unweighted Freq -9 -9. Refused 1 0.000 1 Very much interested 3940 0.529 2 Somewhat interested 2569 0.345 3 Not much interested 943 0.127 Total 7453 1.000 CampaignInterest Description: PRE: How interested in following campaigns Question: Some people don’t pay much attention to political campaigns. How about you? Would you say that you have been very much interested, somewhat interested or not much interested in the political campaigns so far this year? Variable class: factor CampaignInterest n Unweighted Freq Very much interested 3940 0.529 Somewhat interested 2569 0.345 Not much interested 943 0.127 NA 1 0.000 Total 7453 1.000 V201023 Description: PRE: Confirmation voted (early) in November 3 Election (2020) Question: Just to be clear, I’m recording that you already voted in the election that is scheduled to take place on November 3. Is that right? Variable class: haven_labelled, vctrs_vctr, double V201023 Label n Unweighted Freq -9 -9. Refused 2 0.000 -1 -1. Inapplicable 6961 0.934 1 Yes, voted 375 0.050 2 No, have not voted 115 0.015 Total 7453 1.000 EarlyVote2020 Description: PRE: Confirmation voted (early) in November 3 Election (2020) Question: Just to be clear, I’m recording that you already voted in the election that is scheduled to take place on November 3. Is that right? Variable class: factor EarlyVote2020 n Unweighted Freq Yes 375 0.050 No 115 0.015 NA 6963 0.934 Total 7453 1.000 V201024 Description: PRE: In what manner did R vote Question: Which one of the following best describes how you voted? Variable class: haven_labelled, vctrs_vctr, double V201024 Label n Unweighted Freq -9 -9. Refused 1 0.000 -1 -1. 
Inapplicable 7078 0.950 1 Definitely voted in person at a polling place before election day 101 0.014 2 Definitely voted by mailing a ballot to elections officials before election day 242 0.032 3 Definitely voted in some other way 28 0.004 4 Not completely sure whether you voted or not 3 0.000 Total 7453 1.000 V201025x Description: PRE: SUMMARY: Registration and early vote status Variable class: haven_labelled, vctrs_vctr, double V201025x Label n Unweighted Freq -4 -4. Technical error 1 0.000 1 Not registered (or DK/RF), does not intend to register (or DK/RF intent) 339 0.045 2 Not registered (or DK/RF), intends to register 290 0.039 3 Registered but did not vote early (or DK/RF) 6452 0.866 4 Registered and voted early 371 0.050 Total 7453 1.000 V201028 Description: PRE: DID R VOTE FOR PRESIDENT Question: How about the election for President? Did you vote for a candidate for President? Variable class: haven_labelled, vctrs_vctr, double V201028 Label n Unweighted Freq -9 -9. Refused 1 0.000 -1 -1. Inapplicable 7081 0.950 1 Yes, voted for President 361 0.048 2 No, didn’t vote for President 10 0.001 Total 7453 1.000 V201029 Description: PRE: For whom did R vote for President Question: Who did you vote for? [Joe Biden, Donald Trump/Donald Trump, Joe Biden], Jo Jorgensen, Howie Hawkins, or someone else? Variable class: haven_labelled, vctrs_vctr, double V201029 Label n Unweighted Freq -9 -9. Refused 10 0.001 -1 -1. Inapplicable 7092 0.952 1 Joe Biden 239 0.032 2 Donald Trump 103 0.014 3 Jo Jorgensen 2 0.000 4 Howie Hawkins 1 0.000 5 Other candidate {SPECIFY} 4 0.001 12 Specified as refused 2 0.000 Total 7453 1.000 V201101 Description: PRE: Did R vote for President in 2016 [revised] Question: Four years ago, in 2016, Hillary Clinton ran on the Democratic ticket against Donald Trump for the Republicans. We talk to many people who tell us they did not vote. And we talk to a few people who tell us they did vote, who really did not. 
We can tell they did not vote by checking with official government records. What about you? If we check the official government voter records, will they show that you voted in the 2016 presidential election, or that you did not vote in that election? Variable class: haven_labelled, vctrs_vctr, double V201101 Label n Unweighted Freq -9 -9. Refused 13 0.002 -8 -8. Don’t know 1 0.000 -1 -1. Inapplicable 3780 0.507 1 Yes, voted 2780 0.373 2 No, didn’t vote 879 0.118 Total 7453 1.000 V201102 Description: PRE: Did R vote for President in 2016 Question: Four years ago, in 2016, Hillary Clinton ran on the Democratic ticket against Donald Trump for the Republicans. Do you remember for sure whether or not you voted in that election? Variable class: haven_labelled, vctrs_vctr, double V201102 Label n Unweighted Freq -9 -9. Refused 6 0.001 -8 -8. Don’t know 1 0.000 -1 -1. Inapplicable 3673 0.493 1 Yes, voted 3030 0.407 2 No, didn’t vote 743 0.100 Total 7453 1.000 VotedPres2016 Description: PRE: Did R vote for President in 2016 Question: Derived from V201102, V201101 Variable class: factor VotedPres2016 n Unweighted Freq Yes 5810 0.780 No 1622 0.218 NA 21 0.003 Total 7453 1.000 V201103 Description: PRE: Recall of last (2016) Presidential vote choice Question: Which one did you vote for? Variable class: haven_labelled, vctrs_vctr, double V201103 Label n Unweighted Freq -9 -9. Refused 41 0.006 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 1643 0.220 1 Hillary Clinton 2911 0.391 2 Donald Trump 2466 0.331 5 Other {SPECIFY} 390 0.052 Total 7453 1.000 VotedPres2016_selection Description: PRE: Recall of last (2016) Presidential vote choice Question: Which one did you vote for? 
Variable class: factor VotedPres2016_selection n Unweighted Freq Clinton 2911 0.391 Trump 2466 0.331 Other 390 0.052 NA 1686 0.226 Total 7453 1.000 V201228 Description: PRE: Party ID: Does R think of self as Democrat, Republican, or Independent Question: Generally speaking, do you usually think of yourself as [a Democrat, a Republican / a Republican, a Democrat], an independent, or what? Variable class: haven_labelled, vctrs_vctr, double V201228 Label n Unweighted Freq -9 -9. Refused 37 0.005 -8 -8. Don’t know 4 0.001 -4 -4. Technical error 1 0.000 0 No preference {VOL - video/phone only} 6 0.001 1 Democrat 2589 0.347 2 Republican 2304 0.309 3 Independent 2277 0.306 5 Other party {SPECIFY} 235 0.032 Total 7453 1.000 V201229 Description: PRE: Party Identification strong - Democrat Republican Question: Would you call yourself a strong [Democrat / Republican] or a not very strong [Democrat / Republican]? Variable class: haven_labelled, vctrs_vctr, double V201229 Label n Unweighted Freq -9 -9. Refused 4 0.001 -1 -1. Inapplicable 2560 0.343 1 Strong 3341 0.448 2 Not very strong 1548 0.208 Total 7453 1.000 V201230 Description: PRE: No Party Identification - closer to Democratic Party or Republican Party Question: Do you think of yourself as closer to the Republican Party or to the Democratic Party? Variable class: haven_labelled, vctrs_vctr, double V201230 Label n Unweighted Freq -9 -9. Refused 19 0.003 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 4893 0.657 1 Closer to Republican 782 0.105 2 Neither {VOL in video and phone} 876 0.118 3 Closer to Democratic 881 0.118 Total 7453 1.000 V201231x Description: PRE: SUMMARY: Party ID Question: Derived from V201228, V201229, and PTYID_LEANPTY Variable class: haven_labelled, vctrs_vctr, double V201231x Label n Unweighted Freq -9 -9. Refused 23 0.003 -8 -8. 
Don’t know 2 0.000 1 Strong Democrat 1796 0.241 2 Not very strong Democrat 790 0.106 3 Independent-Democrat 881 0.118 4 Independent 876 0.118 5 Independent-Republican 782 0.105 6 Not very strong Republican 758 0.102 7 Strong Republican 1545 0.207 Total 7453 1.000 PartyID Description: PRE: SUMMARY: Party ID Question: Derived from V201228, V201229, and PTYID_LEANPTY Variable class: factor PartyID n Unweighted Freq Strong democrat 1796 0.241 Not very strong democrat 790 0.106 Independent-democrat 881 0.118 Independent 876 0.118 Independent-republican 782 0.105 Not very strong republican 758 0.102 Strong republican 1545 0.207 NA 25 0.003 Total 7453 1.000 V201233 Description: PRE: How often trust government in Washington to do what is right [revised] Question: How often can you trust the federal government in Washington to do what is right? Variable class: haven_labelled, vctrs_vctr, double V201233 Label n Unweighted Freq -9 -9. Refused 26 0.003 -8 -8. Don’t know 3 0.000 1 Always 80 0.011 2 Most of the time 1016 0.136 3 About half the time 2313 0.310 4 Some of the time 3313 0.445 5 Never 702 0.094 Total 7453 1.000 TrustGovernment Description: PRE: How often trust government in Washington to do what is right [revised] Question: How often can you trust the federal government in Washington to do what is right? Variable class: factor TrustGovernment n Unweighted Freq Always 80 0.011 Most of the time 1016 0.136 About half the time 2313 0.310 Some of the time 3313 0.445 Never 702 0.094 NA 29 0.004 Total 7453 1.000 V201237 Description: PRE: How often can people be trusted Question: Generally speaking, how often can you trust other people? Variable class: haven_labelled, vctrs_vctr, double V201237 Label n Unweighted Freq -9 -9. Refused 12 0.002 -8 -8. 
Don’t know 1 0.000 1 Always 48 0.006 2 Most of the time 3511 0.471 3 About half the time 2020 0.271 4 Some of the time 1597 0.214 5 Never 264 0.035 Total 7453 1.000 TrustPeople Description: PRE: How often can people be trusted Question: Generally speaking, how often can you trust other people? Variable class: factor TrustPeople n Unweighted Freq Always 48 0.006 Most of the time 3511 0.471 About half the time 2020 0.271 Some of the time 1597 0.214 Never 264 0.035 NA 13 0.002 Total 7453 1.000 V201507x Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: haven_labelled, vctrs_vctr, double N Missing N Refused (-9) Minimum Median Maximum 0 294 18 53 80 Age Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: numeric N Missing Minimum Median Maximum 294 18 53 80 AgeGroup Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: factor AgeGroup n Unweighted Freq 18-29 871 0.117 30-39 1241 0.167 40-49 1081 0.145 50-59 1200 0.161 60-69 1436 0.193 70 or older 1330 0.178 NA 294 0.039 Total 7453 1.000 V201510 Description: PRE: Highest level of Education Question: What is the highest level of school you have completed or the highest degree you have received? Variable class: haven_labelled, vctrs_vctr, double V201510 Label n Unweighted Freq -9 -9. Refused 25 0.003 -8 -8. Don’t know 1 0.000 1 Less than high school credential 312 0.042 2 High school graduate - High school diploma or equivalent (e.g. GED) 1160 0.156 3 Some college but no degree 1519 0.204 4 Associate degree in college - occupational/vocational 550 0.074 5 Associate degree in college - academic 445 0.060 6 Bachelor’s degree (e.g. BA, AB, BS) 1877 0.252 7 Master’s degree (e.g. MA, MS, MEng, MEd, MSW, MBA) 1092 0.147 8 Professional school degree (e.g. MD, DDS, DVM, LLB, JD)/Doctoral degree (e.g. 
PHD, EDD) 382 0.051 95 Other {SPECIFY} 90 0.012 Total 7453 1.000 Education Description: PRE: Highest level of Education Question: What is the highest level of school you have completed or the highest degree you have received? Variable class: factor Education n Unweighted Freq Less than HS 312 0.042 High school 1160 0.156 Post HS 2514 0.337 Bachelor’s 1877 0.252 Graduate 1474 0.198 NA 116 0.016 Total 7453 1.000 V201546 Description: PRE: R: Are you Spanish, Hispanic, or Latino Question: Are you of Hispanic, Latino, or Spanish origin? Variable class: haven_labelled, vctrs_vctr, double V201546 Label n Unweighted Freq -9 -9. Refused 45 0.006 -8 -8. Don’t know 3 0.000 1 Yes 662 0.089 2 No 6743 0.905 Total 7453 1.000 V201547a Description: RESTRICTED: PRE: Race of R: White [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you White? Variable class: haven_labelled, vctrs_vctr, double V201547a Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547b Description: RESTRICTED: PRE: Race of R: Black or African-American [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you Black or African American? Variable class: haven_labelled, vctrs_vctr, double V201547b Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547c Description: RESTRICTED: PRE: Race of R: Asian [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you Asian? Variable class: haven_labelled, vctrs_vctr, double V201547c Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547d Description: RESTRICTED: PRE: Race of R: Native Hawaiian or Pacific Islander [mention] Question: I am going to read you a list of five race categories. 
You may choose one or more races. For this survey, Hispanic origin is not a race. Are you White; Black or African American; American Indian or Alaska Native; Asian; or Native Hawaiian or Other Pacific Islander? Variable class: haven_labelled, vctrs_vctr, double V201547d Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547e Description: RESTRICTED: PRE: Race of R: Native American or Alaska Native [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you American Indian or Alaska Native? Variable class: haven_labelled, vctrs_vctr, double V201547e Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547z Description: RESTRICTED: PRE: Race of R: other specify Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Reported other Variable class: haven_labelled, vctrs_vctr, double V201547z Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201549x Description: PRE: SUMMARY: R self-identified race/ethnicity Question: Derived from V201546, V201547a-V201547e, and V201547z Variable class: haven_labelled, vctrs_vctr, double V201549x Label n Unweighted Freq -9 -9. Refused 75 0.010 -8 -8. 
Don’t know 6 0.001 1 White, non-Hispanic 5420 0.727 2 Black, non-Hispanic 650 0.087 3 Hispanic 662 0.089 4 Asian or Native Hawaiian/other Pacific Islander, non-Hispanic alone 248 0.033 5 Native American/Alaska Native or other race, non-Hispanic alone 155 0.021 6 Multiple races, non-Hispanic 237 0.032 Total 7453 1.000 RaceEth Description: PRE: SUMMARY: R self-identified race/ethnicity Question: Derived from V201546, V201547a-V201547e, and V201547z Variable class: factor RaceEth n Unweighted Freq White 5420 0.727 Black 650 0.087 Hispanic 662 0.089 Asian, NH/PI 248 0.033 AI/AN 155 0.021 Other/multiple race 237 0.032 NA 81 0.011 Total 7453 1.000 V201600 Description: PRE: What is your (R) sex? [revised] Question: What is your sex? Variable class: haven_labelled, vctrs_vctr, double V201600 Label n Unweighted Freq -9 -9. Refused 51 0.007 1 Male 3375 0.453 2 Female 4027 0.540 Total 7453 1.000 Gender Description: PRE: What is your (R) sex? [revised] Question: What is your sex? Variable class: factor Gender n Unweighted Freq Male 3375 0.453 Female 4027 0.540 NA 51 0.007 Total 7453 1.000 V201607 Description: RESTRICTED: PRE: Total income amount - revised Question: The next question is about [the total combined income of all members of your family / your total income] during the past 12 months. This includes money from jobs, net income from business, farm or rent, pensions, dividends, interest, Social Security payments, and any other money income received by members of your family who are 15 years of age or older. What was the total income of your family during the past 12 months? TYPE THE NUMBER. YOUR BEST GUESS IS FINE. Variable class: haven_labelled, vctrs_vctr, double V201607 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201610 Description: RESTRICTED: PRE: Income amt missing - categories lt 20K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. 
Variable class: haven_labelled, vctrs_vctr, double V201610 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201611 Description: RESTRICTED: PRE: Income amt missing - categories 20-40K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201611 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201613 Description: RESTRICTED: PRE: Income amt missing - categories 40-70K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201613 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201615 Description: RESTRICTED: PRE: Income amt missing - categories 70-100K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201615 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201616 Description: RESTRICTED: PRE: Income amt missing - categories 100+K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201616 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201617x Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: haven_labelled, vctrs_vctr, double V201617x Label n Unweighted Freq -9 -9. Refused 502 0.067 -5 -5. 
Interview breakoff (sufficient partial IW) 15 0.002 1 Under $9,999 647 0.087 2 $10,000-14,999 244 0.033 3 $15,000-19,999 185 0.025 4 $20,000-24,999 301 0.040 5 $25,000-29,999 228 0.031 6 $30,000-34,999 296 0.040 7 $35,000-39,999 226 0.030 8 $40,000-44,999 286 0.038 9 $45,000-49,999 213 0.029 10 $50,000-59,999 485 0.065 11 $60,000-64,999 294 0.039 12 $65,000-69,999 168 0.023 13 $70,000-74,999 243 0.033 14 $75,000-79,999 215 0.029 15 $80,000-89,999 383 0.051 16 $90,000-99,999 291 0.039 17 $100,000-109,999 451 0.061 18 $110,000-124,999 312 0.042 19 $125,000-149,999 323 0.043 20 $150,000-174,999 366 0.049 21 $175,000-249,999 374 0.050 22 $250,000 or more 405 0.054 Total 7453 1.000 Income Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: factor Income n Unweighted Freq Under $9,999 647 0.087 $10,000-14,999 244 0.033 $15,000-19,999 185 0.025 $20,000-24,999 301 0.040 $25,000-29,999 228 0.031 $30,000-34,999 296 0.040 $35,000-39,999 226 0.030 $40,000-44,999 286 0.038 $45,000-49,999 213 0.029 $50,000-59,999 485 0.065 $60,000-64,999 294 0.039 $65,000-69,999 168 0.023 $70,000-74,999 243 0.033 $75,000-79,999 215 0.029 $80,000-89,999 383 0.051 $90,000-99,999 291 0.039 $100,000-109,999 451 0.061 $110,000-124,999 312 0.042 $125,000-149,999 323 0.043 $150,000-174,999 366 0.049 $175,000-249,999 374 0.050 $250,000 or more 405 0.054 NA 517 0.069 Total 7453 1.000 Income7 Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: factor Income7 n Unweighted Freq Under $20k 1076 0.144 $20k to &lt; 40k 1051 0.141 $40k to &lt; 60k 984 0.132 $60k to &lt; 80k 920 0.123 $80k to &lt; 100k 674 0.090 $100k to &lt; 125k 763 0.102 $125k or more 1468 0.197 NA 517 0.069 Total 7453 1.000 B.4 POST-ELECTION SURVEY QUESTIONNAIRE V202051 Description: POST: R registered to vote (post-election) Question: Now on a different topic. 
Are you registered to vote at [Respondent’s preloaded address], registered at a different address, or not currently registered? Variable class: haven_labelled, vctrs_vctr, double V202051 Label n Unweighted Freq -9 -9. Refused 4 0.001 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 6820 0.915 1 Registered at this address 173 0.023 2 Registered at a different address 59 0.008 3 Not currently registered 393 0.053 Total 7453 1.000 V202066 Description: POST: Did R vote in November 2020 election Question: In talking to people about elections, we often find that a lot of people were not able to vote because they weren’t registered, they were sick, or they just didn’t have time. Which of the following statements best describes you: Variable class: haven_labelled, vctrs_vctr, double V202066 Label n Unweighted Freq -9 -9. Refused 7 0.001 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 372 0.050 1 I did not vote (in the election this November) 582 0.078 2 I thought about voting this time, but didn’t 265 0.036 3 I usually vote, but didn’t this time 192 0.026 4 I am sure I voted 6031 0.809 Total 7453 1.000 V202072 Description: POST: Did R vote for President Question: How about the election for President? Did you vote for a candidate for President? Variable class: haven_labelled, vctrs_vctr, double V202072 Label n Unweighted Freq -9 -9. Refused 2 0.000 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 1418 0.190 1 Yes, voted for President 5952 0.799 2 No, didn’t vote for President 77 0.010 Total 7453 1.000 VotedPres2020 Description: POST: Did R vote for President Question: How about the election for President? Did you vote for a candidate for President? Variable class: factor VotedPres2020 n Unweighted Freq Yes 6313 0.847 No 87 0.012 NA 1053 0.141 Total 7453 1.000 V202073 Description: POST: For whom did R vote for President Question: Who did you vote for? 
[Joe Biden, Donald Trump/Donald Trump, Joe Biden], Jo Jorgensen, Howie Hawkins, or someone else? Variable class: haven_labelled, vctrs_vctr, double V202073 Label n Unweighted Freq -9 -9. Refused 53 0.007 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 1497 0.201 1 Joe Biden 3267 0.438 2 Donald Trump 2462 0.330 3 Jo Jorgensen 69 0.009 4 Howie Hawkins 23 0.003 5 Other candidate {SPECIFY} 56 0.008 7 Specified as Republican candidate 1 0.000 8 Specified as Libertarian candidate 3 0.000 11 Specified as don’t know 2 0.000 12 Specified as refused 16 0.002 Total 7453 1.000 V202109x Description: PRE-POST: SUMMARY: Voter turnout in 2020 Question: Derived from V201024, V202066, V202051 Variable class: haven_labelled, vctrs_vctr, double V202109x Label n Unweighted Freq -2 -2. Not reported 7 0.001 0 Did not vote 1039 0.139 1 Voted 6407 0.860 Total 7453 1.000 V202110x Description: PRE-POST: SUMMARY: 2020 Presidential vote Question: Derived from V201029, V202073 Variable class: haven_labelled, vctrs_vctr, double V202110x Label n Unweighted Freq -9 -9. Refused 81 0.011 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 1136 0.152 1 Joe Biden 3509 0.471 2 Donald Trump 2567 0.344 3 Jo Jorgensen 74 0.010 4 Howie Hawkins 24 0.003 5 Other candidate {SPECIFY} 60 0.008 Total 7453 1.000 VotedPres2020_selection Description: PRE-POST: SUMMARY: 2020 Presidential vote Question: Derived from V201029, V202073 Variable class: factor VotedPres2020_selection n Unweighted Freq Biden 3509 0.471 Trump 2567 0.344 Other 158 0.021 NA 1219 0.164 Total 7453 1.000 References American National Election Studies. 2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. 
"],["recs-cb.html", "C RECS derived variable codebook C.1 ADMIN C.2 GEOGRAPHY C.3 WEATHER C.4 YOUR HOME C.5 SPACE HEATING C.6 AIR CONDITIONING C.7 THERMOSTAT C.8 WEIGHTS C.9 CONSUMPTION AND EXPENDITURE", " C RECS derived variable codebook The full codebook with the original variables is available at https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata - “Variable and response codebook”. This is a codebook for the RECS data used in this book (recs_2020) from the {srvyrexploR} package. C.1 ADMIN DOEID Description: Unique identifier for each respondent ClimateRegion_BA Description: Building America Climate Zone ClimateRegion_BA n Unweighted Freq Mixed-Dry 142 0.008 Mixed-Humid 5579 0.302 Hot-Humid 2545 0.138 Hot-Dry 1577 0.085 Very-Cold 572 0.031 Cold 7116 0.385 Marine 911 0.049 Subarctic 54 0.003 Total 18496 1.000 Urbanicity Description: 2010 Census Urban Type Code Urbanicity n Unweighted Freq Urban Area 12395 0.670 Urban Cluster 2020 0.109 Rural 4081 0.221 Total 18496 1.000 C.2 GEOGRAPHY Region Description: Census Region Region n Unweighted Freq Northeast 3657 0.198 Midwest 3832 0.207 South 6426 0.347 West 4581 0.248 Total 18496 1.000 REGIONC Description: Census Region REGIONC n Unweighted Freq MIDWEST 3832 0.207 NORTHEAST 3657 0.198 SOUTH 6426 0.347 WEST 4581 0.248 Total 18496 1.000 Division Description: Census Division, Mountain Division is divided into North and South for RECS purposes Division n Unweighted Freq New England 1680 0.091 Middle Atlantic 1977 0.107 East North Central 2014 0.109 West North Central 1818 0.098 South Atlantic 3256 0.176 East South Central 1343 0.073 West South Central 1827 0.099 Mountain North 1180 0.064 Mountain South 904 0.049 Pacific 2497 0.135 Total 18496 1.000 STATE_FIPS Description: State Federal Information Processing System Code STATE_FIPS n Unweighted Freq 01 242 0.013 02 311 0.017 04 495 0.027 05 268 0.014 06 1152 0.062 08 360 0.019 09 294 0.016 10 143 0.008 11 221 0.012 12 655 0.035 13 417 0.023 15 
282 0.015 16 270 0.015 17 530 0.029 18 400 0.022 19 286 0.015 20 208 0.011 21 428 0.023 22 311 0.017 23 223 0.012 24 359 0.019 25 552 0.030 26 388 0.021 27 325 0.018 28 168 0.009 29 296 0.016 30 172 0.009 31 189 0.010 32 231 0.012 33 175 0.009 34 456 0.025 35 178 0.010 36 904 0.049 37 479 0.026 38 331 0.018 39 339 0.018 40 232 0.013 41 313 0.017 42 617 0.033 44 191 0.010 45 334 0.018 46 183 0.010 47 505 0.027 48 1016 0.055 49 188 0.010 50 245 0.013 51 451 0.024 53 439 0.024 54 197 0.011 55 357 0.019 56 190 0.010 Total 18496 1.000 state_postal Description: State Postal Code state_postal n Unweighted Freq AL 242 0.013 AK 311 0.017 AZ 495 0.027 AR 268 0.014 CA 1152 0.062 CO 360 0.019 CT 294 0.016 DE 143 0.008 DC 221 0.012 FL 655 0.035 GA 417 0.023 HI 282 0.015 ID 270 0.015 IL 530 0.029 IN 400 0.022 IA 286 0.015 KS 208 0.011 KY 428 0.023 LA 311 0.017 ME 223 0.012 MD 359 0.019 MA 552 0.030 MI 388 0.021 MN 325 0.018 MS 168 0.009 MO 296 0.016 MT 172 0.009 NE 189 0.010 NV 231 0.012 NH 175 0.009 NJ 456 0.025 NM 178 0.010 NY 904 0.049 NC 479 0.026 ND 331 0.018 OH 339 0.018 OK 232 0.013 OR 313 0.017 PA 617 0.033 RI 191 0.010 SC 334 0.018 SD 183 0.010 TN 505 0.027 TX 1016 0.055 UT 188 0.010 VT 245 0.013 VA 451 0.024 WA 439 0.024 WV 197 0.011 WI 357 0.019 WY 190 0.010 Total 18496 1.000 state_name Description: State Name state_name n Unweighted Freq Alabama 242 0.013 Alaska 311 0.017 Arizona 495 0.027 Arkansas 268 0.014 California 1152 0.062 Colorado 360 0.019 Connecticut 294 0.016 Delaware 143 0.008 District of Columbia 221 0.012 Florida 655 0.035 Georgia 417 0.023 Hawaii 282 0.015 Idaho 270 0.015 Illinois 530 0.029 Indiana 400 0.022 Iowa 286 0.015 Kansas 208 0.011 Kentucky 428 0.023 Louisiana 311 0.017 Maine 223 0.012 Maryland 359 0.019 Massachusetts 552 0.030 Michigan 388 0.021 Minnesota 325 0.018 Mississippi 168 0.009 Missouri 296 0.016 Montana 172 0.009 Nebraska 189 0.010 Nevada 231 0.012 New Hampshire 175 0.009 New Jersey 456 0.025 New Mexico 178 0.010 New York 904 0.049 
North Carolina 479 0.026 North Dakota 331 0.018 Ohio 339 0.018 Oklahoma 232 0.013 Oregon 313 0.017 Pennsylvania 617 0.033 Rhode Island 191 0.010 South Carolina 334 0.018 South Dakota 183 0.010 Tennessee 505 0.027 Texas 1016 0.055 Utah 188 0.010 Vermont 245 0.013 Virginia 451 0.024 Washington 439 0.024 West Virginia 197 0.011 Wisconsin 357 0.019 Wyoming 190 0.010 Total 18496 1.000 C.3 WEATHER HDD65 Description: Heating degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations N Missing Minimum Median Maximum 0 0 4396 17383 CDD65 Description: Cooling degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations N Missing Minimum Median Maximum 0 0 1179 5534 HDD30YR Description: Heating degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inoculated with random errors N Missing Minimum Median Maximum 0 0 4825 16071 CDD30YR Description: Cooling degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inoculated with random errors N Missing Minimum Median Maximum 0 0 1020 4905 C.4 YOUR HOME HousingUnitType Description: Type of housing unit Question: Which best describes your home? HousingUnitType n Unweighted Freq Mobile home 974 0.053 Single-family detached 12319 0.666 Single-family attached 1751 0.095 Apartment: 2-4 Units 1013 0.055 Apartment: 5 or more units 2439 0.132 Total 18496 1.000 YearMade Description: Range when housing unit was built Question: Derived from: In what year was your home built? AND Although you do not know the exact year your home was built, it is helpful to have an estimate. About when was your home built? 
YearMade n Unweighted Freq Before 1950 2721 0.147 1950-1959 1685 0.091 1960-1969 1867 0.101 1970-1979 2817 0.152 1980-1989 2435 0.132 1990-1999 2451 0.133 2000-2009 2748 0.149 2010-2015 989 0.053 2016-2020 783 0.042 Total 18496 1.000 TOTSQFT_EN Description: Total energy-consuming area (square footage) of the housing unit. Includes all main living areas; all basements; heated, cooled, or finished attics; and heating or cooled garages. For single-family housing units this is derived using the respondent-reported square footage (SQFTEST) and adjusted using the “include” variables (e.g., SQFTINCB), where applicable. For apartments and mobile homes this is the respondent-reported square footage. A derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 200 1700 15000 TOTHSQFT Description: Square footage of the housing unit that is heated by space heating equipment. A derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 0 1520 15000 TOTCSQFT Description: Square footage of the housing unit that is cooled by air-conditioning equipment or evaporative cooler, a derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 0 1200 14600 ZTOTSQFT_EN Description: Imputation indicator for SQFTEST ZTOTSQFT_EN n Unweighted Freq Not imputed 11930 0.645 Imputed 6566 0.355 Total 18496 1.000 ZYearMade Description: Imputation indicator for YEARMADERANGE ZYearMade n Unweighted Freq Not imputed 18176 0.983 Imputed 320 0.017 Total 18496 1.000 ZHousingUnitType Description: Imputation indicator for TYPEHUQ ZHousingUnitType n Unweighted Freq Not imputed 18496 1 Total 18496 1 C.5 SPACE HEATING SpaceHeatingUsed Description: Space heating equipment used Question: Is your home heated during the winter? 
SpaceHeatingUsed n Unweighted Freq FALSE 751 0.041 TRUE 17745 0.959 Total 18496 1.000 ZSpaceHeatingUsed Description: Imputation indicator for HEATHOME ZSpaceHeatingUsed n Unweighted Freq Not imputed 18474 0.999 Imputed 22 0.001 Total 18496 1.000 C.6 AIR CONDITIONING ACUsed Description: Air conditioning equipment used Question: Is any air conditioning equipment used in your home? ACUsed n Unweighted Freq FALSE 2325 0.126 TRUE 16171 0.874 Total 18496 1.000 ZACUsed Description: Imputation indicator for AIRCOND ZACUsed n Unweighted Freq Not imputed 18448 0.997 Imputed 48 0.003 Total 18496 1.000 ZACBehavior Description: Imputation indicator for COOLCNTL ZACBehavior n Unweighted Freq Not imputed 15819 0.855 Imputed 352 0.019 Not applicable 2325 0.126 Total 18496 1.000 C.7 THERMOSTAT HeatingBehavior Description: Winter temperature control method Question: Which of the following best describes how your household controls the indoor temperature during the winter? HeatingBehavior n Unweighted Freq Set one temp and leave it 7806 0.422 Manually adjust at night/no one home 4654 0.252 Programmable or smart thermostat automatically adjusts the temperature 3310 0.179 Turn on or off as needed 1491 0.081 No control 438 0.024 Other 46 0.002 NA 751 0.041 Total 18496 1.000 WinterTempDay Description: Winter thermostat setting or temperature in home when someone is home during the day Question: During the winter, what is your home’s typical indoor temperature when someone is home during the day? N Missing Minimum Median Maximum 751 50 70 90 WinterTempAway Description: Winter thermostat setting or temperature in home when no one is home during the day Question: During the winter, what is your home’s typical indoor temperature when no one is inside your home during the day? 
N Missing Minimum Median Maximum 751 50 68 90 WinterTempNight Description: Winter thermostat setting or temperature in home at night Question: During the winter, what is your home’s typical indoor temperature inside your home at night? N Missing Minimum Median Maximum 751 50 68 90 ACBehavior Description: Summer temperature control method Question: Which of the following best describes how your household controls the indoor temperature during the summer? ACBehavior n Unweighted Freq Set one temp and leave it 6738 0.364 Manually adjust at night/no one home 3637 0.197 Programmable or smart thermostat automatically adjusts the temperature 2638 0.143 Turn on or off as needed 2746 0.148 No control 409 0.022 Other 3 0.000 NA 2325 0.126 Total 18496 1.000 SummerTempDay Description: Summer thermostat setting or temperature in home when someone is home during the day Question: During the summer, what is your home’s typical indoor temperature when someone is home during the day? N Missing Minimum Median Maximum 2325 50 72 90 SummerTempAway Description: Summer thermostat setting or temperature in home when no one is home during the day Question: During the summer, what is your home’s typical indoor temperature when no one is inside your home during the day? N Missing Minimum Median Maximum 2325 50 74 90 SummerTempNight Description: Summer thermostat setting or temperature in home at night Question: During the summer, what is your home’s typical indoor temperature inside your home at night? 
N Missing Minimum Median Maximum 2325 50 72 90 ZHeatingBehavior Description: Imputation indicator for HEATCNTL ZHeatingBehavior n Unweighted Freq Not imputed 17395 0.940 Imputed 350 0.019 Not applicable 751 0.041 Total 18496 1.000 ZWinterTempAway Description: Imputation indicator for TEMPGONE ZWinterTempAway n Unweighted Freq Not imputed 16840 0.910 Imputed 905 0.049 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempAway Description: Imputation indicator for TEMPGONEAC ZSummerTempAway n Unweighted Freq Not imputed 15240 0.824 Imputed 931 0.050 Not applicable 2325 0.126 Total 18496 1.000 ZWinterTempDay Description: Imputation indicator for TEMPHOME ZWinterTempDay n Unweighted Freq Not imputed 17382 0.940 Imputed 363 0.020 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempDay Description: Imputation indicator for TEMPHOMEAC ZSummerTempDay n Unweighted Freq Not imputed 15658 0.847 Imputed 513 0.028 Not applicable 2325 0.126 Total 18496 1.000 ZWinterTempNight Description: Imputation indicator for TEMPNITE ZWinterTempNight n Unweighted Freq Not imputed 17207 0.930 Imputed 538 0.029 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempNight Description: Imputation indicator for TEMPNITEAC ZSummerTempNight n Unweighted Freq Not imputed 15497 0.838 Imputed 674 0.036 Not applicable 2325 0.126 Total 18496 1.000 C.8 WEIGHTS NWEIGHT Description: Final Analysis Weight N Missing Minimum Median Maximum 0 437.9 6119 29279 NWEIGHT1 Description: Final Analysis Weight for replicate 1 N Missing Minimum Median Maximum 0 0 6136 30015 NWEIGHT2 Description: Final Analysis Weight for replicate 2 N Missing Minimum Median Maximum 0 0 6151 29422 NWEIGHT3 Description: Final Analysis Weight for replicate 3 N Missing Minimum Median Maximum 0 0 6151 29431 NWEIGHT4 Description: Final Analysis Weight for replicate 4 N Missing Minimum Median Maximum 0 0 6153 29494 NWEIGHT5 Description: Final Analysis Weight for replicate 5 N Missing Minimum Median Maximum 0 0 6134 30039 NWEIGHT6 
Description: Final Analysis Weight for replicate 6 N Missing Minimum Median Maximum 0 0 6147 29419 NWEIGHT7 Description: Final Analysis Weight for replicate 7 N Missing Minimum Median Maximum 0 0 6135 29586 NWEIGHT8 Description: Final Analysis Weight for replicate 8 N Missing Minimum Median Maximum 0 0 6151 29499 NWEIGHT9 Description: Final Analysis Weight for replicate 9 N Missing Minimum Median Maximum 0 0 6139 29845 NWEIGHT10 Description: Final Analysis Weight for replicate 10 N Missing Minimum Median Maximum 0 0 6163 29635 NWEIGHT11 Description: Final Analysis Weight for replicate 11 N Missing Minimum Median Maximum 0 0 6140 29681 NWEIGHT12 Description: Final Analysis Weight for replicate 12 N Missing Minimum Median Maximum 0 0 6160 29849 NWEIGHT13 Description: Final Analysis Weight for replicate 13 N Missing Minimum Median Maximum 0 0 6142 29843 NWEIGHT14 Description: Final Analysis Weight for replicate 14 N Missing Minimum Median Maximum 0 0 6154 30184 NWEIGHT15 Description: Final Analysis Weight for replicate 15 N Missing Minimum Median Maximum 0 0 6145 29970 NWEIGHT16 Description: Final Analysis Weight for replicate 16 N Missing Minimum Median Maximum 0 0 6133 29825 NWEIGHT17 Description: Final Analysis Weight for replicate 17 N Missing Minimum Median Maximum 0 0 6126 30606 NWEIGHT18 Description: Final Analysis Weight for replicate 18 N Missing Minimum Median Maximum 0 0 6155 29689 NWEIGHT19 Description: Final Analysis Weight for replicate 19 N Missing Minimum Median Maximum 0 0 6153 29336 NWEIGHT20 Description: Final Analysis Weight for replicate 20 N Missing Minimum Median Maximum 0 0 6139 30274 NWEIGHT21 Description: Final Analysis Weight for replicate 21 N Missing Minimum Median Maximum 0 0 6135 29766 NWEIGHT22 Description: Final Analysis Weight for replicate 22 N Missing Minimum Median Maximum 0 0 6149 29791 NWEIGHT23 Description: Final Analysis Weight for replicate 23 N Missing Minimum Median Maximum 0 0 6148 30126 NWEIGHT24 Description: Final 
Analysis Weight for replicate 24 N Missing Minimum Median Maximum 0 0 6136 29946 NWEIGHT25 Description: Final Analysis Weight for replicate 25 N Missing Minimum Median Maximum 0 0 6150 30445 NWEIGHT26 Description: Final Analysis Weight for replicate 26 N Missing Minimum Median Maximum 0 0 6136 29893 NWEIGHT27 Description: Final Analysis Weight for replicate 27 N Missing Minimum Median Maximum 0 0 6125 30030 NWEIGHT28 Description: Final Analysis Weight for replicate 28 N Missing Minimum Median Maximum 0 0 6149 29599 NWEIGHT29 Description: Final Analysis Weight for replicate 29 N Missing Minimum Median Maximum 0 0 6146 30136 NWEIGHT30 Description: Final Analysis Weight for replicate 30 N Missing Minimum Median Maximum 0 0 6149 29895 NWEIGHT31 Description: Final Analysis Weight for replicate 31 N Missing Minimum Median Maximum 0 0 6144 29604 NWEIGHT32 Description: Final Analysis Weight for replicate 32 N Missing Minimum Median Maximum 0 0 6159 29310 NWEIGHT33 Description: Final Analysis Weight for replicate 33 N Missing Minimum Median Maximum 0 0 6148 29408 NWEIGHT34 Description: Final Analysis Weight for replicate 34 N Missing Minimum Median Maximum 0 0 6139 29564 NWEIGHT35 Description: Final Analysis Weight for replicate 35 N Missing Minimum Median Maximum 0 0 6141 30437 NWEIGHT36 Description: Final Analysis Weight for replicate 36 N Missing Minimum Median Maximum 0 0 6149 27896 NWEIGHT37 Description: Final Analysis Weight for replicate 37 N Missing Minimum Median Maximum 0 0 6133 30596 NWEIGHT38 Description: Final Analysis Weight for replicate 38 N Missing Minimum Median Maximum 0 0 6139 30130 NWEIGHT39 Description: Final Analysis Weight for replicate 39 N Missing Minimum Median Maximum 0 0 6147 29262 NWEIGHT40 Description: Final Analysis Weight for replicate 40 N Missing Minimum Median Maximum 0 0 6144 30344 NWEIGHT41 Description: Final Analysis Weight for replicate 41 N Missing Minimum Median Maximum 0 0 6153 29594 NWEIGHT42 Description: Final Analysis Weight for 
replicate 42 N Missing Minimum Median Maximum 0 0 6137 29938 NWEIGHT43 Description: Final Analysis Weight for replicate 43 N Missing Minimum Median Maximum 0 0 6157 29878 NWEIGHT44 Description: Final Analysis Weight for replicate 44 N Missing Minimum Median Maximum 0 0 6148 29896 NWEIGHT45 Description: Final Analysis Weight for replicate 45 N Missing Minimum Median Maximum 0 0 6149 29729 NWEIGHT46 Description: Final Analysis Weight for replicate 46 N Missing Minimum Median Maximum 0 0 6152 29103 NWEIGHT47 Description: Final Analysis Weight for replicate 47 N Missing Minimum Median Maximum 0 0 6150 30070 NWEIGHT48 Description: Final Analysis Weight for replicate 48 N Missing Minimum Median Maximum 0 0 6139 29343 NWEIGHT49 Description: Final Analysis Weight for replicate 49 N Missing Minimum Median Maximum 0 0 6146 29590 NWEIGHT50 Description: Final Analysis Weight for replicate 50 N Missing Minimum Median Maximum 0 0 6159 30027 NWEIGHT51 Description: Final Analysis Weight for replicate 51 N Missing Minimum Median Maximum 0 0 6150 29247 NWEIGHT52 Description: Final Analysis Weight for replicate 52 N Missing Minimum Median Maximum 0 0 6154 29445 NWEIGHT53 Description: Final Analysis Weight for replicate 53 N Missing Minimum Median Maximum 0 0 6156 30131 NWEIGHT54 Description: Final Analysis Weight for replicate 54 N Missing Minimum Median Maximum 0 0 6151 29439 NWEIGHT55 Description: Final Analysis Weight for replicate 55 N Missing Minimum Median Maximum 0 0 6143 29216 NWEIGHT56 Description: Final Analysis Weight for replicate 56 N Missing Minimum Median Maximum 0 0 6153 29203 NWEIGHT57 Description: Final Analysis Weight for replicate 57 N Missing Minimum Median Maximum 0 0 6138 29819 NWEIGHT58 Description: Final Analysis Weight for replicate 58 N Missing Minimum Median Maximum 0 0 6137 29818 NWEIGHT59 Description: Final Analysis Weight for replicate 59 N Missing Minimum Median Maximum 0 0 6144 29606 NWEIGHT60 Description: Final Analysis Weight for replicate 60 N 
Missing Minimum Median Maximum 0 0 6140 29818 C.9 CONSUMPTION AND EXPENDITURE BTUEL Description: Total electricity use, in thousand Btu, 2020, including self-generation of solar power N Missing Minimum Median Maximum 0 143.3 31890 628155 DOLLAREL Description: Total electricity cost, in dollars, 2020 N Missing Minimum Median Maximum 0 -889.5 1258 15680 ZBTUEL Description: Imputation flag for total electricity use ZBTUEL n Unweighted Freq Not imputed 15965 0.863 Imputed amount and cost 2138 0.116 Imputed only amount for SOLAR=1 cases 393 0.021 Total 18496 1.000 BTUNG Description: Total natural gas use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 22012 1134709 DOLLARNG Description: Total natural gas cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 313.9 8155 ZBTUNG Description: Imputation flag for total natural gas use ZBTUNG n Unweighted Freq Not imputed 8823 0.477 Imputed 2331 0.126 Not applicable 7342 0.397 Total 18496 1.000 BTULP Description: Total propane use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 364215 DOLLARLP Description: Total propane cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 0 6621 ZBTULP Description: Imputation flag for total propane use ZBTULP n Unweighted Freq Not imputed 896 0.048 Imputed 1103 0.060 Not applicable 16497 0.892 Total 18496 1.000 BTUFO Description: Total fuel oil/kerosene use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 426268 DOLLARFO Description: Total fuel oil/kerosene cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 0 7004 ZBTUFO Description: Imputation flag for total fuel oil/kerosene use ZBTUFO n Unweighted Freq Not imputed 626 0.034 Imputed 607 0.033 Not applicable 17263 0.933 Total 18496 1.000 BTUWOOD Description: Total wood use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 5e+05 ZBTUWOOD Description: Imputation flag for total wood use ZBTUWOOD n Unweighted Freq Not imputed 1730 0.094 Imputed 244 0.013 Not applicable 16522 
0.893 Total 18496 1.000 TOTALBTU Description: Total usage including electricity, natural gas, propane, and fuel oil, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 1182 74180 1367548 TOTALDOL Description: Total cost including electricity, natural gas, propane, and fuel oil, in dollars, 2020 N Missing Minimum Median Maximum 0 -150.5 1793 20043 "],["exercise-solutions.html", "D Exercise solutions 5 - Descriptive analysis 6 - Statistical testing 7 - Modeling 10 - Specifying sample designs and replicate weights in {srvyr} 13 - National Crime Victimization Survey Vignette 14 - AmericasBarometer Vignette", " D Exercise solutions The chapter exercises use the survey design objects and packages provided in the Prerequisites box in the beginning of the chapter. Please ensure they are loaded in your environment before running the exercise solutions. Code chunks to load these are also included below. library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(prettyunits) library(gt) targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) inc_series &lt;- ncvs_2021_incident %&gt;% mutate( series = case_when(V4017 %in% c(1, 8) ~ 1, V4018 %in% c(2, 8) ~ 1, V4019 %in% c(1, 8) ~ 1, TRUE ~ 2 ), n10v4016 = case_when(V4016 %in% c(997, 998) ~ NA_real_, V4016 &gt; 10 ~ 10, TRUE ~ V4016), serieswgt = case_when(series == 2 &amp; is.na(n10v4016) ~ 6, series == 2 ~ n10v4016, TRUE ~ 1), NEWWGT = WGTVICCY * serieswgt ) inc_ind &lt;- inc_series %&gt;% filter(V4022 != 1) %&gt;% mutate( WeapCat = case_when( is.na(V4049) ~ NA_character_, V4049 == 2 ~ &quot;NoWeap&quot;, V4049 == 3 ~ &quot;UnkWeapUse&quot;, V4050 == 3 ~ 
&quot;Other&quot;, V4051 == 1 | V4052 == 1 | V4050 == 7 ~ &quot;Firearm&quot;, V4053 == 1 | V4054 == 1 ~ &quot;Knife&quot;, TRUE ~ &quot;Other&quot; ), V4529_num = parse_number(as.character(V4529)), ReportPolice = V4399 == 1, Property = V4529_num &gt;= 31, Violent = V4529_num &lt;= 20, Property_ReportPolice = Property &amp; ReportPolice, Violent_ReportPolice = Violent &amp; ReportPolice, AAST = V4529_num %in% 11:13, AAST_NoWeap = AAST &amp; WeapCat == &quot;NoWeap&quot;, AAST_Firearm = AAST &amp; WeapCat == &quot;Firearm&quot;, AAST_Knife = AAST &amp; WeapCat == &quot;Knife&quot;, AAST_Other = AAST &amp; WeapCat == &quot;Other&quot; ) inc_hh_sums &lt;- inc_ind %&gt;% filter(V4529_num &gt; 23) %&gt;% # restrict to household crimes group_by(YEARQ, IDHH) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(starts_with(&quot;Property&quot;), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) inc_pers_sums &lt;- inc_ind %&gt;% filter(V4529_num &lt;= 23) %&gt;% # restrict to person crimes group_by(YEARQ, IDHH, IDPER) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(c(starts_with(&quot;Violent&quot;), starts_with(&quot;AAST&quot;)), ~ sum(. 
* serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) hh_z_list &lt;- rep(0, ncol(inc_hh_sums) - 3) %&gt;% as.list() %&gt;% setNames(names(inc_hh_sums)[-(1:3)]) pers_z_list &lt;- rep(0, ncol(inc_pers_sums) - 4) %&gt;% as.list() %&gt;% setNames(names(inc_pers_sums)[-(1:4)]) hh_vsum &lt;- ncvs_2021_household %&gt;% full_join(inc_hh_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) %&gt;% replace_na(hh_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTHHCY)) pers_vsum &lt;- ncvs_2021_person %&gt;% full_join(inc_pers_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% replace_na(pers_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY)) hh_vsum_der &lt;- hh_vsum %&gt;% mutate( Tenure = factor(case_when(V2015 == 1 ~ &quot;Owned&quot;, !is.na(V2015) ~ &quot;Rented&quot;), levels = c(&quot;Owned&quot;, &quot;Rented&quot;)), Urbanicity = factor(case_when(V2143 == 1 ~ &quot;Urban&quot;, V2143 == 2 ~ &quot;Suburban&quot;, V2143 == 3 ~ &quot;Rural&quot;), levels = c(&quot;Urban&quot;, &quot;Suburban&quot;, &quot;Rural&quot;)), SC214A_num = as.numeric(as.character(SC214A)), Income = case_when(SC214A_num &lt;= 8 ~ &quot;Less than $25,000&quot;, SC214A_num &lt;= 12 ~ &quot;$25,000-49,999&quot;, SC214A_num &lt;= 15 ~ &quot;$50,000-99,999&quot;, SC214A_num &lt;= 17 ~ &quot;$100,000-199,999&quot;, SC214A_num &lt;= 18 ~ &quot;$200,000 or more&quot;), Income = fct_reorder(Income, SC214A_num, .na_rm = FALSE), PlaceSize = case_match(as.numeric(as.character(V2126B)), 0 ~ &quot;Not in a place&quot;, 13 ~ &quot;Under 10,000&quot;, 16 ~ &quot;10,000-49,999&quot;, 17 ~ &quot;50,000-99,999&quot;, 18 ~ &quot;100,000-249,999&quot;, 19 ~ &quot;250,000-499,999&quot;, 20 ~ &quot;500,000-999,999&quot;, c(21, 22, 23) ~ &quot;1,000,000 or more&quot;), PlaceSize = fct_reorder(PlaceSize, as.numeric(V2126B)), Region = case_match(as.numeric(V2127B), 1 ~ &quot;Northeast&quot;, 2 ~ &quot;Midwest&quot;, 3 ~ 
&quot;South&quot;, 4 ~ &quot;West&quot;), Region = fct_reorder(Region, as.numeric(V2127B)) ) NHOPI &lt;- &quot;Native Hawaiian or Other Pacific Islander&quot; pers_vsum_der &lt;- pers_vsum %&gt;% mutate( Sex = factor(case_when(V3018 == 1 ~ &quot;Male&quot;, V3018 == 2 ~ &quot;Female&quot;)), RaceHispOrigin = factor(case_when(V3024 == 1 ~ &quot;Hispanic&quot;, V3023A == 1 ~ &quot;White&quot;, V3023A == 2 ~ &quot;Black&quot;, V3023A == 4 ~ &quot;Asian&quot;, V3023A == 5 ~ NHOPI, TRUE ~ &quot;Other&quot;), levels = c(&quot;White&quot;, &quot;Black&quot;, &quot;Hispanic&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;)), V3014_num = as.numeric(as.character(V3014)), AgeGroup = case_when(V3014_num &lt;= 17 ~ &quot;12-17&quot;, V3014_num &lt;= 24 ~ &quot;18-24&quot;, V3014_num &lt;= 34 ~ &quot;25-34&quot;, V3014_num &lt;= 49 ~ &quot;35-49&quot;, V3014_num &lt;= 64 ~ &quot;50-64&quot;, V3014_num &lt;= 90 ~ &quot;65 or older&quot;), AgeGroup = fct_reorder(AgeGroup, V3014_num), MaritalStatus = factor(case_when(V3015 == 1 ~ &quot;Married&quot;, V3015 == 2 ~ &quot;Widowed&quot;, V3015 == 3 ~ &quot;Divorced&quot;, V3015 == 4 ~ &quot;Separated&quot;, V3015 == 5 ~ &quot;Never married&quot;), levels = c(&quot;Never married&quot;, &quot;Married&quot;, &quot;Widowed&quot;,&quot;Divorced&quot;, &quot;Separated&quot;)) ) %&gt;% left_join(hh_vsum_der %&gt;% select(YEARQ, IDHH, V2117, V2118, Tenure:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) hh_vsum_slim &lt;- hh_vsum_der %&gt;% select(YEARQ:V2118, WGTVICCY:ADJINC_WT, Tenure, Urbanicity, Income, PlaceSize, Region) pers_vsum_slim &lt;- pers_vsum_der %&gt;% select(YEARQ:WGTPERCY, WGTVICCY:ADJINC_WT, Sex:Region) dummy_records &lt;- hh_vsum_slim %&gt;% distinct(V2117, V2118) %&gt;% mutate(Dummy = 1, WGTVICCY = 1, NEWWGT = 1) inc_analysis &lt;- inc_ind %&gt;% mutate(Dummy = 0) %&gt;% left_join(select(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% 
bind_rows(dummy_records) %&gt;% select(YEARQ:IDPER, WGTVICCY, NEWWGT, V4529, WeapCat, ReportPolice, Property:Region) inc_des &lt;- inc_analysis %&gt;% as_survey( weight = NEWWGT, strata = V2117, ids = V2118, nest = TRUE ) hh_des &lt;- hh_vsum_slim %&gt;% as_survey( weight = WGTHHCY, strata = V2117, ids = V2118, nest = TRUE ) pers_des &lt;- pers_vsum_slim %&gt;% as_survey( weight = WGTPERCY, strata = V2117, ids = V2118, nest = TRUE ) 5 - Descriptive analysis How many females have a graduate degree? Hint: the variables Gender and Education will be useful. # Option 1: femgd_option1 &lt;- anes_des %&gt;% filter(Gender == &quot;Female&quot;, Education == &quot;Graduate&quot;) %&gt;% survey_count(name = &quot;n&quot;) femgd_option1 ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 15072196. 837872. # Option 2: femgd_option2 &lt;- anes_des %&gt;% filter(Gender == &quot;Female&quot;, Education == &quot;Graduate&quot;) %&gt;% summarize(N = survey_total(), .groups = &quot;drop&quot;) femgd_option2 ## # A tibble: 1 × 2 ## N N_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 15072196. 837872. Answer: 15,072,196 What percentage of people identify as “Strong Democrat”? Hint: The variable PartyID indicates someone’s party affiliation. psd &lt;- anes_des %&gt;% group_by(PartyID) %&gt;% summarize(p = survey_mean()) %&gt;% filter(PartyID == &quot;Strong democrat&quot;) psd ## # A tibble: 1 × 3 ## PartyID p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Strong democrat 0.219 0.00646 Answer: 21.9% What percentage of people who voted in the 2020 election identify as “Strong Republican”? Hint: The variable VotedPres2020 indicates whether someone voted in 2020. 
psr &lt;- anes_des %&gt;% filter(VotedPres2020 == &quot;Yes&quot;) %&gt;% group_by(PartyID) %&gt;% summarize(p = survey_mean()) %&gt;% filter(PartyID == &quot;Strong republican&quot;) psr ## # A tibble: 1 × 3 ## PartyID p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Strong republican 0.228 0.00824 Answer: 22.8% What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable VotedPres2016 indicates whether someone voted in 2016. pvb &lt;- anes_des %&gt;% filter(!is.na(VotedPres2016), !is.na(VotedPres2020)) %&gt;% group_by(interact(VotedPres2016, VotedPres2020)) %&gt;% summarize(p = survey_prop(vartype = &quot;ci&quot;, prop_method = &quot;logit&quot;)) %&gt;% filter(VotedPres2016 == &quot;Yes&quot;, VotedPres2020 == &quot;Yes&quot;) pvb ## # A tibble: 1 × 5 ## VotedPres2016 VotedPres2020 p p_low p_upp ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Yes Yes 0.794 0.777 0.810 Answer: 79.4% with confidence interval: (77.7%, 81.0%) What is the design effect for the proportion of people who voted early? Hint: The variable EarlyVote2020 indicates whether someone voted early in 2020. pdeff &lt;- anes_des %&gt;% filter(!is.na(EarlyVote2020)) %&gt;% group_by(EarlyVote2020) %&gt;% summarize(p = survey_mean(deff = TRUE)) %&gt;% filter(EarlyVote2020 == &quot;Yes&quot;) pdeff ## # A tibble: 1 × 4 ## EarlyVote2020 p p_se p_deff ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Yes 0.726 0.0247 1.50 Answer: 1.5 What is the median temperature people set their thermostats to at night during the winter? Hint: The variable WinterTempNight indicates the temperature that people set their thermostats to at night in the winter. 
med_wintertempnight &lt;- recs_des %&gt;% summarize(wtn_med = survey_median(x = WinterTempNight, na.rm = TRUE)) med_wintertempnight ## # A tibble: 1 × 2 ## wtn_med wtn_med_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 68 0.250 Answer: 68 People sometimes set their thermostats differently across seasons and times of day. What median temperatures do people set their thermostats to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables WinterTempDay, WinterTempNight, SummerTempDay, and SummerTempNight. # Option 1 med_temps &lt;- recs_des %&gt;% summarize( across(c(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), ~ survey_median(.x, na.rm = TRUE)) ) med_temps ## # A tibble: 1 × 8 ## WinterTempDay WinterTempDay_se WinterTempNight WinterTempNight_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 70 0.250 68 0.250 ## # ℹ 4 more variables: SummerTempDay &lt;dbl&gt;, SummerTempDay_se &lt;dbl&gt;, ## # SummerTempNight &lt;dbl&gt;, SummerTempNight_se &lt;dbl&gt; # Option 2: alternatively, use `survey_quantile()` with quantiles = 0.5, as shown below: quant_temps &lt;- recs_des %&gt;% summarize( across(c(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), ~ survey_quantile(.x, quantiles = 0.5, na.rm = TRUE)) ) quant_temps ## # A tibble: 1 × 8 ## WinterTempDay_q50 WinterTempDay_q50_se WinterTempNight_q50 ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 70 0.250 68 ## # ℹ 5 more variables: WinterTempNight_q50_se &lt;dbl&gt;, ## # SummerTempDay_q50 &lt;dbl&gt;, SummerTempDay_q50_se &lt;dbl&gt;, ## # SummerTempNight_q50 &lt;dbl&gt;, SummerTempNight_q50_se &lt;dbl&gt; Answer: - Winter during the day: 70 - Winter during the night: 68 - Summer during the day: 72 - Summer during the night: 72 What is the correlation between the temperatures that people set their thermostats to at night and during the day in the summer? 
corr_summer_temp &lt;- recs_des %&gt;% summarize(summer_corr = survey_corr(SummerTempNight, SummerTempDay, na.rm = TRUE)) corr_summer_temp ## # A tibble: 1 × 2 ## summer_corr summer_corr_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 0.806 0.00806 Answer: 0.806 What are the 1st, 2nd, and 3rd quartiles of the amount of money spent on energy, by Building America (BA) climate zone? Hint: TOTALDOL indicates the total amount spent on all fuel, and ClimateRegion_BA indicates the BA climate zones. quant_baenergyexp &lt;- recs_des %&gt;% group_by(ClimateRegion_BA) %&gt;% summarize(dol_quant = survey_quantile( TOTALDOL, quantiles = c(0.25, 0.5, 0.75), vartype = &quot;se&quot;, na.rm = TRUE )) quant_baenergyexp ## # A tibble: 8 × 7 ## ClimateRegion_BA dol_quant_q25 dol_quant_q50 dol_quant_q75 ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Mixed-Dry 1091. 1541. 2139. ## 2 Mixed-Humid 1317. 1840. 2462. ## 3 Hot-Humid 1094. 1622. 2233. ## 4 Hot-Dry 926. 1513. 2223. ## 5 Very-Cold 1195. 1986. 2955. ## 6 Cold 1213. 1756. 2422. ## 7 Marine 938. 1380. 1987. ## 8 Subarctic 2404. 3535. 5219. 
## # ℹ 3 more variables: dol_quant_q25_se &lt;dbl&gt;, dol_quant_q50_se &lt;dbl&gt;, ## # dol_quant_q75_se &lt;dbl&gt; Answer: Quartile summary of energy expenditure by BA Climate Zone (climate zone: Q1 / Q2 / Q3). Mixed-Dry: $1,091 / $1,541 / $2,139; Mixed-Humid: $1,317 / $1,840 / $2,462; Hot-Humid: $1,094 / $1,622 / $2,233; Hot-Dry: $926 / $1,513 / $2,223; Very-Cold: $1,195 / $1,986 / $2,955; Cold: $1,213 / $1,756 / $2,422; Marine: $938 / $1,380 / $1,987; Subarctic: $2,404 / $3,535 / $5,219. 6 - Statistical testing Using the RECS data, do more than 50% of U.S. households use AC (ACUsed)? ttest_solution1 &lt;- recs_des %&gt;% svyttest(design = ., formula = ((ACUsed == TRUE) - 0.5) ~ 0, na.rm = TRUE, alternative=&quot;greater&quot;) %&gt;% tidy() ttest_solution1 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 0.387 126. 1.73e-72 58 0.380 0.393 Design-based… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: 88.7% of households use air-conditioning which is significantly different from 50% (p&lt;0.0001) so there is strong evidence that more than 50% of households use air-conditioning. Using the RECS data, does the average temperature that U.S. households set their thermostats to differ between the day and night in the winter (WinterTempDay and WinterTempNight)? 
ttest_solution2 &lt;- recs_des %&gt;% svyttest( design = ., formula = WinterTempDay - WinterTempNight ~ 0, na.rm = TRUE ) %&gt;% tidy() ttest_solution2 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 1.67 45.9 2.82e-47 58 1.59 1.74 Design-based… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: The average difference between daytime and nighttime thermostat settings during the winter is 1.67 degrees, which is significantly different from 0 (p&lt;0.0001), so there is strong evidence that the temperature setting differs between night and day during the winter. Using the ANES data, does the average age (Age) of those who voted for Joseph Biden in 2020 (VotedPres2020_selection) differ from those who voted for another candidate? ttest_solution3 &lt;- anes_des %&gt;% filter(!is.na(VotedPres2020_selection)) %&gt;% svyttest( design = ., formula = Age ~ VotedPres2020_selection == &quot;Biden&quot;, na.rm = TRUE ) %&gt;% tidy() ttest_solution3 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 -3.60 -5.97 0.000000244 50 -4.81 -2.39 Design-ba… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: On average, those who voted for Joseph Biden in 2020 were 3.6 years younger than voters for other candidates, and this difference is statistically significant (p&lt;0.0001). If you wanted to determine if the political party affiliation differed for males and females, what test would you use? a. Goodness of fit test (svygofchisq()) b. Test of independence (svychisq()) c. Test of homogeneity (svychisq()) Answer: c. Test of homogeneity (svychisq()) In the RECS data, is there a relationship between the type of housing unit (HousingUnitType) and the year the house was built (YearMade)? 
chisq_solution2 &lt;- recs_des %&gt;% svychisq( formula = ~ HousingUnitType + YearMade, design = ., statistic = &quot;Wald&quot;, na.rm = TRUE ) chisq_solution2 %&gt;% tidy() ## Multiple parameters; naming those columns ndf, ddf ## # A tibble: 1 × 5 ## ndf ddf statistic p.value method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 32 59 67.9 5.54e-36 Design-based Wald test of association Answer: There is strong evidence (p&lt;0.0001) that there is a relationship between type of housing unit and the year the house was built. In the ANES data, is there a difference in the distribution of gender (Gender) across early voting status in 2020 (EarlyVote2020)? chisq_solution3 &lt;- anes_des %&gt;% svychisq( formula = ~ Gender + EarlyVote2020, design = ., statistic = &quot;F&quot;, na.rm = TRUE ) %&gt;% tidy() ## Multiple parameters; naming those columns ndf, ddf chisq_solution3 ## # A tibble: 1 × 5 ## ndf ddf statistic p.value method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 1 51 4.53 0.0381 Pearson&#39;s X^2: Rao &amp; Scott adjustment Answer: There is evidence of a difference in the distribution of gender by early voting status (p=0.0381). 7 - Modeling The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (HousingUnitType) and total energy expenditure (TOTALDOL)? First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common. expense_by_hut &lt;- recs_des %&gt;% group_by(HousingUnitType) %&gt;% summarize(Expense = survey_mean(TOTALDOL, na.rm = TRUE), HUs = survey_total()) %&gt;% arrange(desc(HUs)) expense_by_hut ## # A tibble: 5 × 5 ## HousingUnitType Expense Expense_se HUs HUs_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Single-family detached 2205. 9.36 77067692. 
0.00000277 ## 2 Apartment: 5 or more units 1108. 13.7 22835862. 0.000000226 ## 3 Apartment: 2-4 Units 1407. 24.2 9341795. 0.119 ## 4 Single-family attached 1653. 22.3 7451177. 0.114 ## 5 Mobile home 1773. 26.2 6832499. 0.0000000927 exp_unit_out &lt;- recs_des %&gt;% mutate(HousingUnitType = fct_infreq(HousingUnitType, NWEIGHT)) %&gt;% svyglm( design = ., formula = TOTALDOL ~ HousingUnitType, na.action = na.omit ) tidy(exp_unit_out) ## # A tibble: 5 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 (Intercept) 2205. 9.36 236. 2.53e-84 ## 2 HousingUnitTypeApartment: 5 or … -1097. 16.5 -66.3 3.52e-54 ## 3 HousingUnitTypeApartment: 2-4 U… -798. 28.0 -28.5 1.37e-34 ## 4 HousingUnitTypeSingle-family at… -551. 25.0 -22.1 5.28e-29 ## 5 HousingUnitTypeMobile home -431. 27.4 -15.7 5.36e-22 Answer: The reference level should be Single-family detached. All p-values are very small, indicating that there is a significant relationship between housing unit type and total energy expenditure. Does temperature play a role in electricity expenditure (DOLLAREL)? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F or below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 (heating degree days) measures how much colder than 65°F a location is. Can electricity expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions. 
temps_sqft_exp &lt;- recs_des %&gt;% svyglm( design = ., formula = DOLLAREL ~ (TOTSQFT_EN + CDD65 + HDD65) ^ 2, na.action = na.omit ) tidy(temps_sqft_exp) %&gt;% mutate(p.value=pretty_p_value(p.value) %&gt;% str_pad(7)) ## # A tibble: 7 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 (Intercept) 741. 70.5 10.5 &quot;&lt;0.0001&quot; ## 2 TOTSQFT_EN 0.272 0.0471 5.77 &quot;&lt;0.0001&quot; ## 3 CDD65 0.0293 0.0227 1.29 &quot; 0.2024&quot; ## 4 HDD65 -0.00111 0.0104 -0.107 &quot; 0.9149&quot; ## 5 TOTSQFT_EN:CDD65 0.0000459 0.0000154 2.97 &quot; 0.0044&quot; ## 6 TOTSQFT_EN:HDD65 -0.00000840 0.00000633 -1.33 &quot; 0.1902&quot; ## 7 CDD65:HDD65 0.00000533 0.00000355 1.50 &quot; 0.1390&quot; Answer: There is a significant interaction between square footage and cooling degree days in the model, and square footage is a significant predictor of electricity expenditure. Continuing with our results from Exercise 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures. Answer: temps_sqft_exp_fit &lt;- temps_sqft_exp %&gt;% augment() %&gt;% mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)), # extract the variance of the fitted value .fitted = as.numeric(.fitted)) temps_sqft_exp_fit %&gt;% ggplot(aes(x = DOLLAREL, y = .fitted)) + geom_point() + geom_abline(intercept = 0, slope = 1, color = &quot;red&quot;) + xlab(&quot;Actual expenditures&quot;) + ylab(&quot;Predicted expenditures&quot;) + theme_minimal() FIGURE D.1: Actual and predicted electricity expenditures temps_sqft_exp_fit %&gt;% ggplot(aes(x = .fitted, y = .resid)) + geom_point() + geom_hline(yintercept = 0, color = &quot;red&quot;) + xlab(&quot;Predicted expenditure&quot;) + ylab(&quot;Residual value of expenditure&quot;) + theme_minimal() FIGURE D.2: Residual plot of electric cost model with covariates TOTSQFT_EN, CDD65, and HDD65 Early voting expanded in 2020. 
Build a logistic model predicting early voting in 2020 (EarlyVote2020) using age (Age), education (Education), and party identification (PartyID). Include two-way interactions. Answer: earlyvote_mod &lt;- anes_des %&gt;% filter(!is.na(EarlyVote2020)) %&gt;% svyglm( design = ., formula = EarlyVote2020 ~ (Age + Education + PartyID) ^ 2 , family = quasibinomial ) tidy(earlyvote_mod) %&gt;% print(n=50) ## # A tibble: 46 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 (Intercept) 3.28e-1 3.86 0.0848 0.940 ## 2 Age -2.20e-2 0.0579 -0.379 0.741 ## 3 EducationHigh school -2.56e+0 3.89 -0.658 0.578 ## 4 EducationPost HS -3.27e+0 3.97 -0.823 0.497 ## 5 EducationBachelor&#39;s -3.29e+0 3.91 -0.842 0.489 ## 6 EducationGraduate -1.36e+0 3.91 -0.349 0.761 ## 7 PartyIDNot very strong democrat 2.00e+0 3.30 0.605 0.607 ## 8 PartyIDIndependent-democrat 3.38e+0 2.60 1.30 0.323 ## 9 PartyIDIndependent 5.22e+0 2.25 2.32 0.146 ## 10 PartyIDIndependent-republican -1.95e+1 2.42 -8.09 0.0149 ## 11 PartyIDNot very strong republic… -1.33e+1 3.24 -4.10 0.0546 ## 12 PartyIDStrong republican 3.13e+0 2.18 1.44 0.287 ## 13 Age:EducationHigh school 4.72e-2 0.0592 0.796 0.509 ## 14 Age:EducationPost HS 5.25e-2 0.0588 0.892 0.467 ## 15 Age:EducationBachelor&#39;s 4.76e-2 0.0600 0.793 0.511 ## 16 Age:EducationGraduate 8.65e-3 0.0578 0.150 0.895 ## 17 Age:PartyIDNot very strong demo… -2.28e-2 0.0497 -0.459 0.691 ## 18 Age:PartyIDIndependent-democrat -7.03e-2 0.0285 -2.46 0.133 ## 19 Age:PartyIDIndependent -8.00e-2 0.0302 -2.65 0.118 ## 20 Age:PartyIDIndependent-republic… 6.72e-2 0.0378 1.78 0.217 ## 21 Age:PartyIDNot very strong repu… -3.07e-2 0.0420 -0.732 0.540 ## 22 Age:PartyIDStrong republican -3.84e-2 0.0180 -2.14 0.166 ## 23 EducationHigh school:PartyIDNot… -1.24e+0 2.22 -0.557 0.633 ## 24 EducationPost HS:PartyIDNot ver… -8.95e-1 2.16 -0.413 0.719 ## 25 EducationBachelor&#39;s:PartyIDNot … -1.21e+0 2.29 -0.528 0.650 ## 26 
EducationGraduate:PartyIDNot ve… -1.90e+0 2.25 -0.844 0.487 ## 27 EducationHigh school:PartyIDInd… 7.84e-1 2.50 0.314 0.783 ## 28 EducationPost HS:PartyIDIndepen… 4.04e-1 2.31 0.175 0.877 ## 29 EducationBachelor&#39;s:PartyIDInde… 5.00e-1 2.60 0.193 0.865 ## 30 EducationGraduate:PartyIDIndepe… -1.48e+1 2.47 -5.99 0.0268 ## 31 EducationHigh school:PartyIDInd… -6.32e-1 1.72 -0.368 0.748 ## 32 EducationPost HS:PartyIDIndepen… -9.27e-2 1.63 -0.0568 0.960 ## 33 EducationBachelor&#39;s:PartyIDInde… -2.62e-1 2.13 -0.123 0.913 ## 34 EducationGraduate:PartyIDIndepe… -1.42e+1 1.75 -8.12 0.0148 ## 35 EducationHigh school:PartyIDInd… 1.55e+1 2.56 6.05 0.0262 ## 36 EducationPost HS:PartyIDIndepen… 1.48e+1 2.77 5.34 0.0333 ## 37 EducationBachelor&#39;s:PartyIDInde… 1.77e+1 2.32 7.64 0.0167 ## 38 EducationGraduate:PartyIDIndepe… 1.65e+1 2.33 7.10 0.0193 ## 39 EducationHigh school:PartyIDNot… 1.59e+1 2.02 7.88 0.0157 ## 40 EducationPost HS:PartyIDNot ver… 1.62e+1 1.69 9.54 0.0108 ## 41 EducationBachelor&#39;s:PartyIDNot … 1.58e+1 1.93 8.18 0.0146 ## 42 EducationGraduate:PartyIDNot ve… 1.54e+1 1.72 8.95 0.0123 ## 43 EducationHigh school:PartyIDStr… -2.06e+0 1.88 -1.10 0.387 ## 44 EducationPost HS:PartyIDStrong … 9.17e-2 2.01 0.0456 0.968 ## 45 EducationBachelor&#39;s:PartyIDStro… 6.87e-2 2.06 0.0333 0.976 ## 46 EducationGraduate:PartyIDStrong… -8.53e-1 1.81 -0.471 0.684 Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican. 
add_vote_dat &lt;- anes_2020 %&gt;% select(EarlyVote2020, Age, Education, PartyID) %&gt;% rbind(tibble( EarlyVote2020 = NA, Age = 28, Education = &quot;Graduate&quot;, PartyID = c(&quot;Strong democrat&quot;, &quot;Strong republican&quot;) )) %&gt;% tail(2) log_ex_2_out &lt;- earlyvote_mod %&gt;% augment(newdata = add_vote_dat, type.predict = &quot;response&quot;) %&gt;% mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)), # extract the variance of the fitted value .fitted = as.numeric(.fitted)) log_ex_2_out ## # A tibble: 2 × 6 ## EarlyVote2020 Age Education PartyID .fitted .se.fit ## &lt;fct&gt; &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 &lt;NA&gt; 28 Graduate Strong democrat 0.197 0.150 ## 2 &lt;NA&gt; 28 Graduate Strong republican 0.450 0.244 Answer: We predict that the 28-year-old with a graduate degree who identifies as a strong Democrat will vote early 19.7% of the time, while a person who is otherwise similar but is a strong Republican will vote early 45% of the time. 10 - Specifying sample designs and replicate weights in {srvyr} The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description (National Center for Health Statistics 2023). The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). You have imported the data and the variable containing the data is: nhis_adult_data. How would you specify the design in {srvyr} using either as_survey_design() or as_survey_rep()? 
Answer: nhis_adult_des &lt;- nhis_adult_data %&gt;% as_survey_design( ids = PPSU, strata = PSTRAT, nest = TRUE, weights = WTFA_A ) The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R (Davern et al. 2021). You have imported the data and the variable containing the data is: gss_data. How would you specify the design in R using either as_survey_design() or as_survey_rep()? Answer: gss_des &lt;- gss_data %&gt;% as_survey_design(ids = VPSU_2, strata = VSTRAT_2, weights = WTSSNR_2) 13 - National Crime Victimization Survey Vignette What proportion of completed motor vehicle thefts are not reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529). ans1 &lt;- inc_des %&gt;% filter(str_detect(V4529, &quot;40|41&quot;)) %&gt;% summarize(Pct = survey_mean(!ReportPolice, na.rm = TRUE) * 100) Answer: It is estimated that 23.1% of motor vehicle thefts are not reported to the police. How many violent crimes occur in each region? 
Answer: inc_des %&gt;% filter(Violent) %&gt;% survey_count(Region) %&gt;% select(-n_se) %&gt;% gt(rowname_col=&quot;Region&quot;) %&gt;% fmt_integer() %&gt;% cols_label( n =&quot;Violent victimizations&quot;, ) %&gt;% tab_header(&quot;Estimated number of violent crimes by region&quot;) Estimated number of violent crimes by region (violent victimizations). Northeast: 698,406; Midwest: 1,144,407; South: 1,394,214; West: 1,361,278. What is the property victimization rate among each income level? Answer: hh_des %&gt;% filter(!is.na(Income)) %&gt;% group_by(Income) %&gt;% summarize(Property_Rate = survey_mean(Property * ADJINC_WT * 1000, na.rm = TRUE)) %&gt;% gt(rowname_col=&quot;Income&quot;) %&gt;% cols_label( Property_Rate=&quot;Rate&quot;, Property_Rate_se=&quot;Standard Error&quot; ) %&gt;% fmt_number(decimals=1) %&gt;% tab_header(&quot;Estimated property victimization rate by income level&quot;) #xqossclppl .gt_column_spanner_outer { color: #333333; 
background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; padding-top: 0; padding-bottom: 0; padding-left: 4px; padding-right: 4px; } #xqossclppl .gt_column_spanner_outer:first-child { padding-left: 0; } #xqossclppl .gt_column_spanner_outer:last-child { padding-right: 0; } #xqossclppl .gt_column_spanner { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 5px; overflow-x: hidden; display: inline-block; width: 100%; } #xqossclppl .gt_spanner_row { border-bottom-style: hidden; } #xqossclppl .gt_group_heading { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: middle; text-align: left; } #xqossclppl .gt_empty_group_heading { padding: 0.5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: middle; } #xqossclppl .gt_from_md > :first-child { margin-top: 0; } #xqossclppl .gt_from_md > :last-child { margin-bottom: 0; } #xqossclppl .gt_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; margin: 10px; border-top-style: solid; border-top-width: 1px; border-top-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: middle; 
overflow-x: hidden; } #xqossclppl .gt_stub { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_stub_row_group { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; vertical-align: top; } #xqossclppl .gt_row_group_first td { border-top-width: 2px; } #xqossclppl .gt_row_group_first th { border-top-width: 2px; } #xqossclppl .gt_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_first_summary_row { border-top-style: solid; border-top-color: #D3D3D3; } #xqossclppl .gt_first_summary_row.thick { border-top-width: 2px; } #xqossclppl .gt_last_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_grand_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_first_grand_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-top-style: double; border-top-width: 6px; border-top-color: #D3D3D3; } #xqossclppl .gt_last_grand_summary_row_top { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: double; border-bottom-width: 6px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_striped { background-color: rgba(128, 128, 128, 0.05); } #xqossclppl .gt_table_body { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; 
border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_footnotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #xqossclppl .gt_footnote { margin: 0px; font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #xqossclppl .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_left { text-align: left; } #xqossclppl .gt_center { text-align: center; } #xqossclppl .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #xqossclppl .gt_font_normal { font-weight: normal; } #xqossclppl .gt_font_bold { font-weight: bold; } #xqossclppl .gt_font_italic { font-style: italic; } #xqossclppl .gt_super { font-size: 65%; } #xqossclppl .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #xqossclppl .gt_asterisk { font-size: 100%; vertical-align: 0; } #xqossclppl .gt_indent_1 { text-indent: 5px; } #xqossclppl .gt_indent_2 { text-indent: 10px; } #xqossclppl .gt_indent_3 { text-indent: 15px; } #xqossclppl .gt_indent_4 { text-indent: 20px; } #xqossclppl .gt_indent_5 { text-indent: 25px; } Estimated property victimization rate by income level Rate Standard Error Less than $25,000 110.6 5.0 $25,000-49,999 89.5 3.4 $50,000-99,999 87.8 3.3 $100,000-199,999 76.5 3.5 $200,000 or more 91.8 5.7 What is the difference between the violent victimization 
rate between males and females? Is the difference statistically significant? vr_gender &lt;- pers_des %&gt;% group_by(Sex) %&gt;% summarize( Violent_rate=survey_mean(Violent * ADJINC_WT * 1000, na.rm=TRUE) ) vr_gender_test &lt;- pers_des %&gt;% mutate( Violent_Adj=Violent * ADJINC_WT * 1000 ) %&gt;% svyttest( formula = Violent_Adj ~ Sex, design = ., na.rm = TRUE ) %&gt;% broom::tidy() ## Warning in summary.glm(g): observations with zero weight not used for ## calculating dispersion ## Warning in summary.glm(glm.object): observations with zero weight not ## used for calculating dispersion Answer: The difference between the male and female victimization rates is estimated at 1.9 victimizations per 1,000 people and is not statistically significant (p-value = 0.1560). 14 - AmericasBarometer Vignette Calculate the percentage of households with broadband internet and the percentage with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: If you come across countries with 0% internet usage, you may want to filter by something first. 
Answer: int_ests &lt;- ambarom_des %&gt;% filter(!is.na(Internet) | !is.na(BroadbandInternet)) %&gt;% group_by(Country) %&gt;% summarize( p_broadband = survey_mean(BroadbandInternet, na.rm = TRUE) * 100, p_internet = survey_mean(Internet, na.rm = TRUE) * 100 ) int_ests %&gt;% gt(rowname_col = &quot;Country&quot;) %&gt;% fmt_number(decimals=1) %&gt;% tab_spanner( label=&quot;Broadband at home&quot;, columns=c(p_broadband, p_broadband_se) ) %&gt;% tab_spanner( label=&quot;Internet at home&quot;, columns=c(p_internet, p_internet_se) ) %&gt;% cols_label( p_broadband=&quot;Percent&quot;, p_internet=&quot;Percent&quot;, p_broadband_se=&quot;S.E.&quot;, p_internet_se=&quot;S.E.&quot;, ) Broadband at home Internet at home Percent S.E. Percent S.E. Argentina 62.3 1.1 86.2 0.9 Bolivia 41.4 1.0 77.2 1.0 Brazil 68.3 1.2 88.9 0.9 Chile 63.1 1.1 93.5 0.5 Colombia 45.7 1.2 68.7 1.1 Costa Rica 49.6 1.1 84.4 0.8 Dominican Republic 37.1 1.0 73.7 1.0 Ecuador 59.7 1.1 79.9 0.9 El Salvador 30.2 0.9 63.9 1.0 Guatemala 33.4 1.0 61.5 1.1 Guyana 63.7 1.1 86.8 0.8 Haiti 11.8 0.8 58.5 1.2 Honduras 28.2 1.0 60.7 1.1 Jamaica 64.2 1.0 91.5 0.6 Mexico 44.9 1.1 70.9 1.0 Nicaragua 39.1 1.1 76.3 1.1 Panama 43.4 1.0 73.1 1.0 Paraguay 33.3 1.0 72.9 1.0 Peru 42.4 1.1 71.1 1.1 Uruguay 62.7 1.1 90.6 0.7 Create a faceted map showing both broadband internet and any internet usage. 
Answer: library(sf) library(rnaturalearth) library(ggpattern) internet_sf &lt;- country_shape_upd %&gt;% full_join(select(int_ests, p = p_internet, geounit = Country), by = &quot;geounit&quot;) %&gt;% mutate(Type = &quot;Internet&quot;) broadband_sf &lt;- country_shape_upd %&gt;% full_join(select(int_ests, p = p_broadband, geounit = Country), by = &quot;geounit&quot;) %&gt;% mutate(Type = &quot;Broadband&quot;) b_int_sf &lt;- internet_sf %&gt;% bind_rows(broadband_sf) %&gt;% filter(region_wb == &quot;Latin America &amp; Caribbean&quot;) b_int_sf %&gt;% ggplot(aes(fill = p), color=&quot;darkgray&quot;) + geom_sf() + facet_wrap( ~ Type) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;Percent&quot;, labels = scales::comma, colors = c(&quot;#BFD7EA&quot;, &quot;#087E8B&quot;, &quot;#0B3954&quot;), na.value = NA ) + geom_sf_pattern( data = filter(b_int_sf, is.na(p)), pattern = &quot;crosshatch&quot;, pattern_fill = &quot;lightgray&quot;, pattern_color = &quot;lightgray&quot;, fill = NA, color = &quot;darkgray&quot; ) + theme_minimal() FIGURE D.3: Percent of broadband internet and any internet usage, Central and South America References Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf. National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf. https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php↩︎ https://www.npr.org/2020/10/26/927803214/62-million-and-counting-americans-are-breaking-early-voting-records↩︎ "],["references.html", "References", " References American National Election Studies. 2021. 
“ANES 2020 Time Series Study: Pre-Election and Post-Election Survey Questionnaires.” https://electionstudies.org/wp-content/uploads/2021/07/anes_timeseries_2020_questionnaire_20210719.pdf. ———. 2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. Biemer, Paul P. 2010. “Total Survey Error: Design, Implementation, and Evaluation.” Public Opinion Quarterly 74 (5): 817–48. https://doi.org/10.1093/poq/nfq058. Biemer, Paul P., and Lars E. Lyberg. 2003. Introduction to Survey Quality. John Wiley &amp; Sons. Biemer, Paul P., Joe Murphy, Stephanie Zimmer, Chip Berry, Grace Deng, and Katie Lewis. 2017. “Using Bonus Monetary Incentives to Encourage Web Response in Mixed-Mode Household Surveys.” Journal of Survey Statistics and Methodology 6 (2): 240–61. https://doi.org/10.1093/jssam/smx015. Bollen, Kenneth A., Paul P. Biemer, Alan F. Karr, Stephen Tueller, and Marcus E. Berzofsky. 2016. “Are Survey Weights Needed? A Review of Diagnostic Tests in Regression Analysis.” Annual Review of Statistics and Its Application 3 (1): 375–92. https://doi.org/10.1146/annurev-statistics-011516-012958. Bradburn, Norman M., Seymour Sudman, and Brian Wansink. 2004. Asking Questions: The Definitive Guide to Questionnaire Design. 2nd Edition. Jossey-Bass. Bryan, Jenny. 2023. Happy Git and GitHub for the useR. https://happygitwithr.com/. Bureau of Justice Statistics. 2017. “National Crime Victimization Survey, 2016: Technical Documentation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvstd16.pdf. Centers for Disease Control and Prevention (CDC). 2021. “Behavioral Risk Factor Surveillance System Survey Questionnaire.” U.S. 
Department of Health and Human Services, Centers for Disease Control and Prevention; https://www.cdc.gov/brfss/questionnaires/pdf-ques/2021-BRFSS-Questionnaire-1-19-2022-508.pdf. Cochran, William G. 1977. Sampling Techniques. John Wiley &amp; Sons. Cox, Brenda G, David A Binder, B Nanjamma Chinnappa, Anders Christianson, Michael J Colledge, and Phillip S Kott. 2011. Business Survey Methods. John Wiley &amp; Sons. Csardi, Gabor. 2023. prettyunits: Pretty, Human Readable Formatting of Quantities. https://github.com/r-lib/prettyunits. Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. DeBell, Matthew, Michelle Amsbary, Ted Brader, Shelley Brock, Cindy Good, Justin Kamens, Natalya Maisel, and Sarah Pinto. 2022. “Methodology Report for the ANES 2020 Time Series Study.” https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf. DeLeeuw, Edith D. 2005. “To Mix or Not to Mix Data Collection Modes in Surveys.” Journal of Official Statistics 21: 233–55. ———. 2018. “Mixed-Mode: Past, Present, and Future.” Survey Research Methods 12 (2): 75–89. https://doi.org/10.18148/srm/2018.v12i2.7402. Deming, W Edwards. 1991. Sample Design in Business Research. Vol. 23. John Wiley &amp; Sons. Dillman, Don A, Jolene D Smyth, and Leah Melani Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. John Wiley &amp; Sons. FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. ggpattern: Ggplot2 Pattern Geoms. Fowler, Floyd J, and Thomas W. Mangione. 
1989. Standardized Survey Interviewing. SAGE. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: Dplyr-Like Syntax for Summary Statistics of Survey Data. Fuller, Wayne A. 2011. Sampling Statistics. John Wiley &amp; Sons. Gard, Arianna M., Luke W. Hyde, Steven G. Heeringa, Brady T. West, and Colter Mitchell. 2023. “Why Weight? Analytic Approaches for Large-Scale Population Neuroscience Data.” Developmental Cognitive Neuroscience. https://doi.org/10.1016/j.dcn.2023.101196. Gelman, Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (2): 153–64. https://doi.org/10.1214/088342306000000691. Groves, Robert M, Floyd J Fowler Jr, Mick P Couper, James M Lepkowski, Eleanor Singer, and Roger Tourangeau. 2009. Survey Methodology. John Wiley &amp; Sons. Harter, Rachel, Michael P Battaglia, Trent D Buskirk, Don A Dillman, Ned English, Mansour Fahimi, Martin R Frankel, et al. 2016. “Address-Based Sampling.” Task force report. American Association for Public Opinion Research; https://aapor.org/wp-content/uploads/2022/11/AAPOR_Report_1_7_16_CLEAN-COPY-FINAL-2.pdf. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. Kim, Jae Kwang, and Jun Shao. 2021. Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959. LAPOP. 2021a. “AmericasBarometer 2021 - Canada: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021b. 
“AmericasBarometer 2021 - U.S.: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABUSA2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021c. “AmericasBarometer 2021: Technical Information.” Vanderbilt University; https://www.vanderbilt.edu/lapop/ab2021/AB2021-Technical-Report-v1.0-FINAL-eng-030722.pdf. ———. 2021d. “Core Questionnaire.” https://www.vanderbilt.edu/lapop/ab2021/AB2021-Core-Questionnaire-v17.5-Eng-210514-W-v2.pdf. ———. 2023a. “About the AmericasBarometer.” https://www.vanderbilt.edu/lapop/about-americasbarometer.php. ———. 2023b. “The AmericasBarometer by the LAPOP Lab.” www.vanderbilt.edu/lapop. Larmarange, Joseph. 2023. labelled: Manipulating Labelled Data. https://larmarange.github.io/labelled/. Levy, Paul S, and Stanley Lemeshow. 2013. Sampling of Populations: Methods and Applications. John Wiley &amp; Sons. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Mack, Christina, Zhaohui Su, and Daniel Westreich. 2018. “Types of Missing Data.” In Managing Missing Data in Patient Registries: Addendum to Registries for Evaluating Patient Outcomes: A User’s Guide, Third Edition [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); https://www.ncbi.nlm.nih.gov/books/NBK493614/. Massicotte, Philippe, and Andy South. 2023. rnaturalearth: World Map Data from Natural Earth. https://docs.ropensci.org/rnaturalearth/ https://github.com/ropensci/rnaturalearth. McCullagh, Peter, and John Ashworth Nelder. 1989. “Binary Data.” In Generalized Linear Models, 98–148. Springer. Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf. Ooms, Jeroen. 2014. 
“The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805. Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With Applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016. Penn State. 2019. “STAT 506: Sampling Theory and Methods [Online Course].” https://online.stat.psu.edu/stat506/. R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/. Recht, Hannah. 2024. censusapi: Retrieve Data from the Census APIs. Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles. Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 2003. Model Assisted Survey Sampling. Springer Science &amp; Business Media. Schafer, Joseph L, and John W Graham. 2002. “Missing Data: Our View of the State of the Art.” Psychological Methods 7: 147–77. https://doi.org/10.1037//1082-989X.7.2.147. Schouten, Barry, Andy Peytchev, and James Wagner. 2018. Adaptive Survey Design. Chapman &amp; Hall/CRC Press. Scott, Alastair. 2007. “Rao-Scott Corrections and Their Impact.” In Section on Survey Research Methods, 3514–18. http://www.asasrms.org/Proceedings/y2007/Files/JSM2007-000874.pdf. Shah, Babubhai V, and Akhil K Vaish. 2006. “Confidence Intervals for Quantile Estimation from Complex Survey Data.” In Proceedings of the Section on Survey Research Methods. http://www.asasrms.org/Proceedings/y2006/Files/JSM2006-000749.pdf. Shook-Sa, Bonnie, G. Lance Couzens, and Marcus Berzofsky. 2015. “Users’ Guide to the National Crime Victimization Survey (NCVS) Direct Variance Estimation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf; Bureau of Justice Statistics. Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. 
“Reproducible Summary Tables with the Gtsummary Package.” The R Journal 13: 570–80. https://doi.org/10.32614/RJ-2021-053. Skinner, Chris. 2009. “Chapter 15: Statistical Disclosure Control for Survey Data.” In Handbook of Statistics: Sample Surveys: Design, Methods and Applications, edited by C. R. Rao, 381–96. Elsevier B.V. Sprunt, Barbara. 2020. “93 Million and Counting: Americans Are Shattering Early Voting Records.” National Public Radio. Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” JOSS 2 (16): 355. https://doi.org/10.21105/joss.00355. Tierney, Nicholas, and Dianne Cook. 2023. “Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.” Journal of Statistical Software 105 (7): 1–31. https://doi.org/10.18637/jss.v105.i07. Tourangeau, Roger, Mick P. Couper, and Frederick Conrad. 2004. “Spacing, Position, and Order: Interpretive Heuristics for Visual Features of Survey Questions.” Public Opinion Quarterly 68: 368–93. Tourangeau, Roger, Lance J. Rips, and Kenneth Rasinski. 2000. Psychology of Survey Response. Cambridge University Press. United States. Bureau of Justice Statistics. 2022. “National Crime Victimization Survey, [United States], 2021.” https://www.icpsr.umich.edu/web/NACJD/studies/38429; Inter-university Consortium for Political and Social Research [distributor]. https://doi.org/10.3886/ICPSR38429.v1. U.S. Census Bureau. 2021. “Understanding and Using the American Community Survey Public Use Microdata Sample Files What Data Users Need to Know.” U.S. Government Printing Office; https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf. Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. srvyrexploR: Data Supplement for Exploring Complex Survey Data Analysis in R. U.S. Energy Information Administration. 2017. 
“Residential Energy Consumption Survey (RECS): Using the 2015 microdata file to compute estimates and standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2015/pdf/microdata_v3.pdf. ———. 2023a. “2020 Residential Energy Consumption Survey: Consumption and Expenditures Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS%20CE%20Methodology_Final.pdf. ———. 2023b. “2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. ———. 2023c. “2020 Residential Energy Consumption Survey: Using the microdata file to compute estimates and relative standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2020/pdf/microdata-guide.pdf. ———. 2023d. “Units and Calculators Explained: Degree Days.” https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php. Ushey, Kevin, and Hadley Wickham. 2023. renv: Project Environments. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Walker, Kyle, and Matt Herman. 2024. tidycensus: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames. https://walker-data.com/tidycensus/. Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org. ———. 2019. Advanced R. https://adv-r.hadley.nz/; CRC press. ———. 2023a. forcats: Tools for Working with Categorical Variables (Factors). ———. 2023b. httr2: Perform HTTP Requests and Process the Responses. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. 
“Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. https://r4ds.hadley.nz/; O’Reilly Media. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools. Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. readr: Read Rectangular Text Data. Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export SPSS, Stata and SAS Files. Wolter, Kirk M. 2007. Introduction to Variance Estimation. Vol. 53. Springer. Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman &amp; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook. "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]
+[["index.html", "Exploring Complex Survey Data Analysis in R A Tidy Introduction with srvyr Dedication", " Exploring Complex Survey Data Analysis in R A Tidy Introduction with srvyr Stephanie Zimmer, Rebecca J. Powell, and Isabella Velásquez 2024-04-24 Dedication To Will, Tom, and Drew, thanks for all the help with additional chores and plenty of Git consulting! "],["c01-intro.html", "Chapter 1 Introduction 1.1 Survey analysis in R 1.2 What to expect 1.3 Prerequisites 1.4 Datasets used in this book 1.5 Conventions 1.6 Getting help 1.7 Acknowledgements 1.8 Colophon", " Chapter 1 Introduction Surveys are valuable tools for gathering information about a population. Researchers, governments, and businesses use surveys to better understand public opinion and behaviors. For example, a non-profit group may analyze societal trends to measure their impact, government agencies may study behaviors to inform policy, or companies may seek to learn customer product preferences to refine business strategy. With survey data, we can explore the world around us. Surveys are often conducted with a sample of the population. Therefore, to use the survey data to understand the population, we use weights to adjust the survey results for unequal probabilities of selection, non-response, and post-stratification. These adjustments ensure the sample accurately represents the population of interest (Gard et al. 2023). To account for the intricate nature of the survey design, analysts rely on statistical software such as SAS, Stata, SUDAAN, and R. In this book, we focus on R to introduce survey analysis. Our goal is to provide a comprehensive guide for individuals new to survey analysis but with some familiarity with statistics and R programming. We use a combination of the {survey} and {srvyr} packages and present the code following best practices from the tidyverse (Freedman Ellis and Schneider 2023; Lumley 2010; Wickham et al. 2019). 
1.1 Survey analysis in R The {survey} package was released on the Comprehensive R Archive Network (CRAN) in 2003 and has been continuously developed over time. This package, primarily authored by Thomas Lumley, offers an extensive array of features, including: Calculation of point estimates and their associated variances, including means, totals, ratios, quantiles, and proportions Estimation of regression models, including generalized linear models, log-linear models, and survival curves Variances by Taylor linearization or by replicate weights, including balanced repeated replication, jackknife, bootstrap, multistage bootstrap, or user-supplied methods Hypothesis testing for means, proportions, and other parameters The {srvyr} package builds on the {survey} package by providing wrappers for functions that align with the tidyverse philosophy. This is our motivation for using and recommending the {srvyr} package. We find that it is user-friendly for those familiar with the tidyverse packages in R. For example, while many functions in the {survey} package access variables through formulas, the {srvyr} package uses tidy selection to pass variable names, a common feature in the tidyverse (Henry and Wickham 2022). Users of the tidyverse are also likely familiar with the magrittr pipe operator (%&gt;%), which seamlessly works with functions from the {srvyr} package. Moreover, several common functions from {dplyr}, such as filter(), mutate(), and summarize(), can be applied to survey objects (Wickham et al. 2023). This enables users to streamline their analysis workflow and leverage the benefits of both the {srvyr} and {tidyverse} packages. While the {srvyr} package offers many advantages, there is one notable limitation: it doesn’t fully incorporate the modeling capabilities of the {survey} package into tidy wrappers. When discussing modeling and hypothesis testing, we primarily rely on the {survey} package. 
However, we provide information on how to apply the pipe operator to these functions to maintain clarity and consistency in analyses. 1.2 What to expect This book covers many aspects of survey design and analysis, from understanding how to create design objects to conducting descriptive analysis, statistical tests, and models. We emphasize coding best practices and effective presentation techniques while using real-world data and practical examples to help readers gain proficiency in survey analysis. Below is a summary of each chapter: Chapter 2 - Overview of Surveys: Overview of survey design processes References for more in-depth knowledge Chapter 3 - Survey data documentation: Guide to survey documentation types How to read survey documentation Chapter 4 - Getting started: Installation of packages Introduction to the {srvyrexploR} package and its analytic datasets Outline of the survey analysis process Comparison between the {dplyr} and {srvyr} packages Chapter 5 - Descriptive analyses: Calculation of point estimates Estimation of standard errors and confidence intervals Calculation of design effects Chapter 6 - Statistical testing: Statistical testing methods Comparison of means and proportions Goodness of fit tests, tests of independence, and tests of homogeneity Chapter 7 - Modeling: Overview of model formula specifications Linear regression, ANOVA, and logistic regression modeling Chapter 8 - Communication of results: Strategies for communicating survey results Tools and guidance for creating publishable tables and graphs Chapter 9 - Reproducible research: Tools and methods for achieving reproducibility Resources for reproducible research Chapter 10 - Sample designs and replicate weights: Overview of common sampling designs Replicate weight methods How to specify survey designs in R Chapter 11 - Missing data: Overview of missing data in surveys Approaches to dealing with missing data Chapter 12 - Successful survey analysis recommendations: Tips for 
successful analysis Recommendations for debugging Chapter 13 - National Crime Victimization Survey Vignette: Vignette on analyzing National Crime Victimization Survey (NCVS) data Illustration of analysis requiring multiple files for victimization rates Chapter 14 - AmericasBarometer Vignette: Vignette on analyzing AmericasBarometer survey data Creation of choropleth maps with survey estimates The majority of chapters contain code that readers can follow. Each of these chapters starts with a “Prerequisites” section, which includes the code needed to load the packages and datasets used in the chapter. We then provide the main idea of the chapter and examples of how to use the functions. Most chapters conclude with exercises to work through. We provide the solutions to the exercises in the online version of the book. While we provide a brief overview of survey methodology and statistical theory, this book is not intended to be the sole resource for these topics. We reference other materials and encourage readers to seek them out for more information. 1.3 Prerequisites To get the most out of this book, we assume a survey has already been conducted and readers have obtained a microdata file. Microdata, also known as respondent-level or row-level data, differs from summarized data typically found in tables. They contain individual survey responses, along with analysis weights and design variables such as strata or clusters. Additionally, the survey data should already include weights and design variables. These are required to accurately calculate unbiased estimates. The concepts and techniques discussed in this book help readers to extract meaningful insights from survey data, but do not cover how to create weights as this is a separate complex topic. If weights are not already created for the survey data, we recommend reviewing other resources focused on weight creation such as Valliant and Dever (2018). 
This book is tailored for analysts already familiar with R and the tidyverse, but who may be new to complex survey analysis in R. We anticipate that readers of this book can: Install R and their Integrated Development Environment (IDE) of choice, such as RStudio Install and load packages from CRAN and GitHub repositories Run R code Read data from a folder or their working directory Understand fundamental tidyverse concepts such as tidy/long/wide data, tibbles, the magrittr pipe (%&gt;%), and tidy selection Use the tidyverse packages to wrangle, tidy, and visualize data If these concepts or skills are new, we recommend starting with introductory resources to cover these topics before reading this book. R for Data Science (Wickham, Çetinkaya-Rundel, and Grolemund 2023) is a beginner-friendly guide for getting started in data science using R. It offers guidance on preliminary installation steps, basic R syntax, and tidyverse concepts and packages. 1.4 Datasets used in this book We work with two key datasets throughout the book: the Residential Energy Consumption Survey (RECS – U.S. Energy Information Administration 2023b) and the American National Election Studies (ANES – DeBell 2010). We introduce the loading and preparation of these datasets in Chapter 4. 1.5 Conventions Throughout the book, we use the following typographical conventions: Package names are surrounded by curly brackets: {srvyr} Function names are in constant-width text format and include parentheses: survey_mean() Object and variable names are in constant-width text format: anes_des 1.6 Getting help We recommend first trying to resolve errors and issues independently using the tips provided in Chapter 12. There are several community forums for asking questions, including: Posit Community R for Data Science Slack Community Stack Overflow Please report any bugs and issues to the book’s GitHub repository. 
1.7 Acknowledgements We would like to thank Holly Cast, Greg Freedman Ellis, Joe Murphy, and Sheila Saia for their reviews of the initial draft. Their detailed and honest feedback helped improve this book, and we are grateful for their input. Additionally, this book started with two short courses. The first was at the Annual Conference for the American Association for Public Opinion Research (AAPOR) and the second was a series of webinars for the Midwest Association of Public Opinion Research (MAPOR). We would also like to thank those who assisted us by moderating breakout rooms and answering questions from attendees: Greg Freedman Ellis, Raphael Nishimura, and Benjamin Schneider. 1.8 Colophon This book was written in bookdown using RStudio. The complete source is available on GitHub. This version of the book was built with R version 4.3.1 (2023-06-16) and with the packages listed in Table 1.1. TABLE 1.1: Package versions and source used in building this book Package Version Source DiagrammeR 1.0.10 CRAN Matrix 1.6-1 CRAN bookdown 0.34 CRAN broom 1.0.5 CRAN censusapi 0.8.0 GitHub (hrecht/censusapi@15b2b02) dplyr 1.1.4 CRAN forcats 1.0.0 CRAN ggpattern 1.0.1 CRAN ggplot2 3.4.2 CRAN gt 0.9.0 CRAN gtsummary 1.7.1 CRAN haven 2.5.2 CRAN janitor 2.2.0 CRAN kableExtra 1.3.4 CRAN knitr 1.43 CRAN labelled 2.12.0 CRAN lubridate 1.9.2 CRAN naniar 1.0.0 CRAN osfr 0.2.9 CRAN prettyunits 1.2.0 CRAN purrr 1.0.2 CRAN readr 2.1.4 CRAN renv 1.0.0 CRAN rmarkdown 2.23 CRAN rnaturalearth 0.3.3 CRAN rnaturalearthdata 0.1.0 CRAN sf 1.0-14 CRAN srvyr 1.2.0 GitHub (gergness/srvyr@1917f75) 
srvyrexploR 1.0.0 GitHub (tidy-survey-r/srvyrexploR@e03f36c) stringr 1.5.1 CRAN survey 4.2-1 CRAN survival 3.5-7 CRAN tibble 3.2.1 CRAN tidycensus 1.6.2 CRAN tidyr 1.3.0 CRAN tidyselect 1.2.0 CRAN tidyverse 2.0.0 CRAN References DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ’dplyr’-Like Syntax for Summary Statistics of Survey Data. Gard, Arianna M., Luke W. Hyde, Steven G. Heeringa, Brady T. West, and Colter Mitchell. 2023. “Why Weight? Analytic Approaches for Large-Scale Population Neuroscience Data.” Dev Cogn Neurosci. https://doi.org/10.1016/j.dcn.2023.101196. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. U.S. Energy Information Administration. 2023b. “2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. https://r4ds.hadley.nz/; O’Reilly Media. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. 
"],["c02-overview-surveys.html", "Chapter 2 Overview of Surveys 2.1 Introduction 2.2 Searching for public-use survey data 2.3 Pre-survey planning 2.4 Study design 2.5 Data collection 2.6 Post-survey processing 2.7 Post-survey data analysis and reporting", " Chapter 2 Overview of Surveys 2.1 Introduction Developing surveys to gather accurate information about populations involves an intricate and time-intensive process. Researchers can spend months, or even years, developing the study design, questions, and other methods for a single survey to ensure high-quality data is collected. Before analyzing survey data, we recommend understanding the entire survey life cycle. This understanding can provide better insight into what types of analyses should be conducted on the data. The survey life cycle consists of the necessary stages to execute a survey project successfully. Each stage influences the survey’s timing, costs, and feasibility, consequently impacting the data collected and how we should analyze it. Figure 2.1 shows a high-level overview of the survey process. FIGURE 2.1: Overview of the survey process The survey life cycle starts with a research topic or question of interest (e.g., what impact does childhood trauma have on health outcomes later in life.) Drawing from available resources can result in a reduced burden on respondents, cheaper research costs, and faster research outcomes. Therefore, we recommend reviewing existing data sources to determine if data that can address this question are already available. However, if existing data cannot answer the nuances of the research question, we can capture the exact data we need through a questionnaire, or a set of questions. To gain a deeper understanding of survey design and implementation, we recommend reviewing several pieces of existing literature in detail (e.g., Biemer and Lyberg 2003; Bradburn, Sudman, and Wansink 2004; Dillman, Smyth, and Christian 2014; Groves et al. 
2009; Tourangeau, Rips, and Rasinski 2000; Valliant, Dever, and Kreuter 2013). 2.2 Searching for public-use survey data Throughout this book, we use public-use datasets from different surveys, including the American National Election Studies (ANES), the Residential Energy Consumption Survey (RECS), the National Crime Victimization Survey (NCVS), and the AmericasBarometer surveys. As mentioned above, we should look for existing data that can provide insights into our research questions before embarking on a new survey. One of the greatest sources of data is the government. For example, in the U.S., we can get data directly from the various statistical agencies such as the U.S. Energy Information Administration or Bureau of Justice Statistics. Other countries often have data available through official statistics offices, such as the Office for National Statistics in the United Kingdom. In addition to government data, many researchers make their data publicly available through repositories such as the Inter-university Consortium for Political and Social Research (ICPSR) or the Odum Institute Data Archive. Searching these repositories or other compiled lists (e.g., Analyze Survey Data for Free) can be an efficient way to identify surveys with questions related to our research topic. 2.3 Pre-survey planning There are multiple things to consider when starting a survey. Errors are the differences between the true values of the variables being studied and the values obtained through the survey. Each step and decision made before the launch of the survey impacts the types of errors that are introduced into the data, which in turn impact how to interpret the results. Generally, survey researchers consider there to be seven main sources of error that fall under either Representation or Measurement (Groves et al. 2009): Representation Coverage Error: A mismatch between the population of interest and the sampling frame, the list from which the sample is drawn. 
Sampling Error: Error produced when selecting a sample, the subset of the population, from the sampling frame. This error is due to randomization, and we discuss how to quantify this error in Chapter 10. There is no sampling error in a census as there is no randomization. The sampling error measures the difference between all potential samples under the same sampling method. Nonresponse Error: Differences between those who responded and did not respond to the survey (unit nonresponse) or a given question (item nonresponse). Adjustment Error: Error introduced during post-survey statistical adjustments. Measurement Validity: A mismatch between the research topic and the question(s) used to collect that information. Measurement Error: A mismatch between what the researcher asked and how the respondent answered. Processing Error: Edits by the researcher to responses provided by the respondent (e.g., adjustments to data based on illogical responses). Almost every survey has errors. Researchers attempt to conduct a survey that reduces the total survey error, or the accumulation of all errors that may arise throughout the survey life cycle. By assessing these different types of errors together, researchers can seek strategies to maximize the overall survey quality and improve the reliability and validity of results (Biemer 2010). However, attempts to reduce individual sources of error (and therefore total survey error) come at the price of time and money. For example: Coverage Error Tradeoff: Researchers can search for or create more accurate and updated sampling frames, but they can be difficult to construct or obtain. Sampling Error Tradeoff: Researchers can increase the sample size to reduce sampling error; however, larger samples can be expensive and time-consuming to field. Nonresponse Error Tradeoff: Researchers can increase or diversify efforts to improve survey participation, but this may be resource-intensive while not entirely removing nonresponse bias. 
Adjustment Error Tradeoff: Weighting is a statistical technique used to adjust the contribution of individual survey responses to the final survey estimates. It is typically done to make the sample more representative of the population of interest. However, if researchers do not carefully execute the adjustments or base them on inaccurate information, they can introduce new biases, leading to less accurate estimates. Validity Error Tradeoff: Researchers can increase validity through a variety of ways, such as using established scales or collaborating with a psychometrician during survey design to pilot and evaluate questions. However, doing so lengthens the amount of time and resources needed to complete survey design. Measurement Error Tradeoff: Researchers can use techniques such as questionnaire testing and cognitive interviewing to ensure respondents are answering questions as expected. However, these activities require time and resources to complete. Processing Error Tradeoff: Researchers can impose rigorous data cleaning and validation processes. However, this requires supervision, training, and time. The challenge for survey researchers is to find the optimal tradeoffs among these errors. They must carefully consider ways to reduce each error source and total survey error while balancing their study’s objectives and resources. For survey analysts, understanding the decisions that researchers took to minimize these error sources can impact how results are interpreted. The remainder of this chapter explores critical considerations for survey development. We explore how to consider each of these sources of error and how these error sources can inform the interpretations of the data. 2.4 Study design From formulating methodologies to choosing an appropriate sampling frame, the study design phase is where the blueprint for a successful survey takes shape. 
Study design encompasses multiple parts of the survey life cycle, including decisions on the population of interest, survey mode (the format through which a survey is administered to respondents), timeline, and questionnaire design. Knowing whom to survey and how to survey them depends on the study’s goals and the feasibility of implementation. This section explores the strategic planning that lays the foundation for a survey. 2.4.1 Sampling design The set or group we want to survey is known as the population of interest or the target population. The population of interest could be broad, such as “all adults age 18+ living in the U.S.” or a specific population based on a particular characteristic or location. For example, we may want to know about “adults aged 18-24 who live in North Carolina” or “eligible voters living in Illinois.” However, a sampling frame with contact information is needed to survey individuals in these populations of interest. If we are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If we are looking at broader populations of interest, like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, we may choose to use a sampling frame of mailing addresses and send the survey to households, or we may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working.) These imperfect sampling frames can result in coverage error where there is a mismatch between the population of interest and the list of individuals we can select. For example, if we are looking to obtain estimates for “all adults aged 18+ living in the U.S.”, a sampling frame of mailing addresses will miss specific types of individuals, such as the homeless, transient populations, and incarcerated individuals. 
Additionally, many households have more than one adult resident, so we would need to consider how to get a specific individual to fill out the survey (called within household selection) or adjust the population of interest to report on “U.S. households” instead of “individuals.” Once we have selected the sampling frame, the next step is determining how to select individuals for the survey. In rare cases, we may conduct a census and survey everyone on the sampling frame. However, the ability to implement a questionnaire at that scale is something only a few can do (e.g., government censuses.) Instead, we typically choose to sample individuals and use weights to estimate numbers in the population of interest. We can use a variety of different sampling methods, and more information on these can be found in Chapter 10. This decision of which sampling method to use impacts sampling error and can be accounted for in weighting. Example: Number of pets in a household Let’s use a simple example where we are interested in the average number of pets in a household. We need to consider the population of interest for this study. Specifically, are we interested in all households in a given country or households in a more local area (e.g., city or state)? Let’s assume we are interested in the number of pets in a U.S. household with at least one adult (18 years old or older.) In this case, a sampling frame of mailing addresses would introduce only a small amount of coverage error as the frame would closely match our population of interest. Specifically, we would likely want to use the Computerized Delivery Sequence File (CDSF), which is a file of mailing addresses that the United States Postal Service (USPS) creates and covers nearly 100% of U.S. households (Harter et al. 2016). 
To sample these households, for simplicity, we use a stratified simple random sample design (see Chapter 10 for more information on sample designs), where we randomly sample households within each state (i.e., we stratify by state.) Throughout this chapter, we build on this example research question to plan a survey. 2.4.2 Data collection planning With the sampling design decided, researchers can then decide how to survey these individuals. Specifically, the modes used for contacting and surveying the sample, how frequently to send reminders and follow-ups, and the overall timeline of the study are four of the major data collection determinations. Traditionally, survey researchers have considered there to be four main modes1: Computer Assisted Personal Interview (CAPI; also known as face-to-face or in-person interviewing) Computer Assisted Telephone Interview (CATI; also known as phone or telephone interviewing) Computer Assisted Web Interview (CAWI; also known as web or online interviewing) Paper and Pencil Interview (PAPI) We can use a single mode to collect data or multiple modes (also called mixed-modes.) Using mixed-modes can allow for broader reach and increase response rates depending on the population of interest (Biemer et al. 2017; DeLeeuw 2005, 2018). For example, we could both call households to conduct a CATI survey and send mail with a PAPI survey to the household. By using both modes, we could gain participation through the mail from individuals who do not pick up the phone to unknown numbers or through the phone from individuals who do not open all of their mail. However, mode effects (where responses differ based on the mode of response) can be present in the data and may need to be considered during analysis. When selecting which mode, or modes, to use, understanding the unique aspects of the chosen population of interest and sampling frame provides insight into how they can best be reached and engaged. 
For example, if we plan to survey adults aged 18-24 who live in North Carolina, asking them to complete a survey using CATI (i.e., over the phone) would likely not be as successful as other modes like the web. This age group does not talk on the phone as much as other generations and often does not answer their phones for unknown numbers. Additionally, the mode for contacting respondents relies on what information is available in the sampling frame. For example, if our sampling frame includes an email address, we could email our selected sample members to convince them to complete a survey. Alternatively, if the sampling frame is a list of mailing addresses, we could contact sample members with a letter. It is important to note that there can be a difference between the contact and survey modes. For example, if we have a sampling frame with addresses, we can send a letter to our sample members and provide information on completing a web survey. Another option is using mixed-mode surveys by mailing sample members a paper and pencil survey but also including instructions to complete the survey online. Combining different contact modes and different survey modes can be helpful in reducing unit nonresponse error–where the entire unit (e.g., a household) does not respond to the survey at all–as different sample members may respond better to different contact and survey modes. However, when considering which modes to use, it is important to make access to the survey as easy as possible for sample members to reduce burden and unit nonresponse. Another way to reduce unit nonresponse error is by varying the language of the contact materials (Dillman, Smyth, and Christian 2014). People are motivated by different things, so constantly repeating the same message may not be helpful. Instead, mixing up the messaging and the type of contact material the sample member receives can increase response rates and reduce the unit nonresponse error. 
For example, instead of only sending standard letters, we could consider sending mailings that invoke “urgent” or “important” thoughts by sending priority letters or using other delivery services like FedEx, UPS, or DHL. A study timeline may also determine the number and types of contacts. If the timeline is long, there is plenty of time for follow-ups and diversified messages in contact materials. If the timeline is short, then fewer follow-ups can be implemented. Many studies start with the tailored design method put forth by Dillman, Smyth, and Christian (2014) and implement five contacts: (1) Prenotification (Prenotice) letting sample members know the survey is coming; (2) Invitation to complete the survey; (3) Reminder that also thanks the respondents who may have already completed the survey; (4) Reminder (with a replacement paper survey if needed); (5) Final reminder. This method is easily adaptable based on the study timeline and needs but provides a starting point for most studies. Example: Number of pets in a household Let’s return to our example of the average number of pets in a household. We are using a sampling frame of mailing addresses, so we recommend starting our data collection with letters mailed to households, but later in data collection, we want to send interviewers to the house to conduct an in-person (or CAPI) interview to decrease unit nonresponse error. This means we have two contact modes (paper and in-person.) As mentioned above, the survey mode does not have to be the same as the contact mode, so we recommend a mixed-mode study with both Web and CAPI modes. 
Let’s assume we have six months for data collection, so we could recommend the following protocol: Protocol Example for 6-month Web and CAPI Data Collection (columns: Week | Contact Mode | Contact Message | Survey Mode Offered): 1 | Mail: Letter | Prenotice | —; 2 | Mail: Letter | Invitation | Web; 3 | Mail: Postcard | Thank You/Reminder | Web; 6 | Mail: Letter in large envelope | Animal Welfare Discussion | Web; 10 | Mail: Postcard | Inform Upcoming In-Person Visit | Web; 14 | In-Person Visit | — | CAPI; 16 | Mail: Letter | Reminder of In-Person Visit | Web, but includes a number to call to schedule CAPI; 20 | In-Person Visit | — | CAPI; 25 | Mail: Letter in large envelope | Survey Closing Notice | Web, but includes a number to call to schedule CAPI. This is just one possible protocol that we can use that starts respondents with the web (typically done to reduce costs.) However, we could begin in-person data collection earlier during the data collection period or ask our interviewers to attempt more than two visits with a household. 2.4.3 Questionnaire design When developing the questionnaire, it can be helpful to first outline the topics to be asked and include the “why” each question or topic is important to the research question(s). This can help us better tailor the questionnaire and reduce the number of questions (and thus the burden on the respondent) if topics are deemed irrelevant to the research question. When making these decisions, we should also consider questions needed for weighting. While we would love to have everyone in our population of interest answer our survey, this rarely happens. Thus, including questions about demographics in the survey can assist with weighting for nonresponse errors (both unit and item nonresponse.) Knowing the details of the sampling plan and what may impact coverage error and sampling error can help us determine what types of demographics to include. Thus, questionnaire design is typically done in conjunction with sampling design. 
We can benefit from the work of others by using questions from other surveys. Demographic sections in surveys, such as race, ethnicity, or education, often are borrowed questions from a government census or other official surveys. Question banks such as the Inter-university Consortium for Political and Social Research (ICPSR) variable search can provide additional potential questions. If a question does not exist in a question bank, we can craft our own. When developing survey questions, we should start with the research topic and attempt to write questions that match the concept. The closer the question asked is to the overall concept, the better the validity. For example, if we want to know how people consume T.V. series and movies but only ask a question about how many T.V.s are in the house, then we would be missing other ways that people watch T.V. series and movies, such as on other devices or at places outside of the home. As mentioned above, we can employ techniques to increase the validity of our questionnaires. For example, questionnaire testing involves piloting the survey instrument to identify and fix potential issues before conducting the main survey. Additionally, we could conduct cognitive interviews – a technique where we walk through the survey with participants, encouraging them to speak their thoughts out loud to uncover how they interpret and understand survey questions. Additionally, when designing questions, we should consider the mode for the survey and adjust the language appropriately. In self-administered surveys (e.g., web or mail), respondents can see all the questions and response options, but that is not the case in interviewer-administered surveys (e.g., CATI or CAPI.) With interviewer-administered surveys, the response options must be read aloud to the respondents, so the question may need to be adjusted to create a better flow to the interview. 
Additionally, with self-administered surveys, because the respondents are viewing the questionnaire, the formatting of the questions is even more critical to ensure accurate measurement. Incorrect formatting or wording can result in measurement error, so following best practices or using existing validated questions can reduce error. There are multiple resources to help researchers draft questions for different modes (e.g., Bradburn, Sudman, and Wansink 2004; Dillman, Smyth, and Christian 2014; Fowler and Mangione 1989; Tourangeau, Couper, and Conrad 2004). Example: Number of pets in a household As part of our survey on the average number of pets in a household, we may want to know what animal most people prefer to have as a pet. Let’s say we have a question in our survey displayed in Figure 2.2. FIGURE 2.2: Example Question Asking Pet Preference Type This question may have validity issues as it only provides the options of “dogs” and “cats” to respondents, and the interpretation of the data could be incorrect. For example, if we had 100 respondents who answered the question and 50 selected dogs, then the results of this question cannot be “50% of the population prefers to have a dog as a pet,” as only two response options were provided. If a respondent taking our survey prefers turtles, they could either be forced to choose a response between these two (i.e., interpret the question as “between dogs and cats, which do you prefer?” and result in measurement error), or they may not answer the question (which results in item nonresponse error.) Based on this, the interpretation of this question should be, “When given a choice between dogs and cats, 50% of respondents preferred to have a dog as a pet.” To avoid this issue, we should consider these possibilities and adjust the question accordingly. One simple way could be to add an “other” response option to give respondents a chance to provide a different response. 
The “other” response option could then include a way for respondents to write their other preference. For example, we could rewrite this question as displayed in Figure 2.3. FIGURE 2.3: Example Question Asking Pet Preference Type with Other Specify Option We can then code the responses from the open-ended box and get a better understanding of the respondent’s choice of preferred pet. Interpreting this question becomes easier as researchers no longer need to qualify the results with the choices provided. This is a simple example of how the presentation of the question and options can impact the findings. For more complex topics and questions, we must thoroughly consider how to mitigate any impacts from the presentation, formatting, wording, and other aspects. For survey analysts, reviewing not only the data but also the wording of the questions is crucial to ensure the results are presented in a manner consistent with the question asked. Chapter 3 provides further details on how to review existing survey documentation to inform our analyses and Chapter 8 goes into more details on communicating results. 2.5 Data collection Once the data collection starts, we try to stick to the data collection protocol designed during pre-survey planning. However, effective researchers also prepare to adjust their plans and adapt as needed to the current progress of data collection (Schouten, Peytchev, and Wagner 2018). Some extreme examples could be natural disasters that could prevent mailings or interviewers from getting to the sample members. This could force an in-person survey to pivot quickly to a self-administered survey, or the field period could be delayed. Other events could be smaller in scale, such as something newsworthy occurring that is connected to the survey, which we could choose to play up in communication materials. 
In addition to these external factors, there could be factors unique to the survey, such as lower response rates for a specific sub-group, so the data collection protocol may need to find ways to improve response rates for that specific group. 2.6 Post-survey processing After data collection, various activities need to be completed before we can analyze the survey. Multiple decisions made during this post-survey phase can assist us in reducing different error sources, such as weighting to account for the sample selection. Knowing the decisions made in creating the final analytic data can impact how we use the data and interpret the results. 2.6.1 Data cleaning and imputation Post-survey cleaning is one of the first steps to get the survey responses into an analytic dataset. Data cleaning can consist of correcting inconsistent data (e.g., skip pattern errors or answers to multiple questions throughout the survey that contradict each other), editing numeric entries or open-ended responses for grammar and consistency, or recoding open-ended questions into categories for analysis. There is no universal set of fixed rules that every survey must adhere to. Instead, each survey or research study should establish its own guidelines and procedures for handling various cleaning scenarios based on its specific objectives. We should use our best judgment to ensure data integrity, and all decisions should be documented and available to those using the data in the analysis. Each decision we make impacts processing error, so often, multiple people review these rules or recode open-ended data and adjudicate any differences in an attempt to reduce this error. Another crucial step in post-survey processing is imputation. Often, there is item nonresponse where respondents do not answer specific questions. If the questions are crucial to analysis efforts or the research question, we may implement imputation to reduce item nonresponse error. 
Imputation is a technique for replacing missing or incomplete data values with estimated values. However, as imputation is a way of assigning values to missing data based on an algorithm or model, it can also introduce processing error, so we should consider the overall implications of imputing data compared to having item nonresponse. There are multiple ways to impute data. We recommend reviewing other resources like Kim and Shao (2021) for more information. Example: Number of pets in a household Let’s return to the question we created to ask about animal preference. The “other specify” invites respondents to specify the type of animal they prefer to have as a pet. If respondents entered answers such as “puppy,” “turtle,” “rabit,” “rabbit,” “bunny,” “ant farm,” “snake,” “Mr. Purr,” then we may wish to categorize these write-in responses to help with analysis. In this example, “puppy” could be assumed to be a reference to a “Dog”, and could be recoded there. The misspelling of “rabit” could be coded along with “rabbit” and “bunny” into a single category of “Bunny or Rabbit”. These are relatively standard decisions that we can make. The remaining write-in responses could be categorized in a few different ways. “Mr. Purr,” which may be someone’s reference to their own cat, could be recoded as “Cat”, or it could remain as “Other” or some category that is “Unknown”. Depending on the number of responses related to each of the others, they could all be combined into a single “Other” category, or maybe categories such as “Reptiles” or “Insects” could be created. Each of these decisions may impact the interpretation of the data, so we should document the types of responses that fall into each of the new categories and any decisions made. 2.6.2 Weighting We can address some error sources identified in the previous sections using weighting. During the weighting process, weights are created for each respondent record. 
These weights allow the survey responses to generalize to the population. A weight, generally, reflects how many units in the population each respondent represents. Often, the weight is constructed such that the sum of the weights is the size of the population. Weights can address coverage, sampling, and nonresponse errors. Many published surveys include an “analysis weight” variable that combines these adjustments. However, weighting itself can also introduce adjustment error, so we need to balance which types of errors should be corrected with weighting. The construction of weights is outside the scope of this book, so we recommend referencing other materials if interested in weight construction (Valliant and Dever 2018). Instead, this book assumes the survey has been completed, weights are constructed, and data are available to users. Example: Number of pets in a household In the simple example of our survey, we decided to obtain a random sample from each state to select our sample members. Knowing this sampling design, we can include selection weights for analysis that account for how the sample members were selected for the survey. Additionally, the sampling frame may have the type of building associated with each address, so we could include the building type as a potential nonresponse weighting variable, along with some interviewer observations that may be related to our research topic of the average number of pets in a household. Combining these weights, we can create an analytic weight that analysts need to use when analyzing the data. 2.6.3 Disclosure Before data are released publicly, we need to ensure that individual respondents cannot be identified by the data when confidentiality is required. There are a variety of different methods that can be used. 
Here we describe a few of the most commonly used: Data swapping: We may swap specific data values across different respondents so that it does not impact insights from the data but ensures that specific individuals cannot be identified. Top/bottom coding: We may choose top or bottom coding to mask extreme values. For example, we may top-code income values such that households with income greater than $500,000 are coded as “$500,000 or more,” with other incomes being presented as integers between $0 and $499,999. This can impact analyses at the tails of the distribution. Coarsening: We may use coarsening to mask unique values. For example, a survey question may ask for a precise income, but the public data may include income as a categorical variable. Another example commonly used in survey practice is to coarsen geographic variables. Data collectors likely know the precise address of sample members, but the public data may only include the state or even region of respondents. Perturbation: We may add random noise to outcomes. As with swapping, this is done so that it does not impact insights from the data but ensures that specific individuals cannot be identified. There is as much art as there is science to the methods used for disclosure. Only high-level comments about the disclosure are provided in the survey documentation, not specific details. This ensures nobody can reverse the disclosure and thus identify individuals. For more information on different disclosure methods, please see Skinner (2009) and the AAPOR Standards. 2.6.4 Documentation Documentation is a critical step of the survey life cycle. We should systematically record all the details, decisions, procedures, and methodologies to ensure transparency, reproducibility, and the overall quality of survey research. Proper documentation allows analysts to understand, reproduce, and evaluate the study’s methods and findings. Chapter 3 dives into how analysts should use survey data documentation. 
2.7 Post-survey data analysis and reporting After completing the survey life cycle, the data are ready for analysts to use. Chapter 4 continues from this point. For more information on the survey life cycle, please explore the references cited throughout this chapter. References Biemer, Paul P. 2010. “Total Survey Error: Design, Implementation, and Evaluation.” Public Opinion Quarterly 74 (5): 817–48. https://doi.org/10.1093/poq/nfq058. Biemer, Paul P., and Lars E. Lyberg. 2003. Introduction to Survey Quality. John Wiley &amp; Sons. Biemer, Paul P., Joe Murphy, Stephanie Zimmer, Chip Berry, Grace Deng, and Katie Lewis. 2017. “Using Bonus Monetary Incentives to Encourage Web Response in Mixed-Mode Household Surveys.” Journal of Survey Statistics and Methodology 6 (2): 240–61. https://doi.org/10.1093/jssam/smx015. Bradburn, Norman M., Seymour Sudman, and Brian Wansink. 2004. Asking Questions: The Definitive Guide to Questionnaire Design. 2nd Edition. Jossey-Bass. DeLeeuw, Edith D. 2005. “To Mix or Not to Mix Data Collection Modes in Surveys.” Journal of Official Statistics 21: 233–55. ———. 2018. “Mixed-Mode: Past, Present, and Future.” Survey Research Methods 12 (2): 75–89. https://doi.org/10.18148/srm/2018.v12i2.7402. Dillman, Don A, Jolene D Smyth, and Leah Melani Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. John Wiley &amp; Sons. Fowler, Floyd J, and Thomas W. Mangione. 1989. Standardized Survey Interviewing. SAGE. Groves, Robert M, Floyd J Fowler Jr, Mick P Couper, James M Lepkowski, Eleanor Singer, and Roger Tourangeau. 2009. Survey Methodology. John Wiley &amp; Sons. Harter, Rachel, Michael P Battaglia, Trent D Buskirk, Don A Dillman, Ned English, Mansour Fahimi, Martin R Frankel, et al. 2016. “Address-Based Sampling.” Task force report. American Association for Public Opinion Research; https://aapor.org/wp-content/uploads/2022/11/AAPOR_Report_1_7_16_CLEAN-COPY-FINAL-2.pdf. Kim, Jae Kwang, and Jun Shao. 2021. 
Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Schouten, Barry, Andy Peytchev, and James Wagner. 2018. Adaptive Survey Design. Chapman &amp; Hall/CRC Press. Skinner, Chris. 2009. “Chapter 15: Statistical Disclosure Control for Survey Data.” In Handbook of Statistics: Sample Surveys: Design, Methods and Applications, edited by C. R. Rao, 381–96. Elsevier B.V. Tourangeau, Roger, Mick P. Couper, and Frederick Conrad. 2004. “Spacing, Position, and Order: Interpretive Heuristics for Visual Features of Survey Questions.” Public Opinion Quarterly 68: 368–93. Tourangeau, Roger, Lance J. Rips, and Kenneth Rasinski. 2000. Psychology of Survey Response. Cambridge University Press. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Other modes such as using mobile apps or text messaging can also be considered, but at the time of publication, have smaller reach or are better for longitudinal studies (i.e., surveying the same individuals over many time periods of a single study.)↩︎ "],["c03-survey-data-documentation.html", "Chapter 3 Survey data documentation 3.1 Introduction 3.2 Types of survey documentation 3.3 Missing data coding 3.4 Example: American National Election Studies (ANES) 2020 survey documentation", " Chapter 3 Survey data documentation 3.1 Introduction Survey documentation helps us prepare before we look at the actual survey data. The documentation includes technical guides, questionnaires, codebooks, errata, and other useful resources. By taking the time to review these materials, we can gain a comprehensive understanding of the survey data (including research and design decisions discussed in Chapters 2 and 10) and conduct our analysis more effectively. Survey documentation can vary in organization, type, and ease of use. 
The information may be stored in any format - PDFs, Excel spreadsheets, Word documents, and so on. Some surveys bundle documentation together, such as providing the codebook and questionnaire in a single document. Others keep them in separate files. Despite these variations, we can gain a general understanding of the documentation types and what aspects to focus on in each. 3.2 Types of survey documentation 3.2.1 Technical documentation The technical documentation, also known as user guides or methodology/analysis guides, highlights the variables necessary to specify the survey design. We recommend concentrating on these key sections: Introduction: The introduction orients us to the survey. This section provides the project’s background, the study’s purpose, and the main research questions. Study design: The study design section describes how researchers prepared and administered the survey. Sample: The sample section describes the sample frame, any known sampling errors, and the sample’s limitations. This section can contain recommendations on how to use sampling weights. Look for weight information, whether the survey design contains strata, clusters/PSUs, or replicate weights. Also, look for population sizes, finite population correction, or replicate weight scaling information. Additional detail on sample designs is available in Chapter 10. Notes on fielding: Any additional notes on fielding, such as response rates, may be found in the technical documentation. The technical documentation may include other helpful resources. For example, some technical documentation includes syntax for SAS, SUDAAN, Stata, and/or R, so we do not have to create this code from scratch. 3.2.2 Questionnaires A questionnaire is a series of questions used to collect information from people in a survey. It can ask about opinions, behaviors, demographics, or even just numbers like the count of lightbulbs, square footage, or farm size. 
Questionnaires can employ different types of questions, such as closed-ended (e.g., select one or check all that apply), open-ended (e.g., numeric or text), Likert scales (e.g., a 5- or 7-point scale specifying a respondent’s level of agreement to a statement), or ranking questions (e.g., a list of options that a respondent ranks by preference.) It may randomize the display order of responses or include instructions that help respondents understand the questions. A survey may have one questionnaire or multiple, depending on its scale and scope. The questionnaire is another important resource for understanding and interpreting the survey data (see Section 2.4.3), and we should use it alongside any analysis. It provides details about each of the questions asked in the survey, such as question name, question wording, response options, skip logic, randomizations, display specifications, mode differences, and the universe (the subset of respondents who were asked a question.) In Figure 3.1, we show an example from the ANES 2020 questionnaire (American National Election Studies 2021). The figure shows the question name (POSTVOTE_RVOTE), description (Did R Vote?), full wording of the question and responses, response order, universe, question logic (this question was only asked if vote_pre = 0), and other specifications. The section also includes the variable name, which we can link to the codebook. FIGURE 3.1: ANES 2020 Questionnaire Example The content and structure of questionnaires vary depending on the specific survey. For instance, question names may be informative (like the ANES example above), sequential, or denoted by a code. In some cases, surveys may not use separate names for questions and variables. Figure 3.2 shows an example from the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire that shows a sequential question number and a coded variable name (as opposed to a question name) (Centers for Disease Control and Prevention (CDC) 2021). 
FIGURE 3.2: BRFSS 2021 Questionnaire Example We should factor in the details of a survey when conducting our analyses. For example, surveys that use various modes (e.g., web and mail) may have differences in question wording or skip logic, as web surveys can include fills or automate skip logic. If large enough, these variations could warrant separate analyses for each mode. 3.2.3 Codebooks While a questionnaire provides information about the questions posed to respondents, the codebook explains how the survey data were coded and recorded. It lists details such as variable names, variable labels, variable meanings, codes for missing data, value labels, and value types (whether categorical, continuous, etc.) The codebook helps us understand and use the variables appropriately in our analysis. In particular, the codebook (as opposed to the questionnaire) often includes information on missing data. Note that the term data dictionary is sometimes used interchangeably with codebook, but a data dictionary may include more details on the structure and elements of the data. Figure 3.3 is a question from the ANES 2020 codebook (American National Election Studies 2022). This section indicates a variable’s name (V202066), question wording, value labels, universe, and associated survey question (POSTVOTE_RVOTE.) FIGURE 3.3: ANES 2020 Codebook Example Reviewing the questionnaires and codebooks in parallel can clarify how to interpret the variables (Figures 3.1 and 3.3), as questions and variables do not always correspond directly to each other in a one-to-one mapping. A single question may have multiple associated variables, or a single variable may summarize multiple questions. 3.2.4 Errata An erratum (singular) or errata (plural) is a document that lists errors found in a publication or dataset. The purpose of an erratum is to correct or update inaccuracies in the original document. 
Examples of errata include issuing a corrected data table after finding a typo or mistake in a table cell, and reporting incorrectly programmed skips in an electronic survey that caused respondents to skip questions they should have been asked. For example, an erratum issued for the 2004 ANES dataset notified analysts to remove a specific row from the data file because it contained a respondent who should not have been part of the sample. Adhering to an issued erratum helps us increase the accuracy and reliability of analysis. 3.2.5 Additional resources Survey documentation may include additional material, such as interviewer instructions or “show cards” provided to respondents during interviewer-administered surveys to help respondents answer questions. Explore the survey website to find out what resources were used and in what contexts. 3.3 Missing data coding Some observations in a dataset may have missing data. This can be due to design or nonresponse, and these concepts are detailed in Chapter 11. In that chapter, we also discuss how to analyze data with missing values. This chapter walks through how to understand documentation related to missing data. The survey documentation, often the codebook, represents the missing data with a code. The codebook may list different codes depending on why certain data points are missing. In the example of variable V202066 from the ANES (Figure 3.3), -9 represents “Refused,” -7 means that the response was deleted due to an incomplete interview, -6 means that there is no response because there was no follow-up interview, and -1 means “Inapplicable” (due to a designed skip pattern). As another example, there may be a summary variable that describes the missingness of a set of variables, particularly with “select all that apply” or “multiple response” questions. 
In the National Crime Victimization Survey (NCVS), respondents who are victims of a crime and saw the offender are asked if the offender had a weapon and then asked what the type of weapon was. This part of the questionnaire from 2021 is shown in Figure 3.4. FIGURE 3.4: Excerpt from the NCVS 2020-2021 Crime Incident Report - Weapon Type For each multiple response question, the NCVS codebook includes a “lead-in” variable that summarizes the individual options. For question 23a on the weapon type, the lead-in variable is V4050, which is shown in Figure 3.5. This variable is then followed by a set of variables for each weapon type. An example of one of the individual variables from the codebook, the handgun, is shown in Figure 3.6. We will dive into how to analyze this variable in Chapter 11. FIGURE 3.5: Excerpt from the NCVS 2021 Codebook for V4050 - LI WHAT WAS WEAPON FIGURE 3.6: Excerpt from the NCVS 2021 Codebook for V4051 - C WEAPON: HAND GUN When data are read into R, some values may be system missing; that is, they are coded as NA even if that is not evident in a codebook. We discuss in Chapter 11 how to analyze data with NA values and review how R handles missing data in calculations. 3.4 Example: American National Election Studies (ANES) 2020 survey documentation Let’s look at the survey documentation for the American National Election Studies (ANES) 2020, which is available from the ANES website. Navigating to “User Guide and Codebook” (American National Election Studies 2022), we can download the PDF that contains the survey documentation, titled “ANES 2020 Time Series Study Full Release: User Guide and Codebook”. Do not be daunted by the 796-page PDF. Below, we focus on the most critical information. Introduction The first section in the User Guide explains that the ANES 2020 Time Series Study continues a series of election surveys conducted since 1948. These surveys contain data on public opinion and voting behavior in the U.S. presidential elections. 
The introduction also includes information about the modes used for data collection (web, live video interviewing, or CATI.) Additionally, there is a summary of the number of pre-election interviews (8,280) and post-election re-interviews (7,449.) Sample design and respondent recruitment The section “Sample Design and Respondent Recruitment” provides more detail about the survey’s sequential mixed-mode design. All three modes were conducted one after another and not at the same time. Additionally, it indicates that for the 2020 survey, they resampled all respondents who participated in the 2016 ANES, along with a newly drawn cross-section: The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia. The document continues with more details on the sample groups. Data analysis, weights, and variance estimation The section “Data Analysis, Weights, and Variance Estimation” includes information on weights and strata/cluster variables. Reading through, we can find the full sample weight variables: For analysis of the complete set of cases using pre-election data only, including all cases and representative of the 2020 electorate, use the full sample pre-election weight, V200010a. For analysis including post-election data for the complete set of participants (i.e., analysis of post-election data only or a combination of pre- and post-election data), use the full sample post-election weight, V200010b. Additional weights are provided for analysis of subsets of the data… The document provides more information about the design variables, summarized in Table 3.1. 
TABLE 3.1: Weight and variance information for ANES. For weight V200010a, use variance unit/PSU/cluster V200010c and variance stratum V200010d. For weight V200010b, use variance unit/PSU/cluster V200010c and variance stratum V200010d. Methodology The user guide mentions a supplemental document called “How to Analyze ANES Survey Data” (DeBell 2010) as a ‘how-to guide’ for analyzing the data. This document explains that the weights sum to the sample size, not the population size. If our goal is to calculate estimates for the entire U.S. population instead of just the sample, we must adjust the weights to the U.S. population. To create accurate weights for the population, we need to determine the total population size at the time of the survey. Let’s review the “Sample Design and Respondent Recruitment” section for more details: The target population for the fresh cross-section was the 231 million non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or the District of Columbia. The documentation suggests that the population should be around 231 million, but this is an imprecise count. Upon further investigation of the available resources, we can find the methodology file titled “Methodology Report for the ANES 2020 Time Series Study” (DeBell et al. 2022). This file states that we can use the population total from the Current Population Survey (CPS), a monthly survey sponsored by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics. The CPS provides a more accurate population estimate for a specific month. Therefore, we can use the CPS to get the total population number for March 2020, when the ANES was conducted. Chapter 4 goes into detailed instructions on how to calculate and adjust this value in the data. References American National Election Studies. 2021. “ANES 2020 Time Series Study: Pre-Election and Post-Election Survey Questionnaires.” https://electionstudies.org/wp-content/uploads/2021/07/anes_timeseries_2020_questionnaire_20210719.pdf. ———. 
2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. Centers for Disease Control and Prevention (CDC). 2021. “Behavioral Risk Factor Surveillance System Survey Questionnaire.” U.S. Department of Health and Human Services, Centers for Disease Control and Prevention. https://www.cdc.gov/brfss/questionnaires/pdf-ques/2021-BRFSS-Questionnaire-1-19-2022-508.pdf. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan. https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. DeBell, Matthew, Michelle Amsbary, Ted Brader, Shelley Brock, Cindy Good, Justin Kamens, Natalya Maisel, and Sarah Pinto. 2022. “Methodology Report for the ANES 2020 Time Series Study.” https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf. "],["c04-getting-started.html", "Chapter 4 Getting started 4.1 Introduction 4.2 Setup 4.3 Survey analysis process 4.4 Similarities between {dplyr} and {srvyr} functions", " Chapter 4 Getting started 4.1 Introduction This chapter provides an overview of the packages, data, and design objects we use frequently throughout this book. As mentioned in Chapter 2, understanding how a survey was conducted helps us make sense of the results and interpret findings. Therefore, we provide background on the datasets used in examples and exercises. Next, we walk through how to create the survey design objects necessary to begin an analysis. Finally, we provide an overview of the {srvyr} package and the steps needed for analysis. Please report any bugs or issues encountered while going through the book to the book’s GitHub repository. 4.2 Setup This section provides details on the required packages and data, as well as the steps for preparing survey design objects. 
For a streamlined learning experience, we recommend taking the time to walk through the code provided here and making sure everything is properly set up. 4.2.1 Packages We use several packages throughout the book, but let’s install and load specific ones for this chapter. Many functions in the examples and exercises are from three packages: {tidyverse}, {survey}, and {srvyr}. If they are not already installed, use the code below. The {tidyverse} and {survey} packages can both be installed from the Comprehensive R Archive Network (CRAN) (Lumley 2010; Wickham et al. 2019). We use the GitHub development version of {srvyr} because of its additional functionality compared to the one on CRAN (Freedman Ellis and Schneider 2023). Install the package directly from GitHub using the {remotes} package: install.packages(c(&quot;tidyverse&quot;, &quot;survey&quot;, &quot;remotes&quot;)) remotes::install_github(&quot;gergness/srvyr&quot;) We bundled the datasets used in the book in an R package, {srvyrexploR}. Install it directly from GitHub using the {remotes} package: remotes::install_github(&quot;tidy-survey-r/srvyrexploR&quot;) After installing these packages, load them using the library() function: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) The packages {broom}, {gt}, and {gtsummary} help us display output and create formatted tables (Robinson, Hayes, and Couch 2023). Install them with the provided code2: install.packages(c(&quot;gt&quot;, &quot;gtsummary&quot;)) After installing these packages, load them using the library() function: library(broom) library(gt) library(gtsummary) Install and load the {censusapi} package to access the Current Population Survey (CPS), which we use to ensure accurate weighting of a key dataset in the book (Recht 2024). 
Run the code below to install {censusapi}: install.packages(&quot;censusapi&quot;) After installing this package, load it using the library() function: library(censusapi) Note that the {censusapi} package requires a Census API key, available for free from the U.S. Census Bureau website (refer to the package documentation for more information). We recommend storing the Census API key in the R environment instead of directly in the code. To do this, run Sys.setenv() after obtaining the API key. Sys.setenv(CENSUS_KEY=&quot;YOUR_API_KEY_HERE&quot;) Note that Sys.setenv() sets the variable only for the current R session. To keep the key available in later sessions, add the line CENSUS_KEY=YOUR_API_KEY_HERE to the .Renviron file and then restart the R session. Once the Census API key is stored, we can retrieve it in our R code with Sys.getenv(\"CENSUS_KEY\"). A few other packages are used less frequently in the book. We list them in the Prerequisite boxes at the beginning of each chapter. As we work through the book, make sure to check the Prerequisite box and install any missing packages before proceeding. 4.2.2 Data The {srvyrexploR} package contains the datasets used in the book. Once installed and loaded, explore the documentation using the help() function. Read the descriptions of the datasets to understand what they contain: help(package = &quot;srvyrexploR&quot;) This book uses two main datasets: the American National Election Studies (ANES – DeBell 2010) and the Residential Energy Consumption Survey (RECS – U.S. Energy Information Administration 2023b), which are included as anes_2020 and recs_2020 in the {srvyrexploR} package, respectively. American National Election Studies (ANES) Data ANES is a study that collects data from election surveys dating back to 1948. These surveys contain information on public opinion and voting behavior in U.S. presidential elections and some midterm elections3. They cover topics such as party affiliation, voting choice, and level of trust in the government. The 2020 survey (data used in this book) was fielded online, through live video interviews, or via computer-assisted telephone interviews (CATI). 
When working with new survey data, we should review the survey documentation (see Chapter 3) to understand the data collection methods. The original ANES data contain variables starting with V20 (DeBell 2010), so to assist with our analysis throughout the book, we created descriptive variable names. For example, the respondent’s age is now in a variable called Age, and gender is in a variable called Gender. These descriptive variables are included in the {srvyrexploR} package, and Table 4.1 displays the list of these renamed variables. A complete overview of all variables can be found in Appendix B. TABLE 4.1: List of created variables in the ANES Data. Variable Name: CaseID, InterviewMode, Weight, VarUnit, Stratum, CampaignInterest, EarlyVote2020, VotedPres2016, VotedPres2016_selection, PartyID, TrustGovernment, TrustPeople, Age, AgeGroup, Education, RaceEth, Gender, Income, Income7, VotedPres2020, VotedPres2020_selection. Before beginning an analysis, it is useful to view the data to understand the available variables. The dplyr::glimpse() function produces a list of all variables, their types (e.g., factor, double), and a few example values. 
Below, we remove variables containing a “V” followed by numbers with select(-matches(\"^V\\\\d\")) before using glimpse() to get a quick overview of the data with descriptive variable names: anes_2020 %&gt;% select(-matches(&quot;^V\\\\d&quot;)) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 21 ## $ CaseID &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053… ## $ InterviewMode &lt;fct&gt; Web, Web, Web, Web, Web, Web, Web, Web… ## $ Weight &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658… ## $ VarUnit &lt;fct&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2,… ## $ Stratum &lt;fct&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, … ## $ CampaignInterest &lt;fct&gt; Somewhat interested, Not much interest… ## $ EarlyVote2020 &lt;fct&gt; NA, NA, NA, NA, NA, NA, NA, NA, Yes, N… ## $ VotedPres2016 &lt;fct&gt; Yes, Yes, Yes, Yes, Yes, No, Yes, No, … ## $ VotedPres2016_selection &lt;fct&gt; Trump, Other, Clinton, Clinton, Trump,… ## $ PartyID &lt;fct&gt; Strong republican, Independent, Indepe… ## $ TrustGovernment &lt;fct&gt; Never, Never, Some of the time, About … ## $ TrustPeople &lt;fct&gt; About half the time, Some of the time,… ## $ Age &lt;dbl&gt; 46, 37, 40, 41, 72, 71, 37, 45, 70, 43… ## $ AgeGroup &lt;fct&gt; 40-49, 30-39, 40-49, 40-49, 70 or olde… ## $ Education &lt;fct&gt; Bachelor&#39;s, Post HS, High school, Post… ## $ RaceEth &lt;fct&gt; &quot;Hispanic&quot;, &quot;Asian, NH/PI&quot;, &quot;White&quot;, &quot;… ## $ Gender &lt;fct&gt; Male, Female, Female, Male, Male, Fema… ## $ Income &lt;fct&gt; &quot;$175,000-249,999&quot;, &quot;$70,000-74,999&quot;, … ## $ Income7 &lt;fct&gt; $125k or more, $60k to &lt; 80k, $100k to… ## $ VotedPres2020 &lt;fct&gt; NA, Yes, Yes, Yes, Yes, Yes, Yes, NA, … ## $ VotedPres2020_selection &lt;fct&gt; NA, Other, Biden, Biden, Trump, Biden,… From the output, we can see there are 7,453 rows and 21 variables in the ANES data. 
This output also indicates that most of the variables are factors (e.g., InterviewMode), while a few variables are in double (numeric) format (e.g., Age). Residential Energy Consumption Survey (RECS) Data RECS is a study that measures energy consumption and expenditure in American households. Funded by the Energy Information Administration, RECS data are collected through interviews with household members and energy suppliers. These interviews take place in person, over the phone, via mail, and on the web, with modes changing over time. The survey has been fielded 14 times between 1950 and 2020. It includes questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, energy bills, respondent demographics, and energy assistance. We should read the survey documentation (see Chapter 3) to understand how the data were collected and implemented. Table 4.2 displays the list of variables in the RECS data (not including the weights, which start with NWEIGHT and are described in more detail in Chapter 10). An overview of all variables can be found in Appendix C. 
#zgtugjzmca .gt_super { font-size: 
65%; } #zgtugjzmca .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #zgtugjzmca .gt_asterisk { font-size: 100%; vertical-align: 0; } #zgtugjzmca .gt_indent_1 { text-indent: 5px; } #zgtugjzmca .gt_indent_2 { text-indent: 10px; } #zgtugjzmca .gt_indent_3 { text-indent: 15px; } #zgtugjzmca .gt_indent_4 { text-indent: 20px; } #zgtugjzmca .gt_indent_5 { text-indent: 25px; } TABLE 4.2: List of Variables in the RECS Data Variable Name DOEID ClimateRegion_BA Urbanicity Region REGIONC Division STATE_FIPS state_postal state_name HDD65 CDD65 HDD30YR CDD30YR HousingUnitType YearMade TOTSQFT_EN TOTHSQFT TOTCSQFT ZTOTSQFT_EN ZYearMade ZHousingUnitType SpaceHeatingUsed ZSpaceHeatingUsed ACUsed ZACUsed ZACBehavior HeatingBehavior WinterTempDay WinterTempAway WinterTempNight ACBehavior SummerTempDay SummerTempAway SummerTempNight ZHeatingBehavior ZWinterTempAway ZSummerTempAway ZWinterTempDay ZSummerTempDay ZWinterTempNight ZSummerTempNight BTUEL DOLLAREL ZBTUEL BTUNG DOLLARNG ZBTUNG BTULP DOLLARLP ZBTULP BTUFO DOLLARFO ZBTUFO BTUWOOD ZBTUWOOD TOTALBTU TOTALDOL Before starting an analysis, we recommend viewing the data to understand the types of data and variables that are included. The dplyr::glimpse() function produces a list of all variables, the type of the variable (e.g., function, double), and a few example values. 
Below, we remove the weight variables with select(-matches(\"^NWEIGHT\")) before using glimpse() to get a quick overview of the data: recs_2020 %&gt;% select(-matches(&quot;^NWEIGHT&quot;)) %&gt;% glimpse() ## Rows: 18,496 ## Columns: 57 ## $ DOEID &lt;dbl&gt; 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e+05, 1e… ## $ ClimateRegion_BA &lt;fct&gt; Mixed-Dry, Mixed-Humid, Mixed-Dry, Mixed-Hum… ## $ Urbanicity &lt;fct&gt; Urban Area, Urban Area, Urban Area, Urban Ar… ## $ Region &lt;fct&gt; West, South, West, South, Northeast, South, … ## $ REGIONC &lt;chr&gt; &quot;WEST&quot;, &quot;SOUTH&quot;, &quot;WEST&quot;, &quot;SOUTH&quot;, &quot;NORTHEAST… ## $ Division &lt;fct&gt; Mountain South, West South Central, Mountain… ## $ STATE_FIPS &lt;chr&gt; &quot;35&quot;, &quot;05&quot;, &quot;35&quot;, &quot;45&quot;, &quot;34&quot;, &quot;48&quot;, &quot;40&quot;, &quot;2… ## $ state_postal &lt;fct&gt; NM, AR, NM, SC, NJ, TX, OK, MS, DC, AZ, CA, … ## $ state_name &lt;fct&gt; New Mexico, Arkansas, New Mexico, South Caro… ## $ HDD65 &lt;dbl&gt; 3844, 3766, 3819, 2614, 4219, 901, 3148, 182… ## $ CDD65 &lt;dbl&gt; 1679, 1458, 1696, 1718, 1363, 3558, 2128, 23… ## $ HDD30YR &lt;dbl&gt; 4451, 4429, 4500, 3229, 4896, 1150, 3564, 26… ## $ CDD30YR &lt;dbl&gt; 1027, 1305, 1010, 1653, 1059, 3588, 2043, 21… ## $ HousingUnitType &lt;fct&gt; Single-family detached, Apartment: 5 or more… ## $ YearMade &lt;ord&gt; 1970-1979, 1980-1989, 1960-1969, 1980-1989, … ## $ TOTSQFT_EN &lt;dbl&gt; 2100, 590, 900, 2100, 800, 4520, 2100, 900, … ## $ TOTHSQFT &lt;dbl&gt; 2100, 590, 900, 2100, 800, 3010, 1200, 900, … ## $ TOTCSQFT &lt;dbl&gt; 2100, 590, 900, 2100, 800, 3010, 1200, 0, 50… ## $ ZTOTSQFT_EN &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZYearMade &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZHousingUnitType &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ SpaceHeatingUsed &lt;lgl&gt; TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR… ## $ ZSpaceHeatingUsed 
&lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ACUsed &lt;lgl&gt; TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FA… ## $ ZACUsed &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZACBehavior &lt;fct&gt; Not imputed, Imputed, Not imputed, Not imput… ## $ HeatingBehavior &lt;fct&gt; Set one temp and leave it, Turn on or off as… ## $ WinterTempDay &lt;dbl&gt; 70, 70, 69, 68, 68, 76, 74, 70, 68, 70, 72, … ## $ WinterTempAway &lt;dbl&gt; 70, 65, 68, 68, 68, 76, 65, 70, 60, 70, 70, … ## $ WinterTempNight &lt;dbl&gt; 68, 65, 67, 68, 68, 68, 74, 68, 62, 68, 72, … ## $ ACBehavior &lt;fct&gt; Set one temp and leave it, Turn on or off as… ## $ SummerTempDay &lt;dbl&gt; 71, 68, 70, 72, 72, 69, 68, NA, 72, 74, 77, … ## $ SummerTempAway &lt;dbl&gt; 71, 68, 68, 72, 72, 74, 70, NA, 76, 74, 77, … ## $ SummerTempNight &lt;dbl&gt; 71, 68, 68, 72, 72, 68, 70, NA, 68, 72, 77, … ## $ ZHeatingBehavior &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempAway &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempAway &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempDay &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempDay &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZWinterTempNight &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ ZSummerTempNight &lt;fct&gt; Not imputed, Not imputed, Not imputed, Not i… ## $ BTUEL &lt;dbl&gt; 42723, 17889, 8147, 31647, 20027, 48968, 494… ## $ DOLLAREL &lt;dbl&gt; 1955.06, 713.27, 334.51, 1424.86, 1087.00, 1… ## $ ZBTUEL &lt;fct&gt; Not imputed, Not imputed, Imputed amount and… ## $ BTUNG &lt;dbl&gt; 101924.4, 10145.3, 22603.1, 55118.7, 39099.5… ## $ DOLLARNG &lt;dbl&gt; 701.83, 261.73, 188.14, 636.91, 376.04, 439.… ## $ ZBTUNG &lt;fct&gt; Not imputed, Not imputed, Imputed, Not imput… ## $ BTULP &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 17… ## $ DOLLARLP &lt;dbl&gt; 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 
0.0, 0.0, 0.0,… ## $ ZBTULP &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ BTUFO &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 68… ## $ DOLLARFO &lt;dbl&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18… ## $ ZBTUFO &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ BTUWOOD &lt;dbl&gt; 0, 0, 0, 0, 0, 3000, 0, 0, 0, 0, 0, 0, 0, 0,… ## $ ZBTUWOOD &lt;fct&gt; Not applicable, Not applicable, Not applicab… ## $ TOTALBTU &lt;dbl&gt; 144648, 28035, 30750, 86765, 59127, 85401, 1… ## $ TOTALDOL &lt;dbl&gt; 2656.9, 975.0, 522.6, 2061.8, 1463.0, 2335.1… From the output, we can see that the RECS data has 18,496 rows and 57 non-weight variables. This output also indicates that most of the variables are in double (numeric) format (e.g., TOTSQFT_EN), with some factor (e.g., Region), Boolean (e.g., ACUsed), character (e.g., REGIONC), and ordinal (e.g., YearMade) variables. 4.2.3 Design objects The design object is the backbone for survey analysis. It is where we specify the sampling design, weights, and other necessary information to ensure we account for errors in the data. Before creating the design object, we should carefully review the survey documentation to understand how to create the design object for accurate analysis. In this section, we provide details on how to code the design object for the ANES and RECS data used in the book. However, we only provide a high-level overview to get readers started. For a deeper understanding of creating design objects for a variety of sampling designs, see Chapter 10. While we recommend conducting exploratory data analysis on the original data before diving into complex survey analysis (see Chapter 12), the actual survey analysis and inference should be performed with the survey design objects instead of the original survey data. For example, the ANES data is called anes_2020. If we create a survey design object called anes_des, our survey analyses should begin with anes_des and not anes_2020. 
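As a rough illustration of why analyses should start from the design object, the sketch below compares an unweighted mean with a design-based estimate. It borrows the apistrat data from the {survey} package rather than ANES, and the object names naive_mean and design_est are ours, chosen only for this example.

```r
library(dplyr)
library(survey)
library(srvyr)

# apistrat: a stratified sample of California schools shipped with {survey}
data(api, package = "survey")

# Naive estimate: treats every sampled school as if it had equal weight
naive_mean <- mean(apistrat$api00)

# Design-based estimate: strata and weights enter through the design object
apistrat_des <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

design_est <- apistrat_des %>%
  summarize(api00_mean = survey_mean(api00))

naive_mean
design_est
```

The two means generally differ because the design-based estimate weights each school by pw, and only the design-based result carries a standard error (api00_mean_se).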
Using the survey design object ensures that our calculations appropriately account for the details of the survey design. American National Election Studies (ANES) Design Object The ANES documentation (DeBell 2010) details the sampling and weighting implications for analyzing the survey data. From this documentation and as noted in Chapter 3, the 2020 ANES data are weighted to the sample, not the population. To make generalizations about the population, we need to weight the data against the full population count. The ANES methodology recommends using the Current Population Survey (CPS) to determine the number of non-institutional U.S. citizens aged 18 or older living in the 50 U.S. states or D.C. in March 2020. We can use the {censusapi} package to obtain the information needed for the survey design object. The getCensus() function allows us to retrieve the CPS data for March (cps/basic/mar) in 2020 (vintage = 2020). Additionally, we extract several variables from the CPS: the month (HRMONTH) and year (HRYEAR4) of the interview, to confirm the correct time period; the age (PRTAGE) of the respondent, to narrow the population to 18 and older (eligible age to vote); the citizenship status (PRCITSHP) of the respondent, to narrow the population to only those eligible to vote; and the final person-level weight (PWSSWGT). Detailed information for these variables can be found in the CPS data dictionary. cps_state_in &lt;- getCensus(name = &quot;cps/basic/mar&quot;, vintage = 2020, region = &quot;state&quot;, vars = c(&quot;HRMONTH&quot;, &quot;HRYEAR4&quot;, &quot;PRTAGE&quot;, &quot;PRCITSHP&quot;, &quot;PWSSWGT&quot;), key = Sys.getenv(&quot;CENSUS_KEY&quot;)) cps_state &lt;- cps_state_in %&gt;% as_tibble() %&gt;% mutate(across(.cols = everything(), .fns = as.numeric)) In the code above, we include region = \"state\". The default region type for the CPS data is at the state level. While not required, including the region can be helpful for understanding the geographical context of the data. 
In getCensus(), we requested the March file (cps/basic/mar) for 2020 (vintage = 2020). Therefore, we expect that all interviews within our output were conducted during that particular month and year (HRMONTH == 3 and HRYEAR4 == 2020). We can confirm that the data are from March 2020 by running the code below: cps_state %&gt;% distinct(HRMONTH, HRYEAR4) ## # A tibble: 1 × 2 ## HRMONTH HRYEAR4 ## &lt;dbl&gt; &lt;dbl&gt; ## 1 3 2020 We can narrow down the dataset using the age and citizenship variables to include only individuals who are 18 years or older (PRTAGE &gt;= 18) and have U.S. citizenship (PRCITSHP %in% c(1:4)): cps_narrow_resp &lt;- cps_state %&gt;% filter(PRTAGE &gt;= 18, PRCITSHP %in% c(1:4)) To calculate the U.S. population from the filtered data, we sum the person weights (PWSSWGT): targetpop &lt;- cps_narrow_resp %&gt;% pull(PWSSWGT) %&gt;% sum() scales::comma(targetpop) ## [1] &quot;231,034,125&quot; The population of interest in 2020 is 231,034,125. This result gives us what we need to create the survey design object for estimating population statistics. Using the anes_2020 data, we adjust the weighting variable (V200010b) using the population of interest we just calculated (targetpop). We determine the proportion of the total weight for each individual weight (V200010b / sum(V200010b)) and then multiply that proportion by the calculated population of interest. anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = V200010b / sum(V200010b) * targetpop) Once we have the adjusted weights, we can refer to the rest of the documentation to create the survey design. The documentation indicates that the study uses a stratified cluster sampling design. Therefore, we need to specify variables for strata and ids (cluster) and fill in the nest argument. The documentation provides guidance on which strata and cluster variables to use depending on whether we are analyzing pre- or post-election data. 
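The weight adjustment described above can be sketched with toy data. Everything in this example is hypothetical: the tibble toy, the column wt_sample, and the total toy_targetpop stand in for anes_2020, V200010b, and targetpop, respectively, so the numbers carry no substantive meaning.

```r
library(dplyr)

# Hypothetical toy data: five respondents with sample-scaled weights
toy <- tibble(id = 1:5, wt_sample = c(0.5, 1.0, 1.5, 0.8, 1.2))

# Hypothetical population total standing in for targetpop
toy_targetpop <- 1000

toy_adj <- toy %>%
  mutate(
    wt_share = wt_sample / sum(wt_sample), # each respondent's share of the total weight
    Weight = wt_share * toy_targetpop      # share scaled up to the population total
  )

# The adjusted weights sum to the population total
sum(toy_adj$Weight)
```

The rescaling changes only the overall level of the weights. Ratios between any two respondents' weights stay the same, which is why we can apply it before building the design object.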
In this book, we analyze post-election data, so we need to use the post-election weight V200010b, strata variable V200010d, and PSU/cluster variable V200010c. Additionally, we set nest=TRUE to ensure the clusters are nested within the strata. anes_des &lt;- anes_adjwgt %&gt;% as_survey_design(weights = Weight, strata = V200010d, ids = V200010c, nest = TRUE) anes_des ## Stratified 1 - level Cluster Sampling design (with replacement) ## With (101) clusters. ## Called via srvyr ## Sampling variables: ## - ids: V200010c ## - strata: V200010d ## - weights: Weight ## Data variables: ## - V200001 (dbl), CaseID (dbl), V200002 (dbl+lbl), InterviewMode ## (fct), V200010b (dbl), Weight (dbl), V200010c (dbl), VarUnit (fct), ## V200010d (dbl), Stratum (fct), V201006 (dbl+lbl), CampaignInterest ## (fct), V201023 (dbl+lbl), EarlyVote2020 (fct), V201024 (dbl+lbl), ## V201025x (dbl+lbl), V201028 (dbl+lbl), V201029 (dbl+lbl), V201101 ## (dbl+lbl), V201102 (dbl+lbl), VotedPres2016 (fct), V201103 ## (dbl+lbl), VotedPres2016_selection (fct), V201228 (dbl+lbl), ## V201229 (dbl+lbl), V201230 (dbl+lbl), V201231x (dbl+lbl), PartyID ## (fct), V201233 (dbl+lbl), TrustGovernment (fct), V201237 (dbl+lbl), ## TrustPeople (fct), V201507x (dbl+lbl), Age (dbl), AgeGroup (fct), ## V201510 (dbl+lbl), Education (fct), V201546 (dbl+lbl), V201547a ## (dbl+lbl), V201547b (dbl+lbl), V201547c (dbl+lbl), V201547d ## (dbl+lbl), V201547e (dbl+lbl), V201547z (dbl+lbl), V201549x ## (dbl+lbl), RaceEth (fct), V201600 (dbl+lbl), Gender (fct), V201607 ## (dbl+lbl), V201610 (dbl+lbl), V201611 (dbl+lbl), V201613 (dbl+lbl), ## V201615 (dbl+lbl), V201616 (dbl+lbl), V201617x (dbl+lbl), Income ## (fct), Income7 (fct), V202051 (dbl+lbl), V202066 (dbl+lbl), V202072 ## (dbl+lbl), VotedPres2020 (fct), V202073 (dbl+lbl), V202109x ## (dbl+lbl), V202110x (dbl+lbl), VotedPres2020_selection (fct) We can examine this new object to learn more about the survey design, such as the fact that the ANES is a “Stratified 1 - level Cluster Sampling 
design (with replacement) With (101) clusters”. Additionally, the output displays the sampling variables and then lists the remaining variables in the dataset. This design object is used throughout this book to conduct survey analysis. Residential Energy Consumption Survey (RECS) Design Object The RECS documentation (U.S. Energy Information Administration 2023b) provides information on the survey’s sampling and weighting implications for analysis. The documentation shows the 2020 RECS uses Jackknife weights, where the main analytic weight is NWEIGHT, and the Jackknife weights are NWEIGHT1-NWEIGHT60. We can specify these in the weights and repweights arguments in the survey design object code, respectively. With Jackknife weights, additional information is required: type, scale, and mse. Chapter 10 goes into depth about each of these arguments, but to quickly get started, the RECS documentation lets us know that type=JK1, scale=59/60, and mse = TRUE. We can use the following code to create the survey design object: recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59 / 60, mse = TRUE ) recs_des ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. 
## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), ClimateRegion_BA (fct), Urbanicity (fct), Region ## (fct), REGIONC (chr), Division (fct), STATE_FIPS (chr), ## state_postal (fct), state_name (fct), HDD65 (dbl), CDD65 (dbl), ## HDD30YR (dbl), CDD30YR (dbl), HousingUnitType (fct), YearMade ## (ord), TOTSQFT_EN (dbl), TOTHSQFT (dbl), TOTCSQFT (dbl), ## ZTOTSQFT_EN (fct), ZYearMade (fct), ZHousingUnitType (fct), ## SpaceHeatingUsed (lgl), ZSpaceHeatingUsed (fct), ACUsed (lgl), ## ZACUsed (fct), ZACBehavior (fct), HeatingBehavior (fct), ## WinterTempDay (dbl), WinterTempAway (dbl), WinterTempNight (dbl), ## ACBehavior (fct), SummerTempDay (dbl), SummerTempAway (dbl), ## SummerTempNight (dbl), ZHeatingBehavior (fct), ZWinterTempAway ## (fct), ZSummerTempAway (fct), ZWinterTempDay (fct), ZSummerTempDay ## (fct), ZWinterTempNight (fct), ZSummerTempNight (fct), NWEIGHT ## (dbl), NWEIGHT1 (dbl), NWEIGHT2 (dbl), NWEIGHT3 (dbl), NWEIGHT4 ## (dbl), NWEIGHT5 (dbl), NWEIGHT6 (dbl), NWEIGHT7 (dbl), NWEIGHT8 ## (dbl), NWEIGHT9 (dbl), NWEIGHT10 (dbl), NWEIGHT11 (dbl), NWEIGHT12 ## (dbl), NWEIGHT13 (dbl), NWEIGHT14 (dbl), NWEIGHT15 (dbl), NWEIGHT16 ## (dbl), NWEIGHT17 (dbl), NWEIGHT18 (dbl), NWEIGHT19 
(dbl), NWEIGHT20 ## (dbl), NWEIGHT21 (dbl), NWEIGHT22 (dbl), NWEIGHT23 (dbl), NWEIGHT24 ## (dbl), NWEIGHT25 (dbl), NWEIGHT26 (dbl), NWEIGHT27 (dbl), NWEIGHT28 ## (dbl), NWEIGHT29 (dbl), NWEIGHT30 (dbl), NWEIGHT31 (dbl), NWEIGHT32 ## (dbl), NWEIGHT33 (dbl), NWEIGHT34 (dbl), NWEIGHT35 (dbl), NWEIGHT36 ## (dbl), NWEIGHT37 (dbl), NWEIGHT38 (dbl), NWEIGHT39 (dbl), NWEIGHT40 ## (dbl), NWEIGHT41 (dbl), NWEIGHT42 (dbl), NWEIGHT43 (dbl), NWEIGHT44 ## (dbl), NWEIGHT45 (dbl), NWEIGHT46 (dbl), NWEIGHT47 (dbl), NWEIGHT48 ## (dbl), NWEIGHT49 (dbl), NWEIGHT50 (dbl), NWEIGHT51 (dbl), NWEIGHT52 ## (dbl), NWEIGHT53 (dbl), NWEIGHT54 (dbl), NWEIGHT55 (dbl), NWEIGHT56 ## (dbl), NWEIGHT57 (dbl), NWEIGHT58 (dbl), NWEIGHT59 (dbl), NWEIGHT60 ## (dbl), BTUEL (dbl), DOLLAREL (dbl), ZBTUEL (fct), BTUNG (dbl), ## DOLLARNG (dbl), ZBTUNG (fct), BTULP (dbl), DOLLARLP (dbl), ZBTULP ## (fct), BTUFO (dbl), DOLLARFO (dbl), ZBTUFO (fct), BTUWOOD (dbl), ## ZBTUWOOD (fct), TOTALBTU (dbl), TOTALDOL (dbl) Viewing this new object provides information about the survey design, such as the fact that RECS is an “Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances”. Additionally, the output shows the sampling variables (NWEIGHT1-NWEIGHT60) and then lists the remaining variables in the dataset. This design object is used throughout this book to conduct survey analysis. 4.3 Survey analysis process There is a general process for analyzing data to create estimates with the {srvyr} package: (1) create a tbl_svy object (a survey object) using as_survey_design() or as_survey_rep(); (2) subset the data (if needed) using filter() (to create subpopulations); (3) specify domains of analysis using group_by(); and (4) within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more. In Section 4.2.3, we follow Step #1 to create the survey design objects for the ANES and RECS data featured in this book. Additional details on how to create design objects can be found in Chapter 10. 
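The four steps above can be sketched end to end as follows. This is a minimal illustration, assuming the apistrat data from the {survey} package in place of ANES or RECS; the subpopulation (schools that are not year-round) and the names api_est and n_schools are our own choices for the example.

```r
library(dplyr)
library(survey)
library(srvyr)

# apistrat: a stratified sample of California schools shipped with {survey}
data(api, package = "survey")

api_est <- apistrat %>%
  # Step 1: create the tbl_svy object from the sampling information
  as_survey_design(strata = stype, weights = pw) %>%
  # Step 2: subset on the design object (assumed subpopulation: not year-round)
  filter(yr.rnd == "No") %>%
  # Step 3: define the domains of analysis
  group_by(stype) %>%
  # Step 4: compute estimates inside summarize()
  summarize(
    api00_mean = survey_mean(api00), # estimated mean and its standard error
    n_schools = survey_total()       # estimated number of schools and its standard error
  )

api_est
```

Each estimate comes back with a matching _se column, one row per school type, because the subsetting and grouping were applied to the design object and not to the raw data frame.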
Then, once we have the design object, we can filter the data to any subpopulation of interest (if needed). It is important to filter the data after creating the design object. This ensures that we are accurately accounting for the survey design in our calculations. Finally, we can use group_by(), summarize(), and other functions from the {survey} and {srvyr} packages to analyze the survey data by estimating means, totals, and so on. 4.4 Similarities between {dplyr} and {srvyr} functions The {dplyr} package from the tidyverse offers flexible and intuitive functions for data wrangling (Wickham et al. 2023). One of the major advantages of using {srvyr} is that it applies {dplyr}-like syntax to the {survey} package (Freedman Ellis and Schneider 2023). We can use pipes, such as %&gt;% from the {magrittr} package, to specify a survey design object, apply a function, and then feed that output into the next function’s first argument (Bache and Wickham 2022). Functions follow the ‘tidy’ convention of snake_case function names. To help explain the similarities between {dplyr} functions and {srvyr} functions, we use the towny dataset from the {gt} package and apistrat data that comes in the {survey} package. The towny dataset provides population data for municipalities in Ontario, Canada on Census years between 1996 and 2021. Taking a look at towny with dplyr::glimpse(), we can see the dataset has 25 columns with a mix of character and numeric data. 
towny %&gt;% glimpse() ## Rows: 414 ## Columns: 25 ## $ name &lt;chr&gt; &quot;Addington Highlands&quot;, &quot;Adelaide Metc… ## $ website &lt;chr&gt; &quot;https://addingtonhighlands.ca&quot;, &quot;htt… ## $ status &lt;chr&gt; &quot;lower-tier&quot;, &quot;lower-tier&quot;, &quot;lower-ti… ## $ csd_type &lt;chr&gt; &quot;township&quot;, &quot;township&quot;, &quot;township&quot;, &quot;… ## $ census_div &lt;chr&gt; &quot;Lennox and Addington&quot;, &quot;Middlesex&quot;, … ## $ latitude &lt;dbl&gt; 45.00, 42.95, 44.13, 45.53, 43.86, 48… ## $ longitude &lt;dbl&gt; -77.25, -81.70, -79.93, -76.90, -79.0… ## $ land_area_km2 &lt;dbl&gt; 1293.99, 331.11, 371.53, 519.59, 66.6… ## $ population_1996 &lt;int&gt; 2429, 3128, 9359, 2837, 64430, 1027, … ## $ population_2001 &lt;int&gt; 2402, 3149, 10082, 2824, 73753, 956, … ## $ population_2006 &lt;int&gt; 2512, 3135, 10695, 2716, 90167, 958, … ## $ population_2011 &lt;int&gt; 2517, 3028, 10603, 2844, 109600, 864,… ## $ population_2016 &lt;int&gt; 2318, 2990, 10975, 2935, 119677, 969,… ## $ population_2021 &lt;int&gt; 2534, 3011, 10989, 2995, 126666, 954,… ## $ density_1996 &lt;dbl&gt; 1.88, 9.45, 25.19, 5.46, 966.84, 8.81… ## $ density_2001 &lt;dbl&gt; 1.86, 9.51, 27.14, 5.44, 1106.74, 8.2… ## $ density_2006 &lt;dbl&gt; 1.94, 9.47, 28.79, 5.23, 1353.05, 8.2… ## $ density_2011 &lt;dbl&gt; 1.95, 9.14, 28.54, 5.47, 1644.66, 7.4… ## $ density_2016 &lt;dbl&gt; 1.79, 9.03, 29.54, 5.65, 1795.87, 8.3… ## $ density_2021 &lt;dbl&gt; 1.96, 9.09, 29.58, 5.76, 1900.75, 8.1… ## $ pop_change_1996_2001_pct &lt;dbl&gt; -0.0111, 0.0067, 0.0773, -0.0046, 0.1… ## $ pop_change_2001_2006_pct &lt;dbl&gt; 0.0458, -0.0044, 0.0608, -0.0382, 0.2… ## $ pop_change_2006_2011_pct &lt;dbl&gt; 0.0020, -0.0341, -0.0086, 0.0471, 0.2… ## $ pop_change_2011_2016_pct &lt;dbl&gt; -0.0791, -0.0125, 0.0351, 0.0320, 0.0… ## $ pop_change_2016_2021_pct &lt;dbl&gt; 0.0932, 0.0070, 0.0013, 0.0204, 0.058… Let’s examine the towny object’s class. 
We verify that it is a tibble, as indicated by \"tbl_df\", by running the code below: class(towny) ## [1] &quot;tbl_df&quot; &quot;tbl&quot; &quot;data.frame&quot; All tibbles are data.frames, but not all data.frames are tibbles. Compared to data.frames, tibbles have some advantages, with the printing behavior being a noticeable advantage. When working with tidyverse style code, we recommend making all your datasets tibbles for ease of analysis. The {survey} package contains datasets related to the California Academic Performance Index, which measures student performance in schools with at least 100 students in California. We can access these datasets by loading the {survey} package and running data(api). Let’s work with the apistrat dataset, which is a stratified random sample, stratified by school type (stype) with three levels: E for elementary school, M for middle school, and H for high school. We first create the survey design object (see Chapter 10 for more information). The sample is stratified by the stype variable, and the sampling weights are found in the pw variable. We can use this information to construct the design object, apistrat_des. data(api) apistrat_des &lt;- apistrat %&gt;% as_survey_design(strata = stype, weights = pw) When we check the class of apistrat_des, it is not a typical data.frame. Applying the as_survey_design() function transforms the data into a tbl_svy, a special class specifically for survey design objects. The {srvyr} package is designed to work with the tbl_svy class of objects. class(apistrat_des) ## [1] &quot;tbl_svy&quot; &quot;survey.design2&quot; &quot;survey.design&quot; Let’s look at how {dplyr} works with regular data frames. The example below calculates the mean and median for the land_area_km2 variable in the towny dataset. towny %&gt;% summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2)) ## # A tibble: 1 × 2 ## area_mean area_median ## &lt;dbl&gt; &lt;dbl&gt; ## 1 373. 273. 
In the code below, we calculate the mean and median of the variable api00 using apistrat_des. Note the similarity in the syntax. However, the standard error of the statistic is also calculated in addition to the statistic itself. apistrat_des %&gt;% summarize(api00_mean = survey_mean(api00), api00_med = survey_median(api00)) ## # A tibble: 1 × 4 ## api00_mean api00_mean_se api00_med api00_med_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 662. 9.54 668 13.7 The functions in {srvyr} also play nicely with other tidyverse functions. For example, if we wanted to select columns with shared characteristics, we can use {tidyselect} functions such as starts_with(), num_range(), etc. (Henry and Wickham 2022). In the examples below, we use a combination of across() and starts_with() to calculate the mean of variables starting with “population” in the towny data frame and those beginning with api in the apistrat_des survey object. towny %&gt;% summarize(across(starts_with(&quot;population&quot;), ~mean(.x, na.rm=TRUE))) ## # A tibble: 1 × 6 ## population_1996 population_2001 population_2006 population_2011 ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 25866. 27538. 29173. 30838. ## # ℹ 2 more variables: population_2016 &lt;dbl&gt;, population_2021 &lt;dbl&gt; apistrat_des %&gt;% summarize(across(starts_with(&quot;api&quot;), survey_mean)) ## # A tibble: 1 × 6 ## api00 api00_se api99 api99_se api.stu api.stu_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 662. 9.54 629. 10.1 498. 16.4 We have the flexibility to use {dplyr} verbs such as mutate(), filter(), and select() on our survey design object. As mentioned in Section 4.3, these steps should be performed on the survey design object. This ensures our survey design is properly considered in all our calculations. 
apistrat_des_mod &lt;- apistrat_des %&gt;% mutate(api_diff = api00 - api99) %&gt;% filter(stype == &quot;E&quot;) %&gt;% select(stype, api99, api00, api_diff, api_students = api.stu) apistrat_des_mod ## Stratified Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - weights: pw ## Data variables: ## - stype (fct), api99 (int), api00 (int), api_diff (int), api_students ## (int) apistrat_des ## Stratified Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - weights: pw ## Data variables: ## - cds (chr), stype (fct), name (chr), sname (chr), snum (dbl), dname ## (chr), dnum (int), cname (chr), cnum (int), flag (int), pcttest ## (int), api00 (int), api99 (int), target (int), growth (int), ## sch.wide (fct), comp.imp (fct), both (fct), awards (fct), meals ## (int), ell (int), yr.rnd (fct), mobility (int), acs.k3 (int), ## acs.46 (int), acs.core (int), pct.resp (int), not.hsg (int), hsg ## (int), some.col (int), col.grad (int), grad.sch (int), avg.ed ## (dbl), full (int), emer (int), enroll (int), api.stu (int), pw ## (dbl), fpc (dbl) Several functions in {srvyr} must be called within srvyr::summarize(), with the exception of srvyr::survey_count() and srvyr::survey_tally(). This is similar to how dplyr::count() and dplyr::tally() are not called within dplyr::summarize(). The summarize() function can be used in conjunction with the group_by() function or by/.by arguments, which applies the functions on a group-by-group basis to create grouped summaries. towny %&gt;% group_by(csd_type) %&gt;% dplyr::summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2)) ## # A tibble: 5 × 3 ## csd_type area_mean area_median ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 city 498. 198. ## 2 municipality 607. 488. ## 3 town 183. 129. ## 4 township 363. 301. 
## 5 village 23.0 3.3 We use a similar setup to summarize data in {srvyr}: apistrat_des %&gt;% group_by(stype) %&gt;% summarize(api00_mean = survey_mean(api00), api00_median = survey_median(api00)) ## # A tibble: 3 × 5 ## stype api00_mean api00_mean_se api00_median api00_median_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 E 674. 12.5 671 20.7 ## 2 H 626. 15.5 635 21.6 ## 3 M 637. 16.6 648 24.1 At this time, the .by argument in srvyr::summarize() does not exist as it does in {dplyr}. An alternative way to do the grouped analysis on the towny data would be: towny %&gt;% dplyr::summarize(area_mean = mean(land_area_km2), area_median = median(land_area_km2), .by=csd_type) ## # A tibble: 5 × 3 ## csd_type area_mean area_median ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 township 363. 301. ## 2 town 183. 129. ## 3 municipality 607. 488. ## 4 city 498. 198. ## 5 village 23.0 3.3 However, the .by syntax is not yet available in {srvyr}: apistrat_des %&gt;% summarize(api00_mean = survey_mean(api00), api00_median = survey_median(api00), .by=stype) ## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3: ## ℹ In argument: `api00_mean = survey_mean(api00)`. ## ℹ In group 1: `stype = E`. ## Caused by error in `[[&lt;-` at gergness-srvyr-1917f75/R/survey_statistics_helpers.R:48:5: ## ! Assigned data `x` must be compatible with existing data. ## ✖ Existing data has 200 rows. ## ✖ Assigned data has 100 rows. ## ℹ Only vectors of size 1 are recycled. ## Caused by error in `vectbl_recycle_rhs_rows()`: ## ! Can&#39;t recycle input of size 100 to size 200. As mentioned above, {srvyr} functions are meant for tbl_svy objects. Attempting to manipulate data on non-tbl_svy objects, like the towny example shown below, results in an error. Running the code lets us know what the issue is: Survey context not set. 
towny %&gt;% summarize(area_mean = survey_mean(land_area_km2)) ## Error in `summarize()`: ## ℹ In argument: `area_mean = survey_mean(land_area_km2)`. ## Caused by error in `cur_svy()` at gergness-srvyr-1917f75/R/survey_statistics.r:114:3: ## ! Survey context not set A few functions in {srvyr} have counterparts in {dplyr}, such as srvyr::summarize() and srvyr::group_by(). Unlike {srvyr}-specific verbs, {srvyr} recognizes these parallel functions if applied to a non-survey object. Instead of causing an error, the package provides the equivalent output from {dplyr}: towny %&gt;% srvyr::summarize(area_mean = mean(land_area_km2)) ## # A tibble: 1 × 1 ## area_mean ## &lt;dbl&gt; ## 1 373. Because this book focuses on survey analysis, most of our pipes stem from a survey object. When we load the {dplyr} and {srvyr} packages, the functions automatically figure out the class of data and use the appropriate one from {dplyr} or {srvyr}. Therefore, we do not need to include the namespace for each function (e.g., srvyr::summarize()). References Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ’dplyr’-Like Syntax for Summary Statistics of Survey Data. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Recht, Hannah. 2024. censusapi: Retrieve Data from the Census APIs. Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles. ———. 2023b. 
“2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. Note: {broom} is already included in the tidyverse, so no separate installation is required.↩︎ In the United States, presidential elections are held in years divisible by four. In other even years, there are elections at the federal level for congress which are referred to as midterm elections as they occur at the middle of the term of a president.↩︎ "],["c05-descriptive-analysis.html", "Chapter 5 Descriptive analyses 5.1 Introduction 5.2 Counts and cross-tabulations 5.3 Totals and sums 5.4 Means and proportions 5.5 Quantiles and medians 5.6 Ratios 5.7 Correlations 5.8 Standard deviation and variance 5.9 Additional topics 5.10 Exercises", " Chapter 5 Descriptive analyses Prerequisites For this chapter, load the following packages: library(tidyverse) library(srvyr) library(srvyrexploR) library(broom) We are using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information.) 
targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapters 4 and 10. recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 5.1 Introduction Descriptive analyses, such as basic counts, cross-tabulations, or means, are among the first steps in making sense of our survey results. By reviewing the findings, we can glean insight into the data, the underlying population, and any unique aspects of the data or population. For example, if only 10% of the survey respondents are male, it could indicate a unique population, a potential error or bias, an intentional survey sampling method, or other factors. Additionally, descriptive analyses allow us to provide summaries like means, proportions, or other measures to make estimates about the population. These analyses lay the groundwork for the next steps of running statistical tests or developing models. We discuss many different types of descriptive analyses in this chapter. However, it is important to know what type of data we are working with and which statistics are appropriate. 
In survey data, we typically consider data as one of four main types: Categorical/nominal data: variables with levels or descriptions that cannot be ordered, such as the region of the country (North, South, East, and West) Ordinal data: variables that can be ordered, such as those from a Likert scale (strongly disagree, disagree, agree, and strongly agree) Discrete data: variables that are counted or measured, such as number of children Continuous data: variables that are measured and whose values can lie anywhere on an interval, such as income This chapter discusses how to analyze measures of distribution (e.g., cross-tabulations), central tendency (e.g., means), relationship (e.g., ratios), and dispersion (e.g., standard deviation) using functions from the {srvyr} package (Freedman Ellis and Schneider 2023). Measures of distribution describe how often an event or response occurs. These measures include counts and totals. We cover the following functions: Count of observations (survey_count() and survey_tally()) Summation of variables (survey_total()) Measures of central tendency find the central (or average) responses. These measures include means and medians. We cover the following functions: Means and proportions (survey_mean() and survey_prop()) Quantiles and medians (survey_quantile() and survey_median()) Measures of relationship describe how variables relate to each other. These measures include correlations and ratios. We cover the following functions: Correlations (survey_corr()) Ratios (survey_ratio()) Measures of dispersion describe how data spread around the central tendency for continuous variables. These measures include standard deviations and variances. We cover the following functions: Variances and standard deviations (survey_var() and survey_sd()) To incorporate each of these survey functions, recall the general process for survey estimation from Chapter 4: Create a tbl_svy object using srvyr::as_survey_design() or srvyr::as_survey_rep(). 
Subset the data for subpopulations using srvyr::filter(), if needed. Specify domains of analysis using srvyr::group_by(), if needed. Analyze the data with survey-specific functions. This chapter walks through how to apply the survey functions in Step 4. Note that unless otherwise specified, our estimates are weighted as a result of setting up the survey design object. To look at the data by different subgroups, we can choose to filter and/or group the data. It is very important that we filter and group the data only after creating the design object. This ensures that the results accurately reflect the survey design. If we filter or group data before creating the survey design object, the data for those cases are not included in the survey design information and estimations of the variance, leading to inaccurate results. For the sake of simplicity, we’ve removed cases with missing values in the examples below. For a more detailed explanation of how to handle missing data, please refer to Chapter 11. 5.2 Counts and cross-tabulations Using survey_count() and survey_tally(), we can calculate the estimated population counts for a given variable or combination of variables. These summaries, often referred to as cross-tabulations or cross-tabs, are applied to categorical data. They help in estimating counts of the population size for different groups based on the survey data. 5.2.1 Syntax The syntax for survey_count() is similar to the dplyr::count() syntax, as mentioned in Chapter 4. However, as noted above, this function can only be called on tbl_svy objects. 
Let’s explore the syntax: survey_count( x, ..., wt = NULL, sort = FALSE, name = &quot;n&quot;, .drop = dplyr::group_by_drop_default(x), vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;) ) The arguments are: x: a tbl_svy object created by as_survey ...: variables to group by, passed to group_by wt: a variable to weight on in addition to the survey weights, defaults to NULL sort: how to sort the variables, defaults to FALSE name: the name of the count variable, defaults to n .drop: whether to drop empty groups vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) To generate a count or cross-tabs by different variables, we include them in the (...) argument. This argument can take any number of variables and breaks down the counts by all combinations of the provided variables. This is similar to dplyr::count(). To obtain an estimate of the overall population, we can exclude any variables from the (...) argument or use the survey_tally() function. While the survey_tally() function has a similar syntax to the survey_count() function, it does not include the (...) or the .drop arguments: survey_tally( x, wt, sort = FALSE, name = &quot;n&quot;, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;) ) Both functions include the vartype argument with four different values: se: standard error The estimated standard deviation of the estimate Output has a column with the variable name specified in the name argument with a suffix of “_se” ci: confidence interval The lower and upper limits of a confidence interval Output has two columns with the variable name specified in the name argument with a suffix of “_low” and “_upp” By default, this is a 95% confidence interval but can be changed by using the argument level and specifying a number between 0 and 1. 
For example, level=0.8 would produce an 80% confidence interval. var: variance The estimated variance of the estimate Output has a column with the variable name specified in the name argument with a suffix of “_var” cv: coefficient of variation A ratio of the standard error and the estimate Output has a column with the variable name specified in the name argument with a suffix of “_cv” The confidence intervals are always calculated using a symmetric t-distribution based method, given by the formula: \\[ \\text{estimate} \\pm t^*_{df}\\times SE\\] where \\(t^*_{df}\\) is the critical value from a t-distribution based on the confidence level and the degrees of freedom. By default, the degrees of freedom are based on the design or number of replicates, but they can be specified using the df argument. For survey design objects, the degrees of freedom are calculated as the number of primary sampling units (PSUs or clusters) minus the number of strata (see Chapter 10 for more information on PSUs, strata, and sample designs). For replicate-based objects, the degrees of freedom are calculated as one less than the rank of the matrix of replicate weights, where the number of replicates is typically the rank. Note that specifying df = Inf is equivalent to using a normal (z-based) confidence interval – this is the default in {survey}. These variability types are the same for most of the survey functions, and we provide examples using different variability types throughout this chapter. 5.2.2 Examples Example 1: Estimated population count If we want to obtain the estimated number of households in the U.S. (the population of interest) using the Residential Energy Consumption Survey (RECS) data, we can use survey_count(). If we do not specify any variables in the survey_count() function, it outputs the estimated population count (n) and its corresponding standard error (n_se). recs_des %&gt;% survey_count() ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 
0.148 Based on this calculation, the estimated number of households in the U.S. is 123,529,025. Alternatively, we could also use the survey_tally() function. The example below yields the same results as survey_count(). recs_des %&gt;% survey_tally() ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 0.148 Example 2: Estimated counts by subgroups (cross-tabs) To calculate the estimated number of observations for specific subgroups, such as Region and Division, we can include the variables of interest in the survey_count() function. In the example below, we calculate the estimated number of housing units by region and division. The argument name = in survey_count() allows us to change the name of the count variable in the output from the default n to N. recs_des %&gt;% survey_count(Region, Division, name = &quot;N&quot;) ## # A tibble: 10 × 4 ## Region Division N N_se ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast New England 5876166 0.0000000137 ## 2 Northeast Middle Atlantic 16043503 0.0000000487 ## 3 Midwest East North Central 18546912 0.000000437 ## 4 Midwest West North Central 8495815 0.0000000177 ## 5 South South Atlantic 24843261 0.0000000418 ## 6 South East South Central 7380717. 0.114 ## 7 South West South Central 14619094 0.000488 ## 8 West Mountain North 4615844 0.119 ## 9 West Mountain South 4602070 0.0000000492 ## 10 West Pacific 18505643. 0.00000295 When we run the cross-tab, we see there are an estimated 5,876,166 housing units in the New England Division. The code results in an error if we try to use the survey_count() syntax with survey_tally(): recs_des %&gt;% survey_tally(Region, Division, name = &quot;N&quot;) ## Error in `dplyr::summarise()` at gergness-srvyr-1917f75/R/summarise.r:10:3: ## ℹ In argument: `N = survey_total(Region, vartype = vartype, ## na.rm = TRUE)`. ## Caused by error: ## ! Factor not allowed in survey functions, should be used as a grouping variable. 
Use a group_by() function prior to using survey_tally() to successfully run the cross-tab: recs_des %&gt;% group_by(Region, Division) %&gt;% survey_tally(name = &quot;N&quot;) ## # A tibble: 10 × 4 ## # Groups: Region [4] ## Region Division N N_se ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast New England 5876166 0.0000000137 ## 2 Northeast Middle Atlantic 16043503 0.0000000487 ## 3 Midwest East North Central 18546912 0.000000437 ## 4 Midwest West North Central 8495815 0.0000000177 ## 5 South South Atlantic 24843261 0.0000000418 ## 6 South East South Central 7380717. 0.114 ## 7 South West South Central 14619094 0.000488 ## 8 West Mountain North 4615844 0.119 ## 9 West Mountain South 4602070 0.0000000492 ## 10 West Pacific 18505643. 0.00000295 5.3 Totals and sums The survey_total() function is analogous to sum. It can be applied to continuous variables to obtain the estimated total quantity in a population. Starting from this point in the chapter, all the introduced functions must be called within summarize(). 
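As a rough sketch of what the point estimates behind survey_count() and survey_total() represent, the base R snippet below computes the sum of the weights and a weighted sum on a small made-up data frame. The data frame toy and its columns wt and x are hypothetical and are not part of RECS. The sketch covers only the point estimates and leaves out variance estimation, which is what the survey design object adds.

```r
# Hypothetical toy data: three sampled units, each with a survey weight
toy <- data.frame(
  wt = c(10, 20, 30), # number of population units each respondent represents
  x  = c(1, 2, 3)     # a continuous measurement
)

# Estimated population size: the sum of the weights
est_count <- sum(toy$wt)

# Estimated population total of x: the weighted sum of x
est_total <- sum(toy$wt * toy$x)

est_count
est_total
```

In this toy example, the three respondents stand in for 60 population units, and the estimated total of x is 140.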
5.3.1 Syntax Here is the syntax: survey_total( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, deff = FALSE, df = NULL ) The arguments are: x: a variable, expression, or empty na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: a number or a vector indicating the confidence level, defaults to 0.95 deff: a logical value stating whether the design effect should be returned, defaults to FALSE (this is described in more detail in Section 5.9.3) df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution 5.3.2 Examples Example 1: Estimated population count To calculate a population count estimate with survey_total(), we leave the argument x empty, as shown in the example below: recs_des %&gt;% summarize(Tot = survey_total()) ## # A tibble: 1 × 2 ## Tot Tot_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 123529025. 0.148 The estimated number of households in the U.S. is 123,529,025. Note that this result obtained from survey_total() is equivalent to the ones from the survey_count() and survey_tally() functions. However, the survey_total() function is called within summarize(), whereas survey_count() and survey_tally() are not. Example 2: Overall summation of continuous variables The distinction between survey_total() and survey_count() becomes more evident when working with continuous variables. Let’s compute the total cost of electricity in whole dollars from the variable DOLLAREL. recs_des %&gt;% summarize(elec_bill = survey_total(DOLLAREL)) ## # A tibble: 1 × 2 ## elec_bill elec_bill_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 170473527909. 664893504. 
It is estimated that American residential households spent a total of $170,473,527,909 on electricity in 2020, and the estimate has a standard error of $664,893,504. Example 3: Summation by groups Since we are using the {srvyr} package, we can use group_by() to calculate the cost of electricity for different groups. Let’s examine the variations in the cost of electricity in whole dollars across regions and display the confidence interval instead of the default standard error. recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_total(DOLLAREL, vartype = &quot;ci&quot;)) ## # A tibble: 4 × 4 ## Region elec_bill elec_bill_low elec_bill_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 29430369947. 28788987554. 30071752341. ## 2 Midwest 34972544751. 34339576041. 35605513460. ## 3 South 72496840204. 71534780902. 73458899506. ## 4 West 33573773008. 32909111702. 34238434313. The survey results estimate that households in the Northeast spent $29,430,369,947 with a confidence interval of ($28,788,987,554, $30,071,752,341) on electricity in 2020, while households in the South spent an estimated $72,496,840,204 with a confidence interval of ($71,534,780,902, $73,458,899,506.) 5.4 Means and proportions Means and proportions form the foundation of many research studies. These estimates are often the first things we look for when reviewing research on a given topic. The survey_mean() and survey_prop() functions calculate means and proportions while taking into account the survey design elements. The survey_mean() function should be used on continuous variables of survey data, while the survey_prop() function should be used on categorical variables. 
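To make the idea of design-weighted means and proportions concrete, here is a minimal base R sketch on made-up data. The data frame toy and its columns wt, x, and grp are hypothetical. The sketch shows only the point estimates: a weighted mean, and a proportion computed as the weighted mean of a 0/1 indicator. It does not reproduce the standard errors that survey_mean() and survey_prop() derive from the design.

```r
# Hypothetical toy data: weights, a continuous variable, and a category
toy <- data.frame(
  wt  = c(10, 20, 30),
  x   = c(1, 2, 3),
  grp = c('a', 'b', 'b')
)

# Weighted mean: weighted sum of x divided by the sum of the weights
est_mean <- sum(toy$wt * toy$x) / sum(toy$wt)

# The same quantity using stats::weighted.mean()
est_mean_check <- weighted.mean(toy$x, toy$wt)

# Proportion in group 'b': the weighted mean of a 0/1 indicator
est_prop_b <- weighted.mean(as.numeric(toy$grp == 'b'), toy$wt)

est_mean
est_prop_b
```

Treating a proportion as the mean of an indicator is the reason survey_mean() can be used in place of survey_prop() in the examples that follow.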
5.4.1 Syntax The syntax for means and proportions is very similar: survey_mean( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, proportion = FALSE, prop_method = c(&quot;logit&quot;, &quot;likelihood&quot;, &quot;asin&quot;, &quot;beta&quot;, &quot;mean&quot;), deff = FALSE, df = NULL ) survey_prop( na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, proportion = TRUE, prop_method = c(&quot;logit&quot;, &quot;likelihood&quot;, &quot;asin&quot;, &quot;beta&quot;, &quot;mean&quot;, &quot;xlogit&quot;), deff = FALSE, df = NULL ) Both functions have the following arguments and defaults: na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: a number or a vector indicating the confidence level, defaults to 0.95 prop_method: the method used to calculate the confidence interval for proportions deff: a logical value stating whether the design effect should be returned, defaults to FALSE (this is described in more detail in Section 5.9.3) df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution There are two main differences in the syntax. The survey_mean() function includes the first argument x, representing the variable or expression on which the mean should be calculated. The survey_prop() function does not have an argument to include the variables directly. Instead, prior to summarize(), we must use the group_by() function to specify the variables of interest for survey_prop(). For survey_mean(), including a group_by() function allows us to obtain the means by different groups. The other main difference is with the proportion argument. The survey_mean() function can be used to calculate both means and proportions. 
Its proportion argument defaults to FALSE, indicating it is used for calculating means. If we wish to calculate a proportion using survey_mean(), we need to set the proportion argument to TRUE. In the survey_prop() function, the proportion argument defaults to TRUE because the function is specifically designed for calculating proportions. In Section 5.2.1, we provide an overview of different variability types. The confidence interval used for most measures, such as means and counts, is referred to as a Wald-type interval. However, for proportions, a Wald-type interval with a symmetric t-based confidence interval may not provide accurate coverage, especially when dealing with small sample sizes or proportions “near” 0 or 1. We can use other methods to calculate confidence intervals, which we specify using the prop_method option in survey_prop(). The options include: logit: fits a logistic regression model and computes a Wald-type interval on the log-odds scale, which is then transformed to the probability scale. This is the default method. likelihood: uses the (Rao-Scott) scaled chi-squared distribution for the log-likelihood from a binomial distribution. asin: uses the variance-stabilizing transformation for the binomial distribution, the arcsine square root, and then back-transforms the interval to the probability scale beta: uses the incomplete beta function with an effective sample size based on the estimated variance of the proportion. mean: the Wald-type interval (\\(\\pm t_{df}^*\\times SE\\)) xlogit: uses a logit transformation of the proportion, calculates a Wald-type interval, and then back-transforms to the probability scale. This method is the same as those used by default in SUDAAN and SPSS. Each option yields slightly different confidence interval bounds when dealing with proportions. Please note that when working with survey_mean(), we do not need to specify a method unless the proportion argument is TRUE. 
If proportion is FALSE, it calculates a symmetric, mean-type confidence interval. 5.4.2 Examples Example 1: One variable proportion If we are interested in obtaining the proportion of households in each region in the RECS data, we can use group_by() and survey_prop() as shown below: recs_des %&gt;% group_by(Region) %&gt;% summarize(p = survey_prop()) ## # A tibble: 4 × 3 ## Region p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 0.177 0.000000000212 ## 2 Midwest 0.219 0.000000000262 ## 3 South 0.379 0.000000000740 ## 4 West 0.224 0.000000000816 17.7% of the households are in the Northeast, 21.9% in the Midwest, and so on. Note that the proportions in column p add up to one. The survey_prop() function is essentially the same as using survey_mean() with a categorical variable and without specifying a numeric variable in the x argument. The following code gives us the same results as above: recs_des %&gt;% group_by(Region) %&gt;% summarize(p = survey_mean()) ## # A tibble: 4 × 3 ## Region p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 0.177 0.000000000212 ## 2 Midwest 0.219 0.000000000262 ## 3 South 0.379 0.000000000740 ## 4 West 0.224 0.000000000816 Example 2: Conditional proportions We can also obtain proportions by more than one variable. In the following example, we look at the proportion of housing units by Region and whether air conditioning (A/C) is used (ACUsed). recs_des %&gt;% group_by(Region, ACUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 8 × 4 ## # Groups: Region [4] ## Region ACUsed p p_se ## &lt;fct&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast FALSE 0.110 0.00590 ## 2 Northeast TRUE 0.890 0.00590 ## 3 Midwest FALSE 0.0666 0.00508 ## 4 Midwest TRUE 0.933 0.00508 ## 5 South FALSE 0.0581 0.00278 ## 6 South TRUE 0.942 0.00278 ## 7 West FALSE 0.255 0.00759 ## 8 West TRUE 0.745 0.00759 When specifying multiple variables, the proportions are conditional. 
In the results above, notice that the proportions sum to 1 within each region. This can be interpreted as the proportion of housing units with A/C within each region. For example, in the Northeast region, approximately 11.0% of housing units don’t have A/C, while around 89.0% have A/C. Example 3: Joint proportions If we’re interested in a joint proportion, we use the interact() function. In the example below, we apply the interact() function to Region and ACUsed: recs_des %&gt;% group_by(interact(Region, ACUsed)) %&gt;% summarize(p = survey_prop()) ## # A tibble: 8 × 4 ## Region ACUsed p p_se ## &lt;fct&gt; &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast FALSE 0.0196 0.00105 ## 2 Northeast TRUE 0.158 0.00105 ## 3 Midwest FALSE 0.0146 0.00111 ## 4 Midwest TRUE 0.204 0.00111 ## 5 South FALSE 0.0220 0.00106 ## 6 South TRUE 0.357 0.00106 ## 7 West FALSE 0.0573 0.00170 ## 8 West TRUE 0.167 0.00170 In this case, all proportions sum to 1, not just within regions. This means that 15.8% of the population lives in the Northeast and has A/C. As noted earlier, we can use both the survey_prop() and survey_mean() functions, and they produce the same results. Example 4: Overall mean Below, we calculate the estimated average cost of electricity in the U.S. using survey_mean(). To include both the standard error and the confidence interval, we can include them in the vartype argument: recs_des %&gt;% summarize(elec_bill = survey_mean(DOLLAREL, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## elec_bill elec_bill_se elec_bill_low elec_bill_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 1369. 1391. Nationally, the average household spent $1,380 in 2020. Example 5: Means by subgroup We can also calculate the estimated average cost of electricity in the U.S. by each region. 
To do this, we include a group_by() function with the variable of interest before the summarize() function: recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_mean(DOLLAREL)) ## # A tibble: 4 × 3 ## Region elec_bill elec_bill_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 Households from the West spent approximately $1,211, while in the South, the average spending was $1,548. 5.5 Quantiles and medians To better understand the distribution of a continuous variable like income, we can calculate quantiles at specific points. For example, computing estimates of the quartiles (25%, 50%, 75%) helps us understand how income is spread across the population. We use the survey_quantile() function to calculate quantiles in survey data. Medians are useful for finding the midpoint of a continuous distribution when the data are skewed, as medians are less affected by outliers compared to means. The median is the same as the 50th percentile, meaning the value where 50% of the data are higher and 50% are lower. Because medians are a special, common case of quantiles, we have a dedicated function called survey_median() for calculating the median in survey data. Alternatively, we can use the survey_quantile() function with the quantiles argument set to 0.5 to achieve the same result. 
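The equivalence between the median and the 0.5 quantile can be checked on a small example. The sketch below assumes the {srvyr} and {survey} packages are installed, and it uses the apistrat data from {survey} (the stratified school sample behind the apistrat_des object in Chapter 4) rather than RECS, so that it runs without the book's data. The column names api00_med and api00 are chosen here purely for illustration.

```r
library(srvyr)

# Stratified sample of California schools that ships with {survey}
data(api, package = 'survey')

# Stratified design: strata are school types, weights are in pw
apistrat_des <- as_survey_design(apistrat, strata = stype, weights = pw)

# Request the median two ways within a single summarize() call
res <- summarize(
  apistrat_des,
  api00_med = survey_median(api00),
  api00 = survey_quantile(api00, quantiles = 0.5)
)

# survey_quantile() appends a _q50 suffix to the name we supplied
res
```

The api00_med and api00_q50 columns should hold the same estimate, along with matching standard errors.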
5.5.1 Syntax The syntax for survey_quantile() and survey_median() is nearly identical: survey_quantile( x, quantiles, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, interval_type = c(&quot;mean&quot;, &quot;beta&quot;, &quot;xlogit&quot;, &quot;asin&quot;, &quot;score&quot;, &quot;quantile&quot;), qrule = c(&quot;math&quot;, &quot;school&quot;, &quot;shahvaish&quot;, &quot;hf1&quot;, &quot;hf2&quot;, &quot;hf3&quot;, &quot;hf4&quot;, &quot;hf5&quot;, &quot;hf6&quot;, &quot;hf7&quot;, &quot;hf8&quot;, &quot;hf9&quot;), df = NULL ) survey_median( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, interval_type = c(&quot;mean&quot;, &quot;beta&quot;, &quot;xlogit&quot;, &quot;asin&quot;, &quot;score&quot;, &quot;quantile&quot;), qrule = c(&quot;math&quot;, &quot;school&quot;, &quot;shahvaish&quot;, &quot;hf1&quot;, &quot;hf2&quot;, &quot;hf3&quot;, &quot;hf4&quot;, &quot;hf5&quot;, &quot;hf6&quot;, &quot;hf7&quot;, &quot;hf8&quot;, &quot;hf9&quot;), df = NULL ) The arguments available in both functions are: x: a variable, expression, or empty na.rm: an indicator of whether missing values should be dropped, defaults to FALSE vartype: type(s) of variation estimate to calculate, defaults to se (standard error) level: a number or a vector indicating the confidence level, defaults to 0.95 interval_type: method for calculating a confidence interval qrule: rule for defining quantiles. The default is the lower end of the quantile interval (“math”). The midpoint of the quantile interval is the “school” rule. “hf1” to “hf9” are weighted analogs to type=1 to 9 in quantile(). “shahvaish” corresponds to a rule proposed by Shah and Vaish (2006). See vignette(\"qrule\", package=\"survey\") for more information. 
df: (for vartype = 'ci'), a numeric value indicating degrees of freedom for the t-distribution The only difference between survey_quantile() and survey_median() is the inclusion of the quantiles argument in the survey_quantile() function. This argument takes a vector with values between 0 and 1 to indicate which quantiles to calculate. For example, if we wanted the quartiles of a variable, we would provide quantiles = c(0.25, 0.5, 0.75). While we can specify quantiles of 0 and 1, which represent the minimum and maximum, this is not recommended. It only returns the minimum and maximum of the respondents and cannot be extrapolated to the population as there is no valid definition of standard error. In Section 5.2.1, we provide an overview of the different variability types. The interval used in confidence intervals for most measures, such as means and counts, is referred to as a Wald-type interval. However, this is not always the most accurate interval for quantiles. Similar to confidence intervals for proportions, quantiles have various interval types, including asin, beta, mean, and xlogit (see Section 5.4.1.) Quantiles also have two more methods available: score: the Francisco and Fuller confidence interval based on inverting a score test (only available for design-based survey objects and not replicate-based objects) quantile: based on the replicates of the quantile. This is not valid for jackknife-type replicates but is available for bootstrap and BRR replicates. One note with the score method is that when there are numerous ties in the data, this method may produce confidence intervals that do not contain the estimate. When dealing with a high propensity for ties (e.g., many respondents are the same age), it is recommended to use another method. SUDAAN, for example, uses the score method but adds noise to the values to prevent issues. 
The documentation in the {survey} package indicates, in general, that the score method may have poorer performance compared to the beta and logit intervals (Lumley 2010). 5.5.2 Examples Example 1: Overall quartiles Quantiles provide insights into the distribution of a variable. Let’s look into the quartiles, specifically, the first quartile (p=0.25), the median (p=0.5), and the third quartile (p=0.75) of electric bills. recs_des %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0.25, .5, 0.75))) ## # A tibble: 1 × 6 ## elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 795. 1215. 1770. 5.69 ## elec_bill_q50_se elec_bill_q75_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 6.33 9.99 The output above shows the values for the three quartiles of electric bill costs and their respective standard errors: the 25th percentile is $795 with a standard error of $5.69, the 50th percentile (median) is $1,215 with a standard error of $6.33, and the 75th percentile is $1,770 with a standard error of $9.99. Example 2: Quartiles by subgroup We can estimate the quantiles of electric bills by region by using the group_by() function: recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0.25, .5, 0.75))) ## # A tibble: 4 × 7 ## Region elec_bill_q25 elec_bill_q50 elec_bill_q75 elec_bill_q25_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 740. 1148. 1712. 13.7 ## 2 Midwest 769. 1149. 1632. 8.88 ## 3 South 968. 1402. 1945. 10.6 ## 4 West 623. 1028. 1568. 10.8 ## elec_bill_q50_se elec_bill_q75_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 16.6 25.8 ## 2 11.6 18.6 ## 3 9.17 13.9 ## 4 14.3 20.5 The 25th percentile for the Northeast region is $740, while it is $968 for the South. Example 3: Minimum and maximum As mentioned in the syntax section, we can specify quantiles of 0 (minimum) and 1 (maximum), and R calculates these values. 
However, these are only the minimum and maximum values in the data, and there is not enough information to determine their standard errors: recs_des %&gt;% summarize(elec_bill = survey_quantile(DOLLAREL, quantiles = c(0, 1))) ## # A tibble: 1 × 4 ## elec_bill_q00 elec_bill_q100 elec_bill_q00_se elec_bill_q100_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 -151. 15680. NaN 0 The minimum cost of electricity in the dataset is -$151 while the maximum is $15,680, but the standard error is shown as NaN and 0, respectively. Notice that the minimum cost is a negative number. This may be surprising, but some housing units with solar power sell their energy back to the grid and earn money, which is recorded as a negative expenditure. Example 4: Overall median We can calculate the estimated median cost of electricity in the U.S. using the survey_median() function: recs_des %&gt;% summarize(elec_bill = survey_median(DOLLAREL)) ## # A tibble: 1 × 2 ## elec_bill elec_bill_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1215. 6.33 Nationally, the median household spent $1,215 in 2020. This is the same result as we obtained using the survey_quantile() function. Interestingly, the average electric bill for households that we calculated in Section 5.4 is $1,380, but the estimated median electric bill is $1,215, indicating the distribution is likely right-skewed. Example 5: Medians by subgroup We can calculate the estimated median cost of electricity in the U.S. by region using the group_by() function with the variable(s) of interest before the summarize() function, similar to when we found the mean by region. recs_des %&gt;% group_by(Region) %&gt;% summarize(elec_bill = survey_median(DOLLAREL)) ## # A tibble: 4 × 3 ## Region elec_bill elec_bill_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1148. 16.6 ## 2 Midwest 1149. 11.6 ## 3 South 1402. 9.17 ## 4 West 1028. 
14.3 We estimate that households in the Northeast spent a median of $1,148 on electricity, and in the South, they spent a median of $1,402. 5.6 Ratios A ratio estimate is the ratio of the sums of two variables, specifically in the form of: \\[ \\frac{\\sum x_i}{\\sum y_i}.\\] Note that this is not the same as the mean of the individual ratios: \\[ \\frac{1}{N} \\sum \\frac{x_i}{y_i} \\] which can be calculated with survey_mean() by creating a derived variable \\(z=x/y\\) and then calculating the mean of \\(z\\). Say we wanted to assess the energy efficiency of homes in a standardized way, where we can compare homes of different sizes. We can calculate the ratio of energy consumption to the square footage of a home. This helps us meaningfully compare homes of different sizes by identifying how much energy is being used per unit of space. To calculate this ratio, we would run survey_ratio(Energy Consumption in BTUs, Square Footage of Home). If, instead, we used survey_mean(Energy Consumption in BTUs/Square Footage of Home), we would estimate the average of the per-home rates, that is, the average energy consumption per square foot across homes. While helpful in understanding general energy use, this statistic does not account for differences in home sizes. 
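The distinction between the two formulas above can be checked with a few lines of base R. This is only a sketch: the three homes, their values, and the object names (energy_btu, sq_ft, wt) are made up for illustration and are not RECS data, and the weights are set to 1 so the arithmetic is easy to follow. It computes point estimates only and says nothing about standard errors.

```r
# Hypothetical toy data: three homes (illustrative values, not from RECS)
energy_btu <- c(40000, 90000, 60000)  # annual energy consumption
sq_ft <- c(1000, 3000, 1500)          # home size in square feet
wt <- c(1, 1, 1)                      # equal weights, for simplicity

# Ratio of the weighted sums: sum(x) / sum(y)
ratio_of_sums <- sum(wt * energy_btu) / sum(wt * sq_ft)

# Weighted mean of the per-home ratios: mean(x / y)
mean_of_ratios <- sum(wt * energy_btu / sq_ft) / sum(wt)

print(c(ratio_of_sums = ratio_of_sums, mean_of_ratios = mean_of_ratios))
```

With these made-up values, the ratio of sums is 190000/5500 (about 34.5), while the mean of the per-home ratios is 110/3 (about 36.7). The large home pulls the ratio of sums toward its lower rate, whereas the mean of ratios gives each home's rate the same influence.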
5.6.1 Syntax The syntax for survey_ratio() is as follows: survey_ratio( numerator, denominator, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, deff = FALSE, df = NULL ) The arguments are: numerator: The numerator of the ratio denominator: The denominator of the ratio na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: A single number or vector of numbers indicating the confidence level deff: A logical value to indicate whether the design effect should be returned (this is described in more detail in Section 5.9.3) df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.6.2 Examples Example 1: Overall ratios Suppose we wanted to find the ratio of dollars spent on liquid propane per unit (in British thermal unit [Btu]) nationally6. To find the average cost to a household, we can use survey_mean(). However, to find the national unit rate, we can use survey_ratio(). In the following example, we show both methods and discuss the interpretation of each: recs_des %&gt;% summarize( DOLLARLP_Tot = survey_total(DOLLARLP, vartype = NULL), BTULP_Tot = survey_total(BTULP, vartype = NULL), DOL_BTU_Rat = survey_ratio(DOLLARLP, BTULP), DOL_BTU_Avg = survey_mean(DOLLARLP / BTULP, na.rm = TRUE) ) ## # A tibble: 1 × 6 ## DOLLARLP_Tot BTULP_Tot DOL_BTU_Rat DOL_BTU_Rat_se DOL_BTU_Avg ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 8122911173. 391425311586. 0.0208 0.000240 0.0240 ## DOL_BTU_Avg_se ## &lt;dbl&gt; ## 1 0.000223 The ratio of the total spent on liquid propane to the total consumption was 0.0208, but the average rate was 0.024. 
With a bit of calculation, we can show that the ratio is the ratio of the totals DOLLARLP_Tot/BTULP_Tot=8,122,911,173/391,425,311,586=0.0208. Although the estimated ratio can be calculated manually in this manner, the standard error requires the use of the survey_ratio() function. The average can be interpreted as the average rate paid by a household. Example 2: Ratios by subgroup As previously done with other estimates, we can use group_by() to examine whether this ratio varies by region. recs_des %&gt;% group_by(Region) %&gt;% summarize(DOL_BTU_Rat = survey_ratio(DOLLARLP, BTULP)) %&gt;% arrange(DOL_BTU_Rat) ## # A tibble: 4 × 3 ## Region DOL_BTU_Rat DOL_BTU_Rat_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Midwest 0.0158 0.000240 ## 2 South 0.0245 0.000388 ## 3 West 0.0246 0.000875 ## 4 Northeast 0.0247 0.000488 Although not a formal statistical test, it appears that the cost ratios for liquid propane are the lowest in the Midwest (0.0158.) 5.7 Correlations The correlation is a measure of the linear relationship between two continuous variables, which ranges between -1 and 1. The most commonly used method is Pearson’s correlation (referred to as correlation henceforth.) A sample correlation for a simple random sample is calculated as follows: \\[\\frac{\\sum (x_i-\\bar{x})(y_i-\\bar{y})}{\\sqrt{\\sum (x_i-\\bar{x})^2} \\sqrt{\\sum(y_i-\\bar{y})^2}} \\] When using survey_corr() for designs other than a simple random sample, the weights are applied when estimating the correlation. 
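To make the role of the weights concrete, the sketch below computes a weighted Pearson correlation by hand in base R, replacing each sum in the formula above with a weighted sum and each mean with a weighted mean. The data, the weights, and the object names (x, y, w) are hypothetical. We assume, rather than assert, that this weighted analogue corresponds to the point estimate; the sketch makes no attempt to reproduce the design-based standard error.

```r
# Hypothetical toy data and weights (illustrative values, not from RECS)
x <- c(1200, 1800, 2500, 900, 3100)  # e.g., square footage
y <- c(30, 42, 55, 28, 61)           # e.g., consumption
w <- c(10, 20, 15, 30, 25)           # analysis weights

# Weighted means replace the simple means in the SRS formula
xbar_w <- sum(w * x) / sum(w)
ybar_w <- sum(w * y) / sum(w)

# Weighted analogue of the Pearson correlation formula
num <- sum(w * (x - xbar_w) * (y - ybar_w))
den <- sqrt(sum(w * (x - xbar_w)^2)) * sqrt(sum(w * (y - ybar_w)^2))
corr_w <- num / den

# Cross-check against stats::cov.wt(), which returns a weighted correlation
corr_check <- cov.wt(cbind(x, y), wt = w / sum(w), cor = TRUE)$cor['x', 'y']

print(c(manual = corr_w, cov_wt = corr_check))
```

The two values agree because any scaling constant applied to the weighted covariance matrix cancels in the correlation.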
5.7.1 Syntax The syntax for survey_corr() is as follows: survey_corr( x, y, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;, &quot;cv&quot;), level = 0.95, df = NULL ) The arguments are: x: A variable or expression y: A variable or expression na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\", \"cv\"), defaults to se (standard error) (see 5.2.1 for more information) level: (For vartype = “ci” only) A single number or vector of numbers indicating the confidence level df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.7.2 Examples Example 1: Overall correlation We can calculate the correlation between the total square footage of homes (TOTSQFT_EN)7 and electricity consumption (BTUEL.)8 recs_des %&gt;% summarize(SQFT_Elec_Corr = survey_corr(TOTSQFT_EN, BTUEL)) ## # A tibble: 1 × 2 ## SQFT_Elec_Corr SQFT_Elec_Corr_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 0.417 0.00689 The correlation between the total square footage of homes and electricity consumption is 0.417, indicating a moderate positive relationship. Example 2: Correlations by subgroup We can explore the correlation between total square footage and electricity expenditure (DOLLAREL) based on subgroups, such as whether air conditioning (A/C) is used (ACUsed.) recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(SQFT_Elec_Corr = survey_corr(TOTSQFT_EN, DOLLAREL)) ## # A tibble: 2 × 3 ## ACUsed SQFT_Elec_Corr SQFT_Elec_Corr_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.290 0.0240 ## 2 TRUE 0.401 0.00808 For homes without A/C, there is a small positive correlation between total square footage and electricity expenditure (0.29.) For homes with A/C, the correlation of 0.401 indicates a stronger positive correlation between total square footage and electricity expenditure. 
5.8 Standard deviation and variance All survey functions produce an estimate of the variability of a given estimate. No additional function is needed when dealing with variable estimates. However, if we are specifically interested in population variance and standard deviation, we can use the survey_var() and survey_sd() functions. In our experience, it is not common practice to use these functions. They can be used when designing a future study to gauge population variability and inform sampling precision. 5.8.1 Syntax As with non-survey data, the standard deviation estimate is the square root of the variance estimate. Therefore, the survey_var() and survey_sd() functions share the same arguments, except the standard deviation does not allow the usage of vartype. survey_var( x, na.rm = FALSE, vartype = c(&quot;se&quot;, &quot;ci&quot;, &quot;var&quot;), level = 0.95, df = NULL ) survey_sd( x, na.rm = FALSE ) The arguments are: x: A variable or expression, or empty na.rm: A logical value to indicate whether missing values should be dropped vartype: type(s) of variation estimate to calculate including any of c(\"se\", \"ci\", \"var\"), defaults to se (standard error) (see 5.2.1 for more information) level: (For vartype = “ci” only) A single number or vector of numbers indicating the confidence level. df: (For vartype = “ci” only) A numeric value indicating the degrees of freedom for t-distribution 5.8.2 Examples Example 1: Overall variability Let’s return to electricity bills and explore the variability in electricity expenditure. recs_des %&gt;% summarize(var_elbill = survey_var(DOLLAREL), sd_elbill = survey_sd(DOLLAREL)) ## # A tibble: 1 × 3 ## var_elbill var_elbill_se sd_elbill ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 704906. 13926. 840. We may encounter a warning related to deprecated underlying calculations performed by the survey_var() function. This warning is a result of changes in the way R handles recycling in vectorized operations. 
The results are still valid. They give an estimate of the population variance of electricity bills (var_elbill), the standard error of that variance (var_elbill_se), and the estimated population standard deviation of electricity bills (sd_elbill.) Note that no standard error is associated with the standard deviation - this is the only estimate that does not include a standard error. Example 2: Variability by subgroup To find out if the variability in electricity expenditure is similar across regions, we can calculate the variance by region using group_by(): recs_des %&gt;% group_by(Region) %&gt;% summarize(var_elbill = survey_var(DOLLAREL), sd_elbill = survey_sd(DOLLAREL)) ## # A tibble: 4 × 4 ## Region var_elbill var_elbill_se sd_elbill ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 775450. 38843. 881. ## 2 Midwest 552423. 25252. 743. ## 3 South 702521. 30641. 838. ## 4 West 717886. 30597. 847. 5.9 Additional topics 5.9.1 Unweighted analysis Sometimes, it is helpful to calculate an unweighted estimate of a given variable. For this, we use the unweighted() function in the summarize() function. The unweighted() function calculates unweighted summaries from a tbl_svy object, providing the summary among the respondents without extrapolating to a population estimate. The unweighted() function can be used in conjunction with any {dplyr} functions. Here is an example looking at the average household electricity cost: recs_des %&gt;% summarize(elec_bill = survey_mean(DOLLAREL), elec_unweight = unweighted(mean(DOLLAREL))) ## # A tibble: 1 × 3 ## elec_bill elec_bill_se elec_unweight ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 1425. It is estimated that American residential households spent an average of $1,380 on electricity in 2020, and the estimate has a standard error of $5.38. 
The unweighted() function calculates the unweighted average and represents the average amount of money spent on electricity in 2020 by the respondents, which was $1,425. 5.9.2 Subpopulation analysis We mentioned using filter() to subset a survey object for analysis. This operation should be done after creating the survey design object. Subsetting data before creating the object can lead to incorrect variability estimates if subsetting removes an entire Primary Sampling Unit (PSU; see Chapter 10 for more information on PSUs and sample designs.) Suppose we want estimates of the average amount spent on natural gas among housing units using natural gas (based on the variable BTUNG.)9 We first filter records to only include records where BTUNG &gt; 0 and then find the average amount spent. recs_des %&gt;% filter(BTUNG &gt; 0) %&gt;% summarize(NG_mean = survey_mean(DOLLARNG, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## NG_mean NG_mean_se NG_mean_low NG_mean_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 631. 4.64 621. 640. The estimated average amount spent on natural gas among households that use natural gas is $631. Let’s compare this to the mean when we do not filter. recs_des %&gt;% summarize(NG_mean = survey_mean(DOLLARNG, vartype = c(&quot;se&quot;, &quot;ci&quot;))) ## # A tibble: 1 × 4 ## NG_mean NG_mean_se NG_mean_low NG_mean_upp ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 382. 3.41 375. 389. Based on this calculation, the estimated average amount spent on natural gas is $382. Note that applying the filter to include only housing units that use natural gas yields a higher mean than when not applying the filter. This is because including housing units that do not use natural gas introduces many $0 amounts, impacting the mean calculation. 5.9.3 Design effects The design effect measures how the precision of an estimate is influenced by the sampling design. 
In other words, it measures how much more or less statistically efficient the survey design is compared to a simple random sample (SRS.) It is computed by taking the ratio of the estimate’s variance under the design at hand to the estimate’s variance under a simple random sample without replacement. A design effect less than 1 indicates that the design is more statistically efficient than an SRS design, which is rare but possible in a stratified sampling design where the outcome correlates with the stratification variable(s). A design effect greater than 1 indicates that the design is less statistically efficient than an SRS design. From a design effect, we can calculate the effective sample size as follows: \\[n_{eff}=\\frac{n}{D_{eff}} \\] where \\(n\\) is the nominal sample size (the number of survey responses) and \\(D_{eff}\\) is the estimated design effect. We can interpret the effective sample size \\(n_{eff}\\) as the hypothetical sample size that a survey using an SRS design would need to achieve the same precision as the design at hand. Design effects are specific to each outcome: outcomes that are less clustered in the population have smaller design effects than outcomes that are clustered. In the {srvyr} package, design effects can be calculated for totals, proportions, means, and ratio estimates by setting the deff argument to TRUE in the corresponding functions. 
In the example below, we calculate the design effects for the average consumption of electricity (BTUEL), natural gas (BTUNG), liquid propane (BTULP), fuel oil (BTUFO), and wood (BTUWOOD) by setting deff = TRUE: recs_des %&gt;% summarize(across( c(BTUEL, BTUNG, BTULP, BTUFO, BTUWOOD), ~ survey_mean(.x, deff = TRUE, vartype = NULL) )) %&gt;% select(ends_with(&quot;deff&quot;)) ## # A tibble: 1 × 5 ## BTUEL_deff BTUNG_deff BTULP_deff BTUFO_deff BTUWOOD_deff ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.597 0.938 1.21 0.720 1.10 For the values less than 1 (BTUEL_deff, BTUNG_deff, and BTUFO_deff), the results suggest that the survey design is more efficient than a simple random sample. For the values greater than 1 (BTULP_deff and BTUWOOD_deff), the results indicate that the survey design is less efficient than a simple random sample. 5.9.4 Creating summary rows When using group_by() in analysis, the results are returned with a row for each group or combination of groups. Often, we want both the breakdowns by group and a summary row for the estimate representing the entire population. For example, we may want the average electricity consumption by region and nationally. The {srvyr} package has the convenient cascade() function, which adds summary rows for the total of a group. It is used instead of summarize() and has similar functionalities along with some additional features. Syntax The syntax is as follows: cascade( .data, ..., .fill = NA, .fill_level_top = FALSE, .groupings = NULL ) where the arguments are: .data: A tbl_svy object ...: Name-value pairs of summary functions (same as the summarize() function) .fill: Value to fill in for group summaries (defaults to NA) .fill_level_top: When filling factor variables, whether to put the value ‘.fill’ in the first position (defaults to FALSE, placing it at the bottom.) Example First, let’s look at an example where we calculate the average household electricity cost. 
Then, we build on it to examine the features of the cascade() function. In the first example below, we calculate the average household energy cost DOLLAREL_mn using survey_mean() without modifying any of the argument defaults in the function: recs_des %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL)) ## # A tibble: 1 × 2 ## DOLLAREL_mn DOLLAREL_mn_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1380. 5.38 Next, let’s group the results by region by adding group_by() before the cascade() function: recs_des %&gt;% group_by(Region) %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL)) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 ## 5 &lt;NA&gt; 1380. 5.38 We can see the estimated average electricity bills by region: $1,343 for the Northeast, $1,548 for the South, and so on. The last row, where Region = NA, is the national average electricity bill, $1,380. However, naming the national “region” as NA is not very informative. We can give it a better name using the .fill argument. recs_des %&gt;% group_by(Region) %&gt;% cascade(DOLLAREL_mn = survey_mean(DOLLAREL), .fill = &quot;National&quot;) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Northeast 1343. 14.6 ## 2 Midwest 1293. 11.7 ## 3 South 1548. 10.3 ## 4 West 1211. 12.0 ## 5 National 1380. 5.38 We can move the summary row to the first row by adding .fill_level_top = TRUE to cascade(): recs_des %&gt;% group_by(Region) %&gt;% cascade( DOLLAREL_mn = survey_mean(DOLLAREL), .fill = &quot;National&quot;, .fill_level_top = TRUE ) ## # A tibble: 5 × 3 ## Region DOLLAREL_mn DOLLAREL_mn_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 National 1380. 5.38 ## 2 Northeast 1343. 14.6 ## 3 Midwest 1293. 11.7 ## 4 South 1548. 10.3 ## 5 West 1211. 12.0 While the results remain the same, the table is now easier to interpret. 
5.9.5 Calculating estimates for many outcomes Often, we are interested in a summary statistic across many variables. Useful tools include the across() function in {dplyr}, shown a few times above, and the map() function in {purrr}. The across() function applies the same function to multiple columns within summarize(). This works well with all functions shown above, except for survey_prop(). In a later example, we tackle summarizing multiple proportions. Example 1: across() Suppose we want to calculate the total and average consumption, along with coefficients of variation (CV), for each fuel type. These include the reported consumption of electricity (BTUEL), natural gas (BTUNG), liquid propane (BTULP), fuel oil (BTUFO), and wood (BTUWOOD), as mentioned in the section on design effects. We can take advantage of the fact that these are the only variables that start with “BTU” by selecting them with starts_with(\"BTU\") in the across() function. For each selected column (.x), across() creates a list of two functions to be applied: survey_total() to calculate the total and survey_mean() to calculate the mean, along with their CV (vartype = \"cv\".) Finally, .unpack = \"{outer}.{inner}\" specifies that the resulting column names are a concatenation of the variable name, followed by Total or Mean, and then “coef” or “cv”. consumption_ests &lt;- recs_des %&gt;% summarize(across( starts_with(&quot;BTU&quot;), list( Total = ~ survey_total(.x, vartype = &quot;cv&quot;), Mean = ~ survey_mean(.x, vartype = &quot;cv&quot;) ), .unpack = &quot;{outer}.{inner}&quot; )) consumption_ests ## # A tibble: 1 × 20 ## BTUEL_Total.coef BTUEL_Total._cv BTUEL_Mean.coef BTUEL_Mean._cv ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 4453284510065 0.00377 36051. 
0.00377 ## # ℹ 16 more variables: BTUNG_Total.coef &lt;dbl&gt;, BTUNG_Total._cv &lt;dbl&gt;, ## # BTUNG_Mean.coef &lt;dbl&gt;, BTUNG_Mean._cv &lt;dbl&gt;, ## # BTULP_Total.coef &lt;dbl&gt;, BTULP_Total._cv &lt;dbl&gt;, ## # BTULP_Mean.coef &lt;dbl&gt;, BTULP_Mean._cv &lt;dbl&gt;, ## # BTUFO_Total.coef &lt;dbl&gt;, BTUFO_Total._cv &lt;dbl&gt;, ## # BTUFO_Mean.coef &lt;dbl&gt;, BTUFO_Mean._cv &lt;dbl&gt;, ## # BTUWOOD_Total.coef &lt;dbl&gt;, BTUWOOD_Total._cv &lt;dbl&gt;, … The estimated total consumption of electricity (BTUEL) is 4,453,284,510,065 (BTUEL_Total.coef), the estimated average consumption is 36,051 (BTUEL_Mean.coef), and the CV is 0.0038. In the example above, the table was quite wide. We may prefer a row for each fuel type. Using the pivot_longer() and pivot_wider() functions from {tidyr} can help us achieve this. First, we use pivot_longer() to make each variable a column, changing the data to a “long” format. We use the names_to argument to specify new column names: FuelType, Stat, and Type. Then, the names_pattern argument extracts the names in the original column names based on the regular expression pattern BTU(.*)_(.*)\\\\.(.*). They are saved in the column names defined in names_to. consumption_ests_long &lt;- consumption_ests %&gt;% pivot_longer( cols = everything(), names_to = c(&quot;FuelType&quot;, &quot;Stat&quot;, &quot;Type&quot;), names_pattern = &quot;BTU(.*)_(.*)\\\\.(.*)&quot; ) consumption_ests_long ## # A tibble: 20 × 4 ## FuelType Stat Type value ## &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 EL Total coef 4453284510065 ## 2 EL Total _cv 0.00377 ## 3 EL Mean coef 36051. ## 4 EL Mean _cv 0.00377 ## 5 NG Total coef 4240769382106. ## 6 NG Total _cv 0.00908 ## 7 NG Mean coef 34330. ## 8 NG Mean _cv 0.00908 ## 9 LP Total coef 391425311586. ## 10 LP Total _cv 0.0380 ## 11 LP Mean coef 3169. ## 12 LP Mean _cv 0.0380 ## 13 FO Total coef 395699976655. ## 14 FO Total _cv 0.0343 ## 15 FO Mean coef 3203. 
## 16 FO Mean _cv 0.0343 ## 17 WOOD Total coef 345091088404. ## 18 WOOD Total _cv 0.0454 ## 19 WOOD Mean coef 2794. ## 20 WOOD Mean _cv 0.0454 Then, we use pivot_wider() to create a table that is nearly ready for publication. Within the function, we can make the names for each element more descriptive and informative by gluing the Stat and Type together with names_glue. Further details on creating publication-ready tables are covered in Chapter 8. consumption_ests_long %&gt;% mutate(Type = case_when(Type == &quot;coef&quot; ~ &quot;&quot;, Type == &quot;_cv&quot; ~ &quot; (CV)&quot;)) %&gt;% pivot_wider( id_cols = FuelType, names_from = c(Stat, Type), names_glue = &quot;{Stat}{Type}&quot;, values_from = value ) ## # A tibble: 5 × 5 ## FuelType Total `Total (CV)` Mean `Mean (CV)` ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 EL 4453284510065 0.00377 36051. 0.00377 ## 2 NG 4240769382106. 0.00908 34330. 0.00908 ## 3 LP 391425311586. 0.0380 3169. 0.0380 ## 4 FO 395699976655. 0.0343 3203. 0.0343 ## 5 WOOD 345091088404. 0.0454 2794. 0.0454 Example 2: Proportions with across() As mentioned earlier, proportions do not work as well directly with the across() method. If we want the proportion of houses with air conditioning (A/C) and the proportion of houses with heating, we require two separate group_by() statements as shown below: recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 2 × 3 ## ACUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.113 0.00306 ## 2 TRUE 0.887 0.00306 recs_des %&gt;% group_by(SpaceHeatingUsed) %&gt;% summarize(p = survey_prop()) ## # A tibble: 2 × 3 ## SpaceHeatingUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.0469 0.00207 ## 2 TRUE 0.953 0.00207 We estimate 88.7% of households have A/C and 95.3% have heating. 
If we are only interested in the TRUE outcomes, that is, the proportion of households that have A/C and the proportion that have heating, we can simplify the code. Applying survey_mean() to a logical variable is the same as using survey_prop(), as shown below: cool_heat_tab &lt;- recs_des %&gt;% summarize(across(c(ACUsed, SpaceHeatingUsed), ~ survey_mean(.x), .unpack = &quot;{outer}.{inner}&quot;)) cool_heat_tab ## # A tibble: 1 × 4 ## ACUsed.coef ACUsed._se SpaceHeatingUsed.coef SpaceHeatingUsed._se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.887 0.00306 0.953 0.00207 Note that the estimates are the same as those obtained using the separate group_by() statements. As before, we can use pivot_longer() to structure the table in a more suitable format for distribution. cool_heat_tab %&gt;% pivot_longer(everything(), names_to = c(&quot;Comfort&quot;, &quot;.value&quot;), names_pattern = &quot;(.*)\\\\.(.*)&quot;) %&gt;% rename(p = coef, se = `_se`) ## # A tibble: 2 × 3 ## Comfort p se ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 ACUsed 0.887 0.00306 ## 2 SpaceHeatingUsed 0.953 0.00207 Example 3: purrr::map() Loops are a common tool when dealing with repetitive calculations. The {purrr} package provides the map() functions, which, like a loop, allow us to perform the same task across different elements (Wickham and Henry 2023). In our case, we may want to calculate proportions from the same design multiple times. A straightforward approach is to design the calculation for one variable, build a function based on that, and then apply it iteratively for the rest of the variables. Suppose we want to create a table that shows the proportion of people who express trust in their government (TrustGovernment)10 as well as those that trust in people (TrustPeople)11. First, we create a table for a single variable. The table includes the variable name as a column, the response, and the corresponding percentage with its standard error. 
anes_des %&gt;% drop_na(TrustGovernment) %&gt;% group_by(TrustGovernment) %&gt;% summarize(p = survey_prop() * 100) %&gt;% mutate(Variable = &quot;TrustGovernment&quot;) %&gt;% rename(Answer = TrustGovernment) %&gt;% select(Variable, everything()) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 We estimate that 1.55% of people always trust the government, 13.16% trust the government most of the time, and so on. Now, we want to use the original series of steps as a template to create a general function calcps() that can apply the same steps to other variables. We replace TrustGovernment with an argument for a generic variable, var. Referring to var involves a bit of tidy evaluation, an advanced skill. To learn more, we recommend Wickham (2019). 
calcps &lt;- function(var) { anes_des %&gt;% drop_na(!!sym(var)) %&gt;% group_by(!!sym(var)) %&gt;% summarize(p = survey_prop() * 100) %&gt;% mutate(Variable = var) %&gt;% rename(Answer := !!sym(var)) %&gt;% select(Variable, everything()) } We then apply this function to the two variables of interest, TrustGovernment and TrustPeople: calcps(&quot;TrustGovernment&quot;) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 calcps(&quot;TrustPeople&quot;) ## # A tibble: 5 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustPeople Always 0.809 0.164 ## 2 TrustPeople Most of the time 41.4 0.857 ## 3 TrustPeople About half the time 28.2 0.776 ## 4 TrustPeople Some of the time 24.5 0.670 ## 5 TrustPeople Never 5.05 0.422 Finally, we use map() to iterate over as many variables as needed. We feed our desired variables into map() along with our custom function, calcps. The output is a tibble with the variable names in the “Variable” column, the responses in the “Answer” column, along with the percentage and standard error. The list_rbind() function combines the rows into a single tibble. This example extends nicely when dealing with numerous variables for which we want percentage estimates. 
c(&quot;TrustGovernment&quot;, &quot;TrustPeople&quot;) %&gt;% map(calcps) %&gt;% list_rbind() ## # A tibble: 10 × 4 ## Variable Answer p p_se ## &lt;chr&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 TrustGovernment Always 1.55 0.204 ## 2 TrustGovernment Most of the time 13.2 0.553 ## 3 TrustGovernment About half the time 30.9 0.829 ## 4 TrustGovernment Some of the time 43.4 0.855 ## 5 TrustGovernment Never 11.0 0.566 ## 6 TrustPeople Always 0.809 0.164 ## 7 TrustPeople Most of the time 41.4 0.857 ## 8 TrustPeople About half the time 28.2 0.776 ## 9 TrustPeople Some of the time 24.5 0.670 ## 10 TrustPeople Never 5.05 0.422 In addition to our results above, we can also see the output for TrustPeople. While we estimate that 1.55% of people always trust the government, 0.81% always trust people. 5.10 Exercises The exercises use the design objects anes_des and recs_des provided in the Prerequisites box at the beginning of the chapter. How many females have a graduate degree? Hint: the variables Gender and Education will be useful. What percentage of people identify as “Strong Democrat”? Hint: The variable PartyID indicates someone’s party affiliation. What percentage of people who voted in the 2020 election identify as “Strong Republican”? Hint: The variable VotedPres2020 indicates whether someone voted in 2020. What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable VotedPres2016 indicates whether someone voted in 2016. What is the design effect for the proportion of people who voted early? Hint: The variable EarlyVote2020 indicates whether someone voted early in 2020. What is the median temperature people set their thermostats to at night during the winter? Hint: The variable WinterTempNight indicates the temperature that people set their thermostats to at night in the winter. People sometimes set their thermostats differently across seasons and times of day. 
What median temperatures do people set their thermostats to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables WinterTempDay, WinterTempNight, SummerTempDay, and SummerTempNight. What is the correlation between the temperatures that people set their thermostats to at night and during the day in the summer? What are the 1st, 2nd, and 3rd quartiles of money spent on energy by Building America (BA) climate zone? Hint: TOTALDOL indicates the total amount spent on all fuel, and ClimateRegion_BA indicates the BA climate zones. References Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ‘dplyr’-Like Syntax for Summary Statistics of Survey Data. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Shah, Babubhai V, and Akhil K Vaish. 2006. “Confidence Intervals for Quantile Estimation from Complex Survey Data.” In Proceedings of the Section on Survey Research Methods. http://www.asasrms.org/Proceedings/y2006/Files/JSM2006-000749.pdf. U.S. Energy Information Administration. 2023a. “2020 Residential Energy Consumption Survey: Consumption and Expenditures Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS%20CE%20Methodology_Final.pdf. Wickham, Hadley. 2019. Advanced R. CRC Press. https://adv-r.hadley.nz/. Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools. RECS has two components: a household survey and an energy supplier survey. For each household that responds, their energy provider(s) are contacted to obtain their energy consumption and expenditure. This value reflects the dollars spent on electricity in 2020, according to the energy supplier. See U.S. 
Energy Information Administration (2023a) for more details.↩︎ Question text: Is any air conditioning equipment used in your home?↩︎ The value of DOLLARLP reflects the annualized amount spent on liquid propane and BTULP reflects the annualized consumption in Btu of liquid propane.↩︎ Question text: What is the square footage of your home?↩︎ BTUEL is derived from the supplier side component of the survey where BTUEL represents the electricity consumption in British thermal units (Btus) converted from kilowatt hours (kWh) in a year.↩︎ BTUNG is derived from the supplier side component of the survey where BTUNG represents the natural gas consumption in British thermal units (Btus) in a year.↩︎ Question: How often can you trust the federal government in Washington to do what is right? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)?↩︎ Question: Generally speaking, how often can you trust other people? (Always, most of the time, about half the time, some of the time, or never / Never, some of the time, about half the time, most of the time, or always)? ↩︎ "],["c06-statistical-testing.html", "Chapter 6 Statistical testing 6.1 Introduction 6.2 Dot notation 6.3 Comparison of proportions and means 6.4 Chi-square tests 6.5 Exercises", " Chapter 6 Statistical testing Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(gt) library(prettyunits) We are using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information.) 
targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapters 4 and 10. recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 6.1 Introduction When analyzing survey results, the point estimates described in Chapter 5 help us understand the data at a high level. Still, we often want to make comparisons between different groups. These comparisons are calculated through statistical testing. The general idea of statistical testing is the same for data obtained through surveys and data obtained through other methods: we compare the point estimates and variance estimates of each statistic to see if statistically significant differences exist. However, statistical testing for complex surveys involves additional considerations because we need to account for the sampling design in order to obtain accurate variance estimates. Statistical testing, also called hypothesis testing, involves declaring a null and an alternative hypothesis. The null hypothesis is denoted as \\(H_0\\) and the alternative hypothesis is denoted as \\(H_A\\). The null hypothesis is the default assumption that there are no differences in the data, or that the data are operating under “standard” behaviors. The alternative hypothesis, on the other hand, is the break from the “standard,” and we are trying to determine whether the data support it. Let’s review an example outside of survey data. If we are flipping a coin, a null hypothesis would be that the coin is fair and that each side has an equal chance of landing face up. In other words, the probability of the coin landing on each side is 1/2. 
In contrast, an alternative hypothesis could be that the coin is unfair and that one side has a higher probability of landing face up (e.g., a probability of 1/4 to get heads but a probability of 3/4 to get tails). We write this set of hypotheses as: \\(H_0: \\rho_{heads} = \\rho_{tails}\\), where \\(\\rho_{x}\\) is the probability of flipping the coin and having it land on heads (\\(\\rho_{heads}\\)) or tails (\\(\\rho_{tails}\\)); \\(H_A: \\rho_{heads} \\neq \\rho_{tails}\\). When we conduct hypothesis testing, the statistical models calculate a p-value, which shows how likely we are to observe the data if the null hypothesis is true. If the p-value (a probability between 0 and 1) is small, we have strong evidence to reject the null hypothesis, as it is unlikely that we would see the data we observe if the null hypothesis were true. However, if the p-value is large, we say we do not have evidence to reject the null hypothesis. The cut-off for the p-value is determined by the Type 1 error rate, known as \\(\\alpha\\). A common Type 1 error value for statistical testing is \\(\\alpha = 0.05\\).12 It is common for explanations of statistical testing to refer to the confidence level. The confidence level is the complement of the Type 1 error rate (\\(1 - \\alpha\\)). Thus, if \\(\\alpha = 0.05\\), the confidence level would be 95%. The functions in the {survey} package allow for the correct estimation of the variances. This chapter covers the following statistical tests with survey data and the following functions from the {survey} package (Lumley 2010): Comparison of proportions (svyttest()); Comparison of means (svyttest()); Goodness of fit tests (svygofchisq()); Tests of independence (svychisq()); Tests of homogeneity (svychisq()). 6.2 Dot notation Up to this point, we have shown functions that use wrappers from the {srvyr} package. This means that the functions work with tidyverse syntax. 
However, the functions in this chapter do not have wrappers in the {srvyr} package and are instead used directly from the {survey} package. Therefore, the design object is not the first argument, and to use these functions with the magrittr pipe (%&gt;%) and tidyverse syntax, we need to use dot (.) notation.13 Functions that work with the magrittr pipe (%&gt;%) have the dataset as the first argument. When we run a function with the pipe, it automatically places anything to the left of the pipe into the first argument of the function to the right of the pipe. For example, if we wanted to take the towny data from the {gt} package and filter to municipalities with the Census Subdivision Type of “city”, we can write the code in at least four different ways: filter(towny, csd_type == \"city\") towny %&gt;% filter(csd_type == \"city\") towny %&gt;% filter(., csd_type == \"city\") towny %&gt;% filter(.data = ., csd_type == \"city\") Each of these lines of code produces the same output since the argument that takes the dataset is in the first spot in filter(). The first two are probably familiar to those who have worked with the tidyverse. The third option functions the same way as the second one but is explicit that towny goes into the first argument, and the fourth option indicates that towny is going into the named argument of .data. Here, we are telling R to take what is on the left side of the pipe (towny) and pipe it into the spot with the dot (.)—the first argument. In functions that are not part of the tidyverse, the data argument may not be in the first spot. For example, in svyttest(), the data argument is in the second spot, which means we need to place the dot (.) in the second spot and not the first. For example: svydata_des %&gt;% svyttest(x ~ y, .) By default, the pipe places the left-hand object in the first argument spot. Placing the dot (.) 
in the second argument spot indicates that the survey design object svydata_des should be used in the second argument and not the first. Alternatively, named arguments could be used to place the dot first as named arguments can appear at any location, as in the following: svydata_des %&gt;% svyttest(design = ., x ~ y) However, the following code does not work as the svyttest() function expects the formula as the first argument when arguments are not named: svydata_des %&gt;% svyttest(., x ~ y) 6.3 Comparison of proportions and means We use t-tests to compare two proportions or means. T-tests allow us to determine if one proportion or mean is statistically different from another. They are commonly used to determine if a single estimate differs from a known value (e.g., 0 or 50%) or to compare two group means (e.g., North versus South.) Comparing a single estimate to a known value is called a one sample t-test, and we can set up the hypothesis test as follows: \\(H_0: \\mu = 0\\) where \\(\\mu\\) is the mean outcome and \\(0\\) is the value we are comparing it to \\(H_A: \\mu \\neq 0\\) For comparing two estimates, this is called a two-sample t-test. We can set up the hypothesis test as follows: \\(H_0: \\mu_1 = \\mu_2\\) where \\(\\mu_i\\) is the mean outcome for group \\(i\\) \\(H_A: \\mu_1 \\neq \\mu_2\\) Two sample t-tests can also be paired or unpaired. If the data come from two different populations (e.g., North versus South), the t-test run is an unpaired or independent samples t-test. Paired t-tests occur when the data come from the same population. This is commonly seen with data from the same population in two different time periods (e.g., before and after an intervention.) The difference between t-tests with non-survey data and survey data is based on the underlying variance estimation difference. Chapter 10 provides a detailed overview of the math behind the mean and sampling error calculations for various sample designs. 
The functions in the {survey} package account for these nuances, provided the design object is correctly defined. 6.3.1 Syntax When we do not have survey data, we can use the t.test() function from the {stats} package. This function does not allow for the weights or the variance structure that need to be accounted for with survey data. Therefore, we need to use the svyttest() function from {survey} when using survey data. Many of the arguments are the same between the two functions, but there are a few key differences: We need to use the survey design object instead of the original data frame. We can only use a formula and not separate x and y data. The confidence level cannot be specified in svyttest() and is always set to 95%. However, we show examples of how to obtain other confidence levels after running the svyttest() function by using the confint() function. Here is the syntax for the svyttest() function: svyttest(formula, design, ...) The arguments are: formula: Formula, outcome~group for two-sample, outcome~0 or outcome~1 for one-sample. The group variable must be a factor or character with two levels, or be coded 0/1 or 1/2. We give more details on formula set-up below for different types of tests. design: survey design object. ...: This passes options on for one-sided tests only, and thus, we can specify na.rm=TRUE. Notice that the first argument here is the formula and not the design. This means we must use the dot (.) if we pipe in the survey design object (as described in Section 6.2). The formula argument can take several different forms depending on what we are measuring. Here are a few common scenarios: One-sample t-test: Comparison to 0: var ~ 0, where var is the measure of interest, and we compare it to the value 0. For example, we could test if the population mean of household debt is different from 0 given the sample data collected. Comparison to a different value: var - value ~ 0, where var is the measure of interest and value is what we are comparing to. 
For example, we could test if the proportion of the population that has blue eyes is different from 25% by using var - 0.25 ~ 0. Note that specifying the formula as var ~ 0.25 is not equivalent and results in a syntax error. Two-sample t-test: Unpaired: 2 level grouping variable: var ~ groupVar, where var is the measure of interest and groupVar is a variable with two categories. For example, we could test if the average age of the population who voted for president in 2020 differed from the age of people who did not vote. In this case, age would be used for var, and a binary variable indicating voting activity would be the groupVar. 3+ level grouping variable: var ~ groupVar == level, where var is the measure of interest, groupVar is the categorical variable, and level is the category level to isolate. For example, we could test if the test scores in one classroom differed from all other classrooms where groupVar would be the variable holding the values for classroom IDs and level is the classroom ID we want to compare to the others. Paired: var_1 - var_2 ~ 0, where var_1 is the first variable of interest and var_2 is the second variable of interest. For example, we could test if test scores on a subject differed between the start and the end of a course, so var_1 would be the test score at the beginning of the course, and var_2 would be the score at the end of the course. The na.rm argument defaults to FALSE, which means if any data values are missing, the t-test does not compute. Throughout this chapter, we always set na.rm = TRUE, but before analyzing the survey data, review the notes provided in Chapter 11 to better understand how to handle missing data. Let’s walk through a few examples using the ANES and RECS data. 6.3.2 Examples Example 1: One-sample t-test for mean RECS asks respondents to indicate what temperature they set their house to during the summer at night.14 In our data, we have called this variable SummerTempNight. 
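The formula forms listed in Section 6.3.1 can be tried out on a small public dataset before turning to RECS. The sketch below is only an illustration under stated assumptions: it uses the api example data that ships with the {survey} package rather than the book's data, the comparison value of 600 is an arbitrary, hypothetical choice, and the expressions on the left-hand side are wrapped in I(), as in the svyttest() help page.

```r
library(survey)

# Example data shipped with {survey}: a cluster sample of California schools.
# This is NOT the RECS or ANES data used in the chapter.
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# One-sample, comparison to a value: var - value ~ 0
# (600 is a hypothetical comparison value chosen for illustration)
tt_value <- svyttest(I(api00 - 600) ~ 0, design = dclus1)

# Two-sample, unpaired: var ~ groupVar, where comp.imp has two levels
tt_group <- svyttest(enroll ~ comp.imp, design = dclus1)

# Paired: var_1 - var_2 ~ 0, comparing two scores for the same schools
tt_paired <- svyttest(I(api00 - api99) ~ 0, design = dclus1)

# Each result is an htest object; pieces are extracted with $
tt_paired$estimate
tt_paired$p.value
```

Passing the design by name (design = dclus1) mirrors the design = . pattern used with the pipe throughout this chapter.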
If we want to see if the average U.S. household sets its temperature at a value different from 68\\(^\\circ\\)F15, we could set up the hypothesis as follows: \\(H_0: \\mu = 68\\) where \\(\\mu\\) is the average temperature U.S. households set their thermostat to in the summer at night \\(H_A: \\mu \\neq 68\\) To conduct this in R, we use svyttest() and subtract the temperature on the left-hand side of the formula: ttest_ex1 &lt;- recs_des %&gt;% svyttest( formula = SummerTempNight - 68 ~ 0, design = ., na.rm = TRUE ) ttest_ex1 ## ## Design-based one-sample t-test ## ## data: SummerTempNight - 68 ~ 0 ## t = 85, df = 58, p-value &lt;2e-16 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## 3.288 3.447 ## sample estimates: ## mean ## 3.367 To pull out specific output, we can use R’s built-in $ operator. For instance, to obtain the estimate \\(\\mu - 68\\), we run ttest_ex1$estimate. If we want the average, we take our t-test estimate and add it to 68: ttest_ex1$estimate + 68 ## mean ## 71.37 Or, we can use the survey_mean() function described in Chapter 5: recs_des %&gt;% summarize(mu = survey_mean(SummerTempNight, na.rm = TRUE)) ## # A tibble: 1 × 2 ## mu mu_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 71.4 0.0397 The result is the same in both methods, so we see that the average temperature U.S. households set their thermostat to in the summer at night is 71.4\\(^\\circ\\)F. Looking at the output from svyttest(), the t-statistic is 84.8, and the p-value is \\(&lt;0.0001\\), indicating that the average is statistically different from 68\\(^\\circ\\)F at an \\(\\alpha\\) level of \\(0.05\\). If we want an 80% confidence interval for the test statistic, we can use the function confint() to change the confidence level. Below, we print the default confidence interval (95%), the confidence interval explicitly specifying the level as 95%, and the 80% confidence interval. 
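As a rough check on where these intervals come from, the 95 percent confidence interval in the svyttest() output above can be reproduced by hand. This is a minimal sketch in base R, assuming the interval is a standard two-sided t interval built from the estimate, its standard error, and the degrees of freedom. The inputs are the rounded values printed above, so the results agree only approximately, and the helper t_interval() is a hypothetical name introduced here.

```r
# Rounded values copied from the output above
est <- 3.367   # estimate of mean(SummerTempNight) - 68
se  <- 0.0397  # standard error reported by survey_mean()
df  <- 58      # degrees of freedom reported by svyttest()

# Two-sided t interval for a given confidence level
# (hypothetical helper, not part of {survey})
t_interval <- function(level) {
  alpha <- 1 - level
  est + c(-1, 1) * qt(1 - alpha / 2, df) * se
}

ci95 <- t_interval(0.95)
ci80 <- t_interval(0.80)

round(ci95, 3)
round(ci80, 3)
```

Lowering the confidence level shrinks the t quantile, which is why the 80% interval is narrower than the 95% interval.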
The default confidence level is 95%, and when we specify this level, R returns a vector with both row and column names. However, when we specify any other confidence level, an unnamed vector is returned, with the first element being the lower bound and the second element being the upper bound of the confidence interval. confint(ttest_ex1) ## 2.5 % 97.5 % ## as.numeric(SummerTempNight - 68) 3.288 3.447 ## attr(,&quot;conf.level&quot;) ## [1] 0.95 confint(ttest_ex1, level = 0.95) ## 2.5 % 97.5 % ## as.numeric(SummerTempNight - 68) 3.288 3.447 ## attr(,&quot;conf.level&quot;) ## [1] 0.95 confint(ttest_ex1, level = 0.8) ## [1] 3.316 3.419 ## attr(,&quot;conf.level&quot;) ## [1] 0.8 In this case, neither confidence interval contains 0, and we draw the same conclusion from either that the average temperature households set their thermostat in the summer at night is significantly higher than 68\\(^\\circ\\)F. Example 2: One-sample t-test for proportion RECS asked respondents if they use air conditioning (A/C) in their home.16 In our data, we call this variable ACUsed. Let’s look at the proportion of U.S. households that use A/C in their homes using the survey_prop() function we learned in Chapter 5. acprop &lt;- recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(p = survey_prop()) acprop ## # A tibble: 2 × 3 ## ACUsed p p_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 0.113 0.00306 ## 2 TRUE 0.887 0.00306 Based on this, 88.7% of U.S. households use A/C in their homes. If we wanted to know if this differs from 90%, we could set up our hypothesis as follows: \\(H_0: p = 0.90\\) where \\(p\\) is the proportion of U.S. 
households that use A/C in their homes \\(H_A: p \\neq 0.90\\) To conduct this in R, we use the svyttest() function as follows: ttest_ex2 &lt;- recs_des %&gt;% svyttest( formula = (ACUsed == TRUE) - 0.90 ~ 0, design = ., na.rm = TRUE ) ttest_ex2 ## ## Design-based one-sample t-test ## ## data: (ACUsed == TRUE) - 0.9 ~ 0 ## t = -4.4, df = 58, p-value = 5e-05 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## -0.019603 -0.007348 ## sample estimates: ## mean ## -0.01348 The output from the svyttest() function can be a bit hard to read. Using the tidy() function from the {broom} package, we can clean up the output into a tibble to more easily understand what the test tells us (Robinson, Hayes, and Couch 2023). tidy(ttest_ex2) ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 -0.0135 -4.40 0.0000466 58 -0.0196 -0.00735 Design-base… ## # ℹ 1 more variable: alternative &lt;chr&gt; The ‘tidied’ output can also be piped into the {gt} package to create a table ready for publication. We go over the {gt} package in Chapter 8. The function pretty_p_value() comes from the {prettyunits} package and converts numeric p-values to characters and, by default, prints four decimal places and displays any p-value less than 0.0001 as \"&lt;0.0001\" though another minimum display p-value can be specified (Csardi 2023). 
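To see the p-value formatting on its own, the short sketch below applies pretty_p_value() to a few values, including the p-value from ttest_ex2. The argument name minval for the minimum display value is an assumption based on the {prettyunits} documentation and should be checked against the installed version; the other input values are made up for illustration.

```r
library(prettyunits)

# The first value is the p-value reported for ttest_ex2;
# the other two are made-up values for illustration
p_raw <- c(0.0000466, 0.0312, 0.5)

# Default formatting: character output with small values truncated
p_fmt <- pretty_p_value(p_raw)
p_fmt

# Assumed argument name: minval sets a different minimum display value
p_fmt_001 <- pretty_p_value(p_raw, minval = 0.001)
p_fmt_001
```

Because the result is a character vector, this conversion is best done as the last step before building a table.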
tidy(ttest_ex2) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() 
TABLE 6.1: One-sample t-test output for whether the proportion of U.S. households that use A/C in their homes differs from 90%, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative −0.01 −4.40 &lt;0.0001 58.00 −0.02 −0.01 Design-based one-sample t-test two.sided As in Example 1, the estimate is not the quantity of interest itself (here, the proportion \\(p\\)) but rather \\(p - 0.90\\), the difference between the proportion of U.S. households that use A/C and the proportion we are comparing to. We can see that there is a difference of -1.35 percentage points. Additionally, the t-statistic value in the statistic column is -4.4, and the p-value is &lt;0.0001. These results indicate that fewer than 90% of U.S. households use A/C in their homes. Example 3: Unpaired two-sample t-test Two additional variables in the RECS data are the electric bill cost (DOLLAREL) and whether the house used A/C or not (ACUsed).17 If we want to know if the U.S. households that used A/C had higher electrical bills compared to those that did not, we could set up the hypothesis as follows: \\(H_0: \\mu_{AC} = \\mu_{noAC}\\) where \\(\\mu_{AC}\\) is the electrical bill cost for U.S. households that used A/C and \\(\\mu_{noAC}\\) is the electrical bill cost for U.S. 
households that did not use A/C \\(H_A: \\mu_{AC} \\neq \\mu_{noAC}\\) Let’s take a quick look at the data to see the format the data are in: recs_des %&gt;% group_by(ACUsed) %&gt;% summarize(mean = survey_mean(DOLLAREL, na.rm = TRUE)) ## # A tibble: 2 × 3 ## ACUsed mean mean_se ## &lt;lgl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE 1056. 16.0 ## 2 TRUE 1422. 5.69 To conduct this in R, we use svyttest(): ttest_ex3 &lt;- recs_des %&gt;% svyttest(formula = DOLLAREL ~ ACUsed, design = ., na.rm = TRUE) tidy(ttest_ex3) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() 
border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #tbdsbanjdr .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #tbdsbanjdr .gt_left { text-align: left; } #tbdsbanjdr .gt_center { text-align: center; } #tbdsbanjdr .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #tbdsbanjdr .gt_font_normal { font-weight: normal; } #tbdsbanjdr .gt_font_bold { font-weight: bold; } #tbdsbanjdr .gt_font_italic { font-style: italic; } #tbdsbanjdr .gt_super { font-size: 65%; } #tbdsbanjdr .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #tbdsbanjdr .gt_asterisk { font-size: 100%; vertical-align: 0; } #tbdsbanjdr .gt_indent_1 { text-indent: 5px; } #tbdsbanjdr .gt_indent_2 { text-indent: 10px; } #tbdsbanjdr .gt_indent_3 { text-indent: 15px; } #tbdsbanjdr .gt_indent_4 { text-indent: 20px; } #tbdsbanjdr .gt_indent_5 { text-indent: 25px; } TABLE 6.2: Unpaired two-sample t-test output for estimates of U.S. households electrical bills by A/C use, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative 365.72 21.29 &lt;0.0001 58.00 331.33 400.11 Design-based t-test two.sided The results indicate that the difference in electrical bills for those who used A/C and those who did not is, on average, $365.72. The difference appears to be statistically significant as the t-statistic is 21.3 and the p-value is \\(&lt;0.0001\\). Households that used A/C spent, on average, $365.72 more in 2020 on electricity than households without A/C. Example 4: Paired two-sample t-test Let’s say we want to test whether the temperature at which U.S. households set their thermostat at night differs depending on the season (comparing summer18 and winter19 temperatures.) We could set up the hypothesis as follows: \\(H_0: \\mu_{summer} = \\mu_{winter}\\) where \\(\\mu_{summer}\\) is the temperature that U.S. 
households set their thermostat to during summer nights, and \\(\\mu_{winter}\\) is the temperature that U.S. households set their thermostat to during winter nights \\(H_A: \\mu_{summer} \\neq \\mu_{winter}\\) To conduct this in R, we use svyttest() by calculating the temperature difference on the left-hand side as follows: ttest_ex4 &lt;- recs_des %&gt;% svyttest( design = ., formula = SummerTempNight - WinterTempNight ~ 0, na.rm = TRUE ) tidy(ttest_ex4) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() TABLE 6.3: Paired two-sample t-test output for estimates of U.S. households thermostat temperature by season, RECS 2020 estimate statistic p.value parameter conf.low conf.high method alternative 2.85 50.83 &lt;0.0001 58.00 2.74 2.96 Design-based one-sample t-test two.sided U.S. households set their thermostat on average 2.9\\(^\\circ\\)F warmer in summer nights than winter nights, which is statistically significant (t = 50.8, p-value = \\(&lt;0.0001\\).) 6.4 Chi-square tests Chi-square tests (\\(\\chi^2\\)) allow us to examine multiple proportions using a goodness-of-fit test, a test of independence, or a test of homogeneity. These three tests have the same \\(\\chi^2\\) distributions but with slightly different underlying assumptions. First, goodness-of-fit tests are used when comparing observed data to expected data. For example, this could be used to determine if respondent demographics (the observed data in the sample) match known population information (the expected data.) 
In this case, we can set up the hypothesis test as follows: \\(H_0: p_1 = \\pi_1, ~ p_2 = \\pi_2, ~ ..., ~ p_k = \\pi_k\\) where \\(p_i\\) is the observed proportion for category \\(i\\), \\(\\pi_i\\) is the expected proportion for category \\(i\\), and \\(k\\) is the number of categories \\(H_A:\\) at least one level of \\(p_i\\) does not match \\(\\pi_i\\) Second, tests of independence are used when comparing two types of observed data to see if there is a relationship. For example, this could be used to determine if the proportion of respondents who voted for each political party in the presidential election matches the proportion of respondents who voted for each political party in a local election. In this case, we can set up the hypothesis test as follows: \\(H_0:\\) The two variables/factors are independent \\(H_A:\\) The two variables/factors are not independent Third, tests of homogeneity are used to compare two distributions to see if they match. For example, this could be used to determine if the highest education achieved is the same for both men and women. In this case, we can set up the hypothesis test as follows: \\(H_0: p_{1a} = p_{1b}, ~ p_{2a} = p_{2b}, ~ ..., ~ p_{ka} = p_{kb}\\) where \\(p_{ia}\\) is the observed proportion of category \\(i\\) for subgroup \\(a\\), \\(p_{ib}\\) is the observed proportion of category \\(i\\) for subgroup \\(b\\), and \\(k\\) is the number of categories \\(H_A:\\) at least one category of \\(p_{ia}\\) does not match \\(p_{ib}\\) As with t-tests, the difference between using \\(\\chi^2\\) tests with non-survey data and survey data is based on the underlying variance estimation. The functions in the {survey} package account for these nuances, provided the design object is correctly defined. For basic variance estimation formulas for different survey design types, refer to Chapter 10. 
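For reference, the goodness-of-fit hypotheses above correspond to the classical Pearson statistic shown below. This is a sketch for the non-survey case only, assuming a simple random sample of size n (a symbol introduced here for illustration); it is not the design-based statistic that the {survey} functions compute, which further adjusts for the sampling design.

```latex
\[
X^2 \;=\; \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
    \;=\; n \sum_{i=1}^{k} \frac{(p_i - \pi_i)^2}{\pi_i},
\qquad O_i = n\,p_i, \quad E_i = n\,\pi_i
\]
% Under H_0 and simple random sampling, X^2 is compared with a
% chi-square distribution with k - 1 degrees of freedom.
```

Here the observed and expected counts are the sample size multiplied by the observed and expected proportions defined in the hypotheses above.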
6.4.1 Syntax When we do not have survey data, we may be able to use the chisq.test() function from the {stats} package in base R (R Core Team 2023). However, this function does not allow for weights or the variance structure to be accounted for with survey data. Therefore, when using survey data, we need to use one of two functions: svygofchisq(): For goodness of fit tests svychisq(): For tests of independence and homogeneity The non-survey data function of chisq.test() requires either a single set of counts and given proportions (for goodness of fit tests) or two sets of counts for tests of independence and homogeneity. The functions we use with survey data require respondent-level data and formulas instead of counts. This ensures that the variances are correctly calculated. First, the function for the goodness of fit tests is svygofchisq(): svygofchisq(formula, p, design, na.rm = TRUE, ...) The arguments are: formula: Formula specifying a single factor variable p: Vector of probabilities for the categories of the factor in the correct order. If the probabilities do not sum to 1, they are rescaled to sum to 1. design: Survey design object …: Other arguments to pass on, such as na.rm Based on the order of the arguments, we again must use the dot (.) notation if we pipe in the survey design object or explicitly name the arguments as described in Section 6.2. For the goodness of fit tests, the formula is a single variable formula = ~var as we compare the observed data from this variable to the expected data. The expected probabilities are then entered in the p argument and need to be a vector of the same length as the number of categories in the variable. For example, if we want to know if the proportion of males and females matches a distribution of 30/70, then the sex variable (with two categories) would be used formula = ~SEX, and the proportions would be included as p = c(.3, .7). 
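As a minimal sketch of the goodness-of-fit syntax described above, the code below runs svygofchisq() on the `apiclus1` example data that ship with the {survey} package rather than on the book's ANES or RECS data. The design object `dclus1` and the expected shares in `expected_p` are illustrative assumptions, and the sketch assumes a version of {survey} recent enough to include svygofchisq().

```r
library(survey)

# Example data shipped with {survey}: a cluster sample of California schools
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# The order of `p` must follow the order of the factor levels
levels(apiclus1$stype)

# Hypothetical expected shares, one per level of stype, in level order
expected_p <- c(0.70, 0.15, 0.15)

# Arguments are named, so their order does not matter
gof <- svygofchisq(formula = ~stype, p = expected_p, design = dclus1)
gof
```

Checking levels() before writing `p` is one way to confirm that each expected proportion lines up with the intended category.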
The variable entered into the formula should be formatted as either a factor or a character. The examples below provide more detail and tips on how to make sure the levels match up correctly. For tests of homogeneity and independence, the svychisq() function should be used. The syntax is as follows: svychisq( formula, design, statistic = c(&quot;F&quot;, &quot;Chisq&quot;, &quot;Wald&quot;, &quot;adjWald&quot;, &quot;lincom&quot;, &quot;saddlepoint&quot;), na.rm = TRUE ) The arguments are: formula: Model formula specifying the table (shown in examples) design: Survey design object statistic: Type of test statistic to use in test (details below) na.rm: Remove missing values Six statistics are accepted by the statistic argument. For tests of homogeneity (when comparing cross-tabulations), the F or Chisq statistics should be used.20 The F statistic is the default and uses the Rao-Scott second-order correction. This correction is designed to assist with complicated sampling designs (i.e., those other than a simple random sample) (Scott 2007). The Chisq statistic is an adjusted version of the Pearson \\(\\chi^2\\) statistic. The version of this statistic in the svychisq() function compares the design effect estimate from the provided survey data to what the \\(\\chi^2\\) distribution would have been if the data came from a simple random sample. For tests of independence, the Wald and adjWald statistics are recommended as they provide a better adjustment for variable comparisons (Lumley 2010). If the data have a small number of primary sampling units (PSUs) compared to the degrees of freedom, then the adjWald statistic should be used to account for this. The lincom and saddlepoint statistics are available for more complicated data structures. The formula argument is always one-sided, unlike the svyttest() function. The two variables of interest should be included with a plus sign: formula = ~ var_1 + var_2. 
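To illustrate the one-sided formula and the statistic argument described above, here is a small sketch using the `apiclus1` example data from the {survey} package instead of the book's ANES data. The design object `dclus1` and the variables `sch.wide` and `stype` come from that example dataset and are assumptions for illustration only.

```r
library(survey)

# Example data shipped with {survey}: a cluster sample of California schools
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Default statistic ("F"): Rao-Scott second-order correction
test_f <- svychisq(formula = ~ sch.wide + stype, design = dclus1)

# Adjusted Pearson chi-square statistic
test_chisq <- svychisq(formula = ~ sch.wide + stype, design = dclus1,
                       statistic = "Chisq")

# Adjusted Wald statistic, intended for designs with few PSUs
test_adjwald <- svychisq(formula = ~ sch.wide + stype, design = dclus1,
                         statistic = "adjWald")

# Collect the p-values side by side for comparison
p_values <- c(
  F = as.numeric(test_f$p.value),
  Chisq = as.numeric(test_chisq$p.value),
  adjWald = as.numeric(test_adjwald$p.value)
)
p_values
```

Because this example design has only a small number of clusters, the three statistics may give noticeably different p-values, which is the situation the adjWald guidance above is meant to address.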
As with the svygofchisq() function, the variables entered into the formula should be formatted as either a factor or a character. Additionally, as with the t-test function, both svygofchisq() and svychisq() have the na.rm argument. If any data values are missing, the \\(\\chi^2\\) tests assume that NA is a category and include it in the calculation. Throughout this chapter, we always set na.rm = TRUE, but before analyzing the survey data, review the notes provided in Chapter 11 to better understand how to handle missing data. 6.4.2 Examples Let’s walk through a few examples using the ANES data. Example 1: Goodness of fit test ANES asked respondents about their highest education level.21 Based on the data from the 2020 American Community Survey (ACS) 5-year estimates22, the education distribution of those aged 18+ in the United States (among the 50 states and District of Columbia) is as follows: 11% had less than a High School degree 27% had a High School degree 29% had some college or associate’s degree 33% had a bachelor’s degree or higher If we want to see if the weighted distribution from the ANES 2020 data matches this distribution, we could set up the hypothesis as follows: \\(H_0: p_1 = 0.11, ~ p_2 = 0.27, ~ p_3 = 0.29, ~ p_4 = 0.33\\) \\(H_A:\\) at least one of the education levels does not match between the ANES and the ACS To conduct this in R, let’s first look at the education variable (Education) we have on the ANES data. Using the survey_mean() function discussed in Chapter 5, we can see the education levels and estimated proportions. anes_des %&gt;% drop_na(Education) %&gt;% group_by(Education) %&gt;% summarize(p = survey_mean()) ## # A tibble: 5 × 3 ## Education p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.0805 0.00568 ## 2 High school 0.277 0.0102 ## 3 Post HS 0.290 0.00713 ## 4 Bachelor&#39;s 0.226 0.00633 ## 5 Graduate 0.126 0.00499 Based on this output, we can see that we have different levels from the ACS data. 
Specifically, the education data from ANES include two levels for Bachelor’s Degree or Higher (Bachelor’s and Graduate), so these two categories need to be collapsed into a single category to match the ACS data. For this, among other methods, we can use the {forcats} package from the tidyverse (Wickham 2023a). The package’s fct_collapse() function helps us create a new variable by collapsing categories into a single one. Then, we use the svygofchisq() function to compare the ANES data to the ACS data, where we specify the updated design object, the formula using the collapsed education variable, the ACS estimates for education levels as p, and removing NA values. anes_des_educ &lt;- anes_des %&gt;% mutate(Education2 = fct_collapse(Education, &quot;Bachelor or Higher&quot; = c(&quot;Bachelor&#39;s&quot;, &quot;Graduate&quot;))) anes_des_educ %&gt;% drop_na(Education2) %&gt;% group_by(Education2) %&gt;% summarize(p = survey_mean()) ## # A tibble: 4 × 3 ## Education2 p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.0805 0.00568 ## 2 High school 0.277 0.0102 ## 3 Post HS 0.290 0.00713 ## 4 Bachelor or Higher 0.352 0.00732 chi_ex1 &lt;- anes_des_educ %&gt;% svygofchisq( formula = ~ Education2, p = c(0.11, 0.27, 0.29, 0.33), design = ., na.rm = TRUE ) chi_ex1 ## ## Design-based chi-squared test for given probabilities ## ## data: ~Education2 ## X-squared = 2172220, scale = 1.1e+05, df = 2.3e+00, p-value = ## 9e-05 The output from the svygofchisq() indicates that at least one proportion from ANES does not match the ACS data (\\(\\chi^2 =\\) 2,172,220; p-value &lt;0.0001.) 
To get a better idea of the differences, we can use the expected output along with survey_mean() to create a comparison table: ex1_table &lt;- anes_des_educ %&gt;% drop_na(Education2) %&gt;% group_by(Education2) %&gt;% summarize(Observed = survey_mean(vartype = &quot;ci&quot;)) %&gt;% rename(Education = Education2) %&gt;% mutate(Expected=c(0.11, 0.27, 0.29, 0.33)) %&gt;% select(Education, Expected, everything()) ex1_table ## # A tibble: 4 × 5 ## Education Expected Observed Observed_low Observed_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Less than HS 0.11 0.0805 0.0691 0.0919 ## 2 High school 0.27 0.277 0.257 0.298 ## 3 Post HS 0.29 0.290 0.276 0.305 ## 4 Bachelor or Higher 0.33 0.352 0.337 0.367 This output includes our expected proportions from the ACS that we provided the svygofchisq() function along with the output of the observed proportions and their confidence intervals. This table shows that the “High school” and “Post HS” categories have nearly identical proportions but that the other two categories are slightly different. Looking at the confidence intervals, we can see that the ANES data skew to include fewer people in the “Less than HS” category and more people in the “Bachelor or Higher” category. This may be easier to see if we plot this. The code below uses the tabular output to create Figure 6.1. 
ex1_table %&gt;% pivot_longer( cols = c(&quot;Expected&quot;, &quot;Observed&quot;), names_to = &quot;Names&quot;, values_to = &quot;Proportion&quot; ) %&gt;% mutate( Observed_low = if_else(Names == &quot;Observed&quot;, Observed_low, NA_real_), Observed_upp = if_else(Names == &quot;Observed&quot;, Observed_upp, NA_real_), Names = if_else(Names == &quot;Observed&quot;, &quot;ANES (observed)&quot;, &quot;ACS (expected)&quot;) ) %&gt;% ggplot(aes(x = Education, y = Proportion, color = Names)) + geom_point(alpha = 0.75, size = 2) + geom_errorbar(aes(ymin = Observed_low, ymax = Observed_upp), width = 0.25) + theme_bw() + scale_color_manual(name = &quot;Type&quot;, values = book_colors[c(4, 1)]) + theme(legend.position = &quot;bottom&quot;, legend.title = element_blank()) FIGURE 6.1: Expected and observed proportions of education with confidence intervals Example 2: Test of independence ANES asked respondents two questions about trust: How often can you trust the federal government to do what is right? How often can you trust other people? If we want to see if the distributions of these two questions are similar or not, we can conduct a test of independence. 
Here is how the hypothesis could be set up: \\(H_0:\\) People’s trust in the federal government and their trust in other people are independent (i.e., not related) \\(H_A:\\) People’s trust in the federal government and their trust in other people are not independent (i.e., they are related) To conduct this in R, we use the svychisq() function to compare the two variables: chi_ex2 &lt;- anes_des %&gt;% svychisq( formula = ~ TrustGovernment + TrustPeople, design = ., statistic = &quot;Wald&quot;, na.rm = TRUE ) chi_ex2 ## ## Design-based Wald test of association ## ## data: NextMethod() ## F = 21, ndf = 16, ddf = 51, p-value &lt;2e-16 The output from svychisq() indicates that the distribution of people’s trust in the federal government and their trust in other people are not independent, meaning that they are related. Let’s output the distributions in a table to see the relationship. The observed output from the test provides a cross-tabulation of the counts for each category: chi_ex2$observed ## TrustPeople ## TrustGovernment Always Most of the time About half the time ## Always 16.470 25.009 31.848 ## Most of the time 11.020 539.377 196.258 ## About half the time 11.772 934.858 861.971 ## Some of the time 17.007 1353.779 839.863 ## Never 3.174 236.785 174.272 ## TrustPeople ## TrustGovernment Some of the time Never ## Always 36.854 5.523 ## Most of the time 206.556 27.184 ## About half the time 428.871 65.024 ## Some of the time 932.628 89.596 ## Never 217.994 189.307 However, we often want to know about the proportions, not just the respondent counts from the survey. There are a couple of different ways that we can do this. The first is using the counts from chi_ex2$observed to calculate the proportion. We can then pivot the table to create a cross-tabulation similar to the counts table above. Adding group_by() to the code means that we obtain the proportions within each variable level. 
In this case, we are looking at the distribution of TrustGovernment for each level of TrustPeople. The resulting table is shown in Table 6.4. chi_ex2_table&lt;-chi_ex2$observed %&gt;% as_tibble() %&gt;% group_by(TrustPeople) %&gt;% mutate(prop = round(n / sum(n), 3)) %&gt;% select(-n) %&gt;% pivot_wider(names_from = TrustPeople, values_from = prop) %&gt;% gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;% tab_stubhead(label = &quot;Trust in Government&quot;) %&gt;% tab_spanner(label = &quot;Trust in People&quot;, columns = everything()) %&gt;% cols_label(`Most of the time` = md(&quot;Most of&lt;br /&gt;the time&quot;), `About half the time` = md(&quot;About half&lt;br /&gt;the time&quot;), `Some of the time` = md(&quot;Some of&lt;br /&gt;the time&quot;)) chi_ex2_table #vxngdmvipb 
.gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #vxngdmvipb .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #vxngdmvipb .gt_left { text-align: left; } #vxngdmvipb .gt_center { text-align: center; } #vxngdmvipb .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #vxngdmvipb .gt_font_normal { font-weight: normal; } #vxngdmvipb .gt_font_bold { font-weight: bold; } #vxngdmvipb .gt_font_italic { font-style: italic; } #vxngdmvipb .gt_super { font-size: 65%; } #vxngdmvipb .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #vxngdmvipb .gt_asterisk { font-size: 100%; vertical-align: 0; } #vxngdmvipb .gt_indent_1 { text-indent: 5px; } #vxngdmvipb .gt_indent_2 { text-indent: 10px; } #vxngdmvipb .gt_indent_3 { text-indent: 15px; } #vxngdmvipb .gt_indent_4 { text-indent: 20px; } #vxngdmvipb .gt_indent_5 { text-indent: 25px; } TABLE 6.4: Proportion of adults in the U.S. by levels of trust in people and government, ANES 2020 Trust in Government Trust in People Always Most ofthe time About halfthe time Some ofthe time Never Always 0.277 0.008 0.015 0.020 0.015 Most of the time 0.185 0.175 0.093 0.113 0.072 About half the time 0.198 0.303 0.410 0.235 0.173 Some of the time 0.286 0.438 0.399 0.512 0.238 Never 0.053 0.077 0.083 0.120 0.503 In Table 6.4, each column sums to 1. For example, we can say that it is estimated that of people who always trust in people, 27.7% also always trust in the government based on the top-left cell, but 5.3% never trust in the government. The second option is to use the group_by() and survey_mean() functions to calculate the proportions from the ANES design object. 
Remember that with more than one variable listed in the group_by() statement, the proportions are within the first variable listed. As mentioned above, we are looking at the distribution of TrustGovernment for each level of TrustPeople.

chi_ex2_obs &lt;- anes_des %&gt;%
  drop_na(TrustPeople, TrustGovernment) %&gt;%
  group_by(TrustPeople, TrustGovernment) %&gt;%
  summarize(
    Observed = round(survey_mean(vartype = &quot;ci&quot;), 3),
    .groups = &quot;drop&quot;
  )

chi_ex2_obs_table &lt;- chi_ex2_obs %&gt;%
  mutate(prop = paste0(
    Observed, &quot; (&quot;, Observed_low, &quot;, &quot;, Observed_upp, &quot;)&quot;
  )) %&gt;%
  select(TrustGovernment, TrustPeople, prop) %&gt;%
  pivot_wider(names_from = TrustPeople, values_from = prop) %&gt;%
  gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;%
  tab_stubhead(label = &quot;Trust in Government&quot;) %&gt;%
  tab_spanner(label = &quot;Trust in People&quot;, columns = everything()) %&gt;%
  tab_options(page.orientation = &quot;landscape&quot;)

chi_ex2_obs_table

TABLE 6.5: Proportion of adults in the U.S. by levels of trust in people and government with confidence intervals, ANES 2020

                       Trust in People
Trust in Government    Always                  Most of the time       About half the time    Some of the time       Never
Always                 0.277 (0.11, 0.444)     0.008 (0.004, 0.012)   0.015 (0.006, 0.024)   0.02 (0.008, 0.033)    0.015 (0, 0.029)
Most of the time       0.185 (-0.009, 0.38)    0.175 (0.157, 0.192)   0.093 (0.078, 0.109)   0.113 (0.085, 0.141)   0.072 (0.021, 0.123)
About half the time    0.198 (0.046, 0.35)     0.303 (0.281, 0.324)   0.41 (0.378, 0.441)    0.235 (0.2, 0.271)     0.173 (0.099, 0.246)
Some of the time       0.286 (0.069, 0.503)    0.438 (0.415, 0.462)   0.399 (0.365, 0.433)   0.512 (0.481, 0.543)   0.238 (0.178, 0.298)
Never                  0.053 (-0.01, 0.117)    0.077 (0.064, 0.089)   0.083 (0.063, 0.103)   0.12 (0.097, 0.142)    0.503 (0.422, 0.583)

Both methods produce the same proportions, matching the observed values from the svychisq() function. However, calculating the proportions directly from the design object allows us to obtain the variance information. In this case, the table output displays each survey estimate followed by its confidence interval. Based on the output, we can see that of those who never trust people, 50.3% also never trust the government, while the proportion never trusting the government is much lower for each of the other levels of trusting people. We may find it easier to look at these proportions graphically.
We can use ggplot() with facets to create Figure 6.2, which provides an overview:

chi_ex2_obs %&gt;%
  mutate(TrustPeople = fct_reorder(
    str_c(&quot;Trust in People:\\n&quot;, TrustPeople),
    order(TrustPeople)
  )) %&gt;%
  ggplot(aes(x = TrustGovernment, y = Observed, color = TrustGovernment)) +
  facet_wrap(~TrustPeople, ncol = 5) +
  geom_point() +
  geom_errorbar(aes(ymin = Observed_low, ymax = Observed_upp)) +
  ylab(&quot;Proportion&quot;) +
  xlab(&quot;&quot;) +
  theme_bw() +
  scale_color_manual(name = &quot;Trust in Government&quot;, values = book_colors) +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = &quot;bottom&quot;
  ) +
  guides(col = guide_legend(nrow = 2))

FIGURE 6.2: Proportion of adults in the U.S. by levels of trust in people and government with confidence intervals, ANES 2020

Example 3: Test of homogeneity

Researchers and politicians often look at specific demographics each election cycle to understand how each group is leaning or voting. The ANES data are collected post-election, but we can still see if there are differences in how specific demographic groups voted.
If we want to see if there is a difference in how each age group voted for the 2020 candidates, this would be a test of homogeneity, and we can set up the hypothesis as follows:

\\[\\begin{align*}
H_0: p_{1_{Biden}} &amp;= p_{1_{Trump}} = p_{1_{Other}},\\\\
p_{2_{Biden}} &amp;= p_{2_{Trump}} = p_{2_{Other}},\\\\
p_{3_{Biden}} &amp;= p_{3_{Trump}} = p_{3_{Other}},\\\\
p_{4_{Biden}} &amp;= p_{4_{Trump}} = p_{4_{Other}},\\\\
p_{5_{Biden}} &amp;= p_{5_{Trump}} = p_{5_{Other}},\\\\
p_{6_{Biden}} &amp;= p_{6_{Trump}} = p_{6_{Other}}
\\end{align*}\\]

where \\(p_{i_{Biden}}\\) is the proportion of those who voted for Joseph Biden who are in age group \\(i\\), \\(p_{i_{Trump}}\\) is the proportion of those who voted for Donald Trump who are in age group \\(i\\), and \\(p_{i_{Other}}\\) is the proportion of those who voted for another candidate who are in age group \\(i\\).

\\(H_A:\\) at least one category of \\(p_{i_{Biden}}\\) does not match \\(p_{i_{Trump}}\\) or \\(p_{i_{Other}}\\)

To conduct this in R, we use the svychisq() function to compare the two variables:

chi_ex3 &lt;- anes_des %&gt;%
  drop_na(VotedPres2020_selection, AgeGroup) %&gt;%
  svychisq(
    formula = ~ AgeGroup + VotedPres2020_selection,
    design = .,
    statistic = &quot;Chisq&quot;,
    na.rm = TRUE
  )

chi_ex3

##
##  Pearson&#39;s X^2: Rao &amp; Scott adjustment
##
## data:  NextMethod()
## X-squared = 171, df = 10, p-value &lt;2e-16

The output from svychisq() indicates a difference in how each age group voted in the 2020 election. To get a better idea of the different distributions, let’s output proportions to see the relationship. As we learned in Example 2 above, we can use chi_ex3$observed, or if we want to get the variance information (which is crucial with survey data), we can use survey_mean(). Remember, when we have two variables in group_by(), we obtain the proportions within each level of the first variable listed. In this case, we are looking at the distribution of AgeGroup for each level of VotedPres2020_selection.
chi_ex3_obs &lt;- anes_des %&gt;%
  filter(VotedPres2020 == &quot;Yes&quot;) %&gt;%
  drop_na(VotedPres2020_selection, AgeGroup) %&gt;%
  group_by(VotedPres2020_selection, AgeGroup) %&gt;%
  summarize(Observed = round(survey_mean(vartype = &quot;ci&quot;), 3))

chi_ex3_obs_table &lt;- chi_ex3_obs %&gt;%
  mutate(prop = paste0(
    Observed, &quot; (&quot;, Observed_low, &quot;, &quot;, Observed_upp, &quot;)&quot;
  )) %&gt;%
  select(AgeGroup, VotedPres2020_selection, prop) %&gt;%
  pivot_wider(names_from = VotedPres2020_selection, values_from = prop) %&gt;%
  gt(rowname_col = &quot;AgeGroup&quot;) %&gt;%
  tab_stubhead(label = &quot;Age Group&quot;)

chi_ex3_obs_table

TABLE 6.6: Distribution of age group by presidential candidate selection with confidence intervals

Age Group      Biden                  Trump                  Other
18-29          0.203 (0.177, 0.229)   0.113 (0.095, 0.132)   0.221 (0.144, 0.298)
30-39          0.168 (0.152, 0.184)   0.146 (0.125, 0.168)   0.302 (0.21, 0.394)
40-49          0.163 (0.146, 0.18)    0.157 (0.137, 0.177)   0.21 (0.13, 0.29)
50-59          0.152 (0.135, 0.17)    0.229 (0.202, 0.256)   0.104 (0.04, 0.168)
60-69          0.177 (0.159, 0.196)   0.193 (0.173, 0.213)   0.103 (0.025, 0.182)
70 or older    0.136 (0.123, 0.149)   0.161 (0.143, 0.179)   0.06 (0.01, 0.109)

We can see that those who voted for Biden and for other candidates skewed younger than those who voted for Trump. For example, of those who voted for Biden, 20.3% were in the 18-29 age group, compared to only 11.3% of those who voted for Trump. Conversely, 22.9% of those who voted for Trump were in the 50-59 age group, compared to only 15.2% of those who voted for Biden.
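As a rough illustration of what "proportions within a group" means in a table like Table 6.6, the sketch below uses only base R on a small made-up data frame. The names toy, vote, age, and wt are hypothetical, and the data are not from ANES. It computes weighted point estimates only; unlike survey_mean(), it ignores strata and clusters and returns no variance information.

```r
# Hypothetical toy data (NOT the ANES data): one row per respondent,
# with an analysis weight. No strata or clusters are modeled here.
toy <- data.frame(
  vote = c("A", "A", "A", "B", "B", "B"),
  age  = c("18-29", "30+", "30+", "18-29", "18-29", "30+"),
  wt   = c(3, 1, 1, 1, 1, 3)
)

# Weighted counts with age groups as rows and vote choice as columns
wtd_counts <- xtabs(wt ~ age + vote, data = toy)

# margin = 2 divides each cell by its column total, so every column
# (vote choice) becomes a distribution over age groups
age_within_vote <- prop.table(wtd_counts, margin = 2)

round(age_within_vote, 3)
colSums(age_within_vote)  # each column sums to 1
```

Swapping to margin = 1 would instead give the distribution of vote choice within each age group, which mirrors changing the order of the variables in group_by().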
6.5 Exercises

The exercises use the design objects anes_des and recs_des as provided in the Prerequisites box at the beginning of the chapter. Here are some exercises for practicing t-tests with svyttest() and chi-squared tests with svygofchisq() and svychisq():

1. Using the RECS data, do more than 50% of U.S. households use A/C (ACUsed)?
2. Using the RECS data, does the average temperature at which U.S. households set their thermostats differ between the day and night in the winter (WinterTempDay and WinterTempNight)?
3. Using the ANES data, does the average age (Age) of those who voted for Joseph Biden in 2020 (VotedPres2020_selection) differ from those who voted for another candidate?
4. If we wanted to determine if the political party affiliation differed for males and females, what test would we use?
   a. Goodness of fit test (svygofchisq())
   b. Test of independence (svychisq())
   c. Test of homogeneity (svychisq())
5. In the RECS data, is there a relationship between the type of housing unit (HousingUnitType) and the year the house was built (YearMade)?
6. In the ANES data, is there a difference in the distribution of gender (Gender) across early voting status in 2020 (EarlyVote2020)?
For more information on statistical testing, we recommend reviewing introduction to statistics textbooks.↩︎ This could change in the future if another package is built or {srvyr} is expanded to work with {tidymodels} packages but no such plans are known at this time.↩︎ During the summer, what is your home’s typical indoor temperature inside your home at night?↩︎ This is the temperature that Stephanie prefers at night during the summer, and she wanted to see if she was different from the population.↩︎ Is any air conditioning equipment used in your home?↩︎ Is any air conditioning equipment used in your home?↩︎ During the summer, what is your home’s typical indoor temperature inside your home at night?↩︎ During the winter, what is your home’s typical indoor temperature inside your home at night?↩︎ These two statistics can also be used for goodness of fit tests if the svygofchisq() function is not used.↩︎ What is the highest level of school you have completed or the highest degree you have received?↩︎ Data was pulled from data.census.gov using the S1501 Education Attainment 2020: ACS 5-Year Estimates Subject Tables.↩︎ "],["c07-modeling.html", "Chapter 7 Modeling 7.1 Introduction 7.2 Analysis of variance (ANOVA) 7.3 Normal linear regression 7.4 Logistic regression 7.5 Exercises", " Chapter 7 Modeling Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(gt) library(prettyunits) We are using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information.) 
targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapters 4 and 10. recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59 / 60, mse = TRUE ) 7.1 Introduction Modeling data is a way for researchers to investigate the relationship between a single dependent variable and one or more independent variables. This builds upon the analyses conducted in Chapter 6, which looked at the relationships between just two variables. For example, in Example 3 in Section 6.3.2, we investigated if there is a relationship between the electrical bill cost and whether or not the household used air-conditioning. However, there are potentially other elements that could go into what the cost of electrical bills are in a household (e.g., outside temperature, desired internal temperature, types and number of appliances, etc.) T-tests only allow us to investigate the relationship of one independent variable at a time, but using models, we can look into multiple variables and even explore interactions between these variables. There are several types of models, but in this chapter, we cover Analysis of Variance (ANOVA) and linear regression models following common normal (Gaussian) and logit models. Jonas Kristoffer Lindeløv has an interesting discussion of many statistical tests and models being equivalent to a linear model. For example, a one-way ANOVA is a linear model with one categorical independent variable, and a two-sample t-test is an ANOVA where the independent variable has exactly two levels. When modeling data, it is helpful to first create an equation that provides an overview of what we are modeling. 
The main structure of these models is as follows: \\[y_i=\\beta_0 +\\sum_{i=1}^p \\beta_i x_i + \\epsilon_i\\] where \\(y_i\\) is the outcome, \\(\\beta_0\\) is an intercept, \\(x_1, \\cdots, x_p\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_p\\) as the associated coefficients, and \\(\\epsilon_i\\) is the error. Not all models have all components. For example, some models may not include an intercept (\\(\\beta_0\\)), may have interactions between different independent variables (\\(x_i\\)), or may have different underlying structures for the dependent variable (\\(y_i\\).) However, all linear models have the independent variables related to the dependent variable in a linear form. To specify these models in R, the formulas are the same with both survey data and other data. The left side of the formula is the response/dependent variable, and the right side has the predictor/independent variable(s). There are many symbols used in R to specify the formula. For example, a linear formula mathematically notated as \\[y_i=\\beta_0+\\beta_1 x_i+\\epsilon_i\\] would be specified in R as y~x where the intercept is not explicitly included. To fit a model with no intercept, that is, \\[y_i=\\beta_1 x_i+\\epsilon_i\\] it can be specified in R as y~x-1. Formula notation details in R can be found in the help file for formula23. A quick overview of the common formula notation is in Table 7.1: TABLE 7.1: Common symbols in formula notation Symbol Example Meaning + +x include this variable - -x delete this variable : x:z include the interaction between these variables * x*z include these variables and the interactions between them ^n (x+y+z)^3 include these variables and all interactions up to n-way I I(x-z) as-is: include a new variable that is calculated inside the parentheses (e.g., x-z, x*z, x/z are possible calculations that could be done) There are often multiple ways to specify the same formula. 
For example, consider the following equation using the mtcars dataset that is built into R: \\[mpg_i=\\beta_0+\\beta_1cyl_{i}+\\beta_2disp_{i}+\\beta_3hp_{i}+\\beta_4cyl_{i}disp_{i}+\\beta_5cyl_{i}hp_{i}+\\beta_6disp_{i}hp_{i}+\\epsilon_i\\] This could be specified in R code as any of the following: mpg ~ (cyl + disp + hp)^2 mpg ~ cyl + disp + hp + cyl:disp + cyl:hp + disp:hp mpg ~ cyl*disp + cyl*hp + disp*hp In the above options, the : and * notations behave differently. Using : only includes the interaction between the listed variables and not the main effects, while using * includes the main effects and all possible interactions. Table 7.2 provides an overview of the syntax and differences between the two notations. TABLE 7.2: Differences in formulas for : and * code syntax Symbol Syntax Formula : mpg ~ cyl:disp:hp \\[ \\begin{aligned} mpg_i = &amp;\\beta_0+\\beta_7cyl_{i}disp_{i}hp_{i}+\\epsilon_i\\end{aligned}\\] * mpg ~ cyl*disp*hp \\[ \\begin{aligned} mpg_i= &amp;\\beta_0+\\beta_1cyl_{i}+\\beta_2disp_{i}+\\beta_3hp_{i}+\\\\&amp; \\beta_4cyl_{i}disp_{i}+\\beta_5cyl_{i}hp_{i}+\\beta_6disp_{i}hp_{i}+\\\\&amp;\\beta_7cyl_{i}disp_{i}hp_{i}+\\epsilon_i\\end{aligned}\\] When using non-survey data, such as experimental or observational data, researchers use the glm() function for linear models. With survey data, however, we use svyglm() from the {survey} package to ensure that we account for the survey design and weights in modeling24. This allows us to generalize a model to the population of interest and accounts for the fact that the observations in the survey data may not be independent. As discussed in Chapter 6, modeling survey data cannot be directly done in {srvyr} but can be done in the {survey} package (Lumley 2010). In this chapter, we provide syntax and examples for linear models, including ANOVA, normal linear regression, and logistic regression. 
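Returning to the formula notation described above, the expansion of each operator can be checked directly in R. The following is a minimal sketch using only base R; term_labels() is a hypothetical helper defined here for illustration and is not part of any package.

```r
# Hypothetical helper: list the terms that a formula expands to.
term_labels <- function(f) attr(terms(f), "term.labels")

# Three ways of writing main effects plus all two-way interactions
f_power <- mpg ~ (cyl + disp + hp)^2
f_colon <- mpg ~ cyl + disp + hp + cyl:disp + cyl:hp + disp:hp
f_star  <- mpg ~ cyl * disp + cyl * hp + disp * hp

term_labels(f_power)
term_labels(f_colon)
term_labels(f_star)

# ":" adds only the interaction itself; "*" also adds the main effects
term_labels(mpg ~ cyl:disp)
term_labels(mpg ~ cyl * disp)
term_labels(mpg ~ cyl:disp:hp)
```

Comparing the printed term labels is a quick way to confirm that two formulas describe the same model before fitting either one.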
For details on other types of regression, including ordinal regression, log-linear models, and survival analysis, refer to Lumley (2010). Lumley (2010) also discusses custom models such as a negative binomial or Poisson model in Appendix E of his book. 7.2 Analysis of variance (ANOVA) In ANOVA, we are testing whether the mean of an outcome is the same across two or more groups. Statistically, we set this up as follows: \\(H_0: \\mu_1 = \\mu_2= \\dots = \\mu_k\\) where \\(\\mu_i\\) is the mean outcome for group \\(i\\) \\(H_A: \\text{At least one mean is different}\\) Because an ANOVA test is also a linear model, we can re-frame the problem using the linear model framework as: \\[ y_i=\\sum_{i=1}^k \\mu_i x_i + \\epsilon_i\\] where \\(x_i\\) is a group indicator for groups \\(1, \\cdots, k\\). Some assumptions when using ANOVA on survey data include: The outcome variable is normally distributed within each group The variances of the outcome variable are approximately equal across groups We do NOT assume independence between the groups as with ANOVA on non-survey data. The covariance is accounted for in the survey design 7.2.1 Syntax To perform this type of analysis in R, the general syntax is as follows: des_obj %&gt;% svyglm( formula = outcome ~ group, design = ., na.action = na.omit, df.resid = NULL ) The arguments are: formula: Formula in the form of outcome~group. The group variable must be a factor or character. design: a tbl_svy object created by as_survey na.action: handling of missing data df.resid: degrees of freedom for Wald tests (optional) - defaults to using degf(design)-(g-1) where \\(g\\) is the number of groups The function svyglm() does not have the design as the first argument, so the dot (.) notation is used to pass it with a pipe (see Chapter 6 for more details). The default for missing data is na.omit. This means that we are removing all records with any missing data in either predictors or outcomes from analyses. 
There are other options for handling missing data, and we recommend looking at the help documentation for na.omit (run help(na.omit) or ?na.omit) for more information on options to use for na.action. For a discussion on how to handle missing data, see Chapter 11. 7.2.2 Example Looking at an example helps us discuss the output and how to interpret the results. In RECS, respondents are asked what temperature they set their thermostat to during the day and evening when using the air-conditioning (A/C) during the summer. To analyze these data, we filter the respondents to only those using A/C (ACUsed.) Then, if we want to see if there are regional differences, we can use group_by(). A descriptive analysis of the temperature at night (SummerTempNight) set by region and the sample sizes is displayed below. recs_des %&gt;% filter(ACUsed) %&gt;% group_by(Region) %&gt;% summarize( SMN = survey_mean(SummerTempNight, na.rm = TRUE), n = unweighted(n()), n_na = unweighted(sum(is.na(SummerTempNight))) ) ## # A tibble: 4 × 5 ## Region SMN SMN_se n n_na ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int&gt; &lt;int&gt; ## 1 Northeast 69.7 0.103 3204 0 ## 2 Midwest 71.0 0.0897 3619 0 ## 3 South 71.8 0.0536 6065 0 ## 4 West 72.5 0.129 3283 0 In the following code, we test whether this temperature varies by region by first using svyglm() to run the test and then using broom::tidy() to display the output. Note that the temperature setting is set to NA when the household does not use A/C, and since the default handling of NAs is na.action=na.omit, records that do not use A/C are not included in this regression. anova_out &lt;- recs_des %&gt;% svyglm(design = ., formula = SummerTempNight ~ Region) tidy(anova_out) ## # A tibble: 4 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 (Intercept) 69.7 0.103 674. 
3.69e-111 ## 2 RegionMidwest 1.34 0.138 9.68 1.46e- 13 ## 3 RegionSouth 2.05 0.128 16.0 1.36e- 22 ## 4 RegionWest 2.80 0.177 15.9 2.27e- 22 In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. In this output, the intercept represents the reference value of the Northeast region. The other coefficients indicate the difference in temperature relative to the Northeast region. For example, in the Midwest, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees higher than in the Northeast during summer nights, and each region sets their thermostats at significantly higher temperatures than the Northeast. If we wanted to change the reference value, we would reorder the factor before modeling using the function relevel() from {stats} or using one of many factor ordering functions in {forcats} such as fct_relevel() or fct_infreq(). For example, if we wanted the reference level to be the Midwest region, we could use the following code. Note the usage of the gt() function on top of tidy() to print a nice-looking output table (Iannone et al. 2023; Robinson, Hayes, and Couch 2023) (see Chapter 8 for more information on the {gt} package.) 
anova_out_relevel &lt;- recs_des %&gt;% mutate(Region = fct_relevel(Region, &quot;Midwest&quot;, after = 0)) %&gt;% svyglm(design = ., formula = SummerTempNight ~ Region) tidy(anova_out_relevel) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() TABLE 7.3: ANOVA output for estimates of thermostat temperature setting at night by region with Midwest as the reference region, RECS 2020 term estimate std.error statistic p.value (Intercept) 71.04 0.09 791.83 &lt;0.0001 RegionNortheast −1.34 0.14 −9.68 &lt;0.0001 RegionSouth 0.71 0.10 6.91 &lt;0.0001 RegionWest 1.47 0.16 9.17 &lt;0.0001 This output now has the coefficients indicating the difference in temperature relative to the Midwest region. For example, in the Northeast, temperatures are set, on average, 1.34 (p-value&lt;0.0001) degrees lower than in the Midwest during summer nights. Each region sets its thermostats at a significantly different temperature than the Midwest: the Northeast lower, and the South and West higher. The Northeast coefficient is the negative of the Midwest coefficient in the prior model, as we are still comparing the same two regions, just from a different reference point. 
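The sign flip that comes from changing the reference level can be reproduced on a small built-in dataset. The sketch below is an unweighted, base-R illustration only: it uses lm() on the PlantGrowth data instead of svyglm() on a survey design, so it shows the releveling logic and not a survey analysis. The object names fit_ctrl, pg_relevel, and fit_trt1 are chosen here for illustration.

```r
# Unweighted illustration: lm() stands in for svyglm() (an assumption made
# so the example runs without survey data or extra packages).

# Default reference level is the first factor level, "ctrl"
fit_ctrl <- lm(weight ~ group, data = PlantGrowth)

# Make "trt1" the reference level with relevel() from {stats}
pg_relevel <- transform(PlantGrowth, group = relevel(group, ref = "trt1"))
fit_trt1 <- lm(weight ~ group, data = pg_relevel)

# The same two groups are compared, so the coefficients differ only in sign
coef(fit_ctrl)["grouptrt1"]
coef(fit_trt1)["groupctrl"]

# The intercept is the mean of whichever group is the reference level
coef(fit_trt1)["(Intercept)"]
mean(PlantGrowth$weight[PlantGrowth$group == "trt1"])
```

The fitted values are identical in both parameterizations; only the meaning of the intercept and of each coefficient changes.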
7.3 Normal linear regression Normal linear regression is a more generalized method than ANOVA: we fit a model of a continuous outcome with any number of categorical or continuous predictors (whereas ANOVA only has categorical predictors). It is similarly specified as: \\[\\begin{equation} y_i=\\beta_0 +\\sum_{i=1}^p \\beta_i x_i + \\epsilon_i \\end{equation}\\] where \\(y_i\\) is the outcome, \\(\\beta_0\\) is an intercept, \\(x_1, \\cdots, x_p\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_p\\) as the associated coefficients, and \\(\\epsilon_i\\) is the error. Assumptions in normal linear regression using survey data include: The residuals (\\(\\epsilon_i\\)) are normally distributed, but there is not an assumption of independence, and the correlation structure is captured in the survey design object There is a linear relationship between the outcome variable and the independent variables The residuals are homoscedastic; that is, the variance of the error term is the same across all values of independent variables 7.3.1 Syntax The syntax for this regression uses the same function as ANOVA but can have more than one variable listed on the right-hand side of the formula: des_obj %&gt;% svyglm( formula = outcomevar ~ x1 + x2 + x3, design = ., na.action = na.omit, df.resid = NULL ) The arguments are: formula: Formula in the form of y~x design: a tbl_svy object created by as_survey na.action: handling of missing data df.resid: degrees of freedom for Wald tests (optional) - defaults to using degf(design)-p where \\(p\\) is the rank of the design matrix As discussed in Section 7.1, the right-hand side of the formula can be specified in many ways, for example, with or without interactions. 7.3.2 Examples Example 1: Linear regression with a single variable In RECS, we can obtain information on the square footage of homes and the electric bills. We assume that square footage is related to the amount of money spent on electricity and examine a model for this. 
Before any modeling, we first plot the data to determine whether it is reasonable to assume a linear relationship. In Figure 7.1, each hexagon represents the weighted count of households in the bin, and we can see a general positive linear trend (as the square footage increases, so does the amount of money spent on electricity.) recs_2020 %&gt;% ggplot(aes( x = TOTSQFT_EN, y = DOLLAREL, weight = NWEIGHT / 1000000 )) + geom_hex() + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;Housing Units\\n(Millions)&quot;, labels = scales::comma, colors = book_colors[c(3, 2, 1)] ) + xlab(&quot;Total square footage&quot;) + ylab(&quot;Amount spent on electricity&quot;) + scale_y_continuous(labels = scales::dollar_format()) + scale_x_continuous(labels = scales::comma_format()) + theme_minimal() FIGURE 7.1: Relationship between square footage and dollars spent on electricity, RECS 2020 Given that the plot shows a potentially increasing relationship between square footage and electricity expenditure, fitting a model allows us to determine if the relationship is statistically significant. The model is fit below with electricity expenditure as the outcome. 
m_electric_sqft &lt;- recs_des %&gt;% svyglm(design = ., formula = DOLLAREL ~ TOTSQFT_EN, na.action = na.omit) tidy(m_electric_sqft) %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number() TABLE 7.4: Linear regression output predicting electricity expenditure given square footage, RECS 2020 term estimate std.error statistic p.value (Intercept) 836.72 12.77 65.51 &lt;0.0001 TOTSQFT_EN 0.30 0.01 41.67 &lt;0.0001 In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. In these results, we can say that, on average, for every additional square foot of house size, the electricity bill increases by about 30 cents, and that square footage is significantly associated with electricity expenditure (p-value&lt;0.0001). This is a straightforward model, and there are likely many more factors related to electricity expenditure, including the type of cooling, number of appliances, location, and more. However, starting with one-variable models can help researchers understand what potential relationships there are between variables before fitting more complex models. Often, researchers start with known relationships before building models to determine what impact additional variables have on the model. 
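To make the interpretation of the coefficients in Table 7.4 concrete, the fitted line can be evaluated by hand. The sketch below is a worked arithmetic example under stated assumptions: it uses the rounded estimates printed in Table 7.4 rather than the fitted model object, so the results are approximate, and predict_bill() is a hypothetical helper defined here for illustration.

```r
# Rounded coefficients copied from Table 7.4 (approximate by construction)
b0 <- 836.72 # intercept, in dollars
b1 <- 0.30   # dollars per additional square foot

# Hypothetical helper: expected electricity expenditure for a given size
predict_bill <- function(sqft) b0 + b1 * sqft

# Expected expenditure for homes of 1,000, 2,000, and 3,000 square feet
predict_bill(c(1000, 2000, 3000))

# Each additional 1,000 square feet adds about 1000 * b1 dollars
predict_bill(2000) - predict_bill(1000)
```

In practice, predictions and their standard errors would come from the fitted model object rather than from rounded table values.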
Example 2: Linear regression with multiple variables and interactions

In the following example, a model is fit to predict electricity expenditure, including Census region (factor/categorical), urbanicity (factor/categorical), square footage (double/numeric), and whether air-conditioning (A/C) is used (logical/categorical) with all two-way interactions also included. In this example, we are choosing to fit this model without an intercept (using -1 in the formula). This results in an intercept estimate for each region instead of a single intercept for all data.

m_electric_multi &lt;- recs_des %&gt;%
  svyglm(
    design = .,
    formula = DOLLAREL ~ (Region + Urbanicity + TOTSQFT_EN + ACUsed)^2 - 1,
    na.action = na.omit
  )

tidy(m_electric_multi) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 7.5: Linear regression output predicting electricity expenditure given region, urbanicity, square footage, air conditioning usage, and two-way interactions, RECS 2020

term | estimate | std.error | statistic | p.value
RegionNortheast | 543.73 | 56.57 | 9.61 | &lt;0.0001
RegionMidwest | 702.16 | 78.12 | 8.99 | &lt;0.0001
RegionSouth | 938.74 | 46.99 | 19.98 | &lt;0.0001
RegionWest | 603.27 | 36.31 | 16.61 | &lt;0.0001
UrbanicityUrban Cluster | 73.03 | 81.50 | 0.90 | 0.3764
UrbanicityRural | 204.13 | 80.69 | 2.53 | 0.0161
TOTSQFT_EN | 0.24 | 0.03 | 8.65 | &lt;0.0001
ACUsedTRUE | 252.06 | 54.05 | 4.66 | &lt;0.0001
RegionMidwest:UrbanicityUrban Cluster | 183.06 | 82.38 | 2.22 | 0.0328
RegionSouth:UrbanicityUrban Cluster | 152.56 | 76.03 | 2.01 | 0.0526
RegionWest:UrbanicityUrban Cluster | 98.02 | 75.16 | 1.30 | 0.2007
RegionMidwest:UrbanicityRural | 312.83 | 50.88 | 6.15 | &lt;0.0001
RegionSouth:UrbanicityRural | 220.00 | 55.00 | 4.00 | 0.0003
RegionWest:UrbanicityRural | 180.97 | 58.70 | 3.08 | 0.0040
RegionMidwest:TOTSQFT_EN | −0.05 | 0.02 | −2.09 | 0.0441
RegionSouth:TOTSQFT_EN | 0.00 | 0.03 | 0.11 | 0.9109
RegionWest:TOTSQFT_EN | −0.03 | 0.03 | −1.00 | 0.3254
RegionMidwest:ACUsedTRUE | −292.97 | 60.24 | −4.86 | &lt;0.0001
RegionSouth:ACUsedTRUE | −294.07 | 57.44 | −5.12 | &lt;0.0001
RegionWest:ACUsedTRUE | −77.68 | 47.05 | −1.65 | 0.1076
UrbanicityUrban Cluster:TOTSQFT_EN | −0.04 | 0.02 | −1.63 | 0.1112
UrbanicityRural:TOTSQFT_EN | −0.06 | 0.02 | −2.60 | 0.0137
UrbanicityUrban Cluster:ACUsedTRUE | −130.23 | 60.30 | −2.16 | 0.0377
UrbanicityRural:ACUsedTRUE | −33.80 | 59.30 | −0.57 | 0.5724
TOTSQFT_EN:ACUsedTRUE | 0.08 | 0.02 | 3.48 | 0.0014

As shown above, there are many terms in this model. To test whether coefficients for a term are different from zero, the function regTermTest() can be used. For example, in the above regression, we can test whether the interaction of region and urbanicity is significant as follows:

urb_reg_test &lt;- regTermTest(m_electric_multi, ~Urbanicity:Region)
urb_reg_test

## Wald test for Urbanicity:Region
## in svyglm(design = ., formula = DOLLAREL ~ (Region + Urbanicity +
## TOTSQFT_EN + ACUsed)^2 - 1, na.action = na.omit)
## F = 6.851 on 6 and 35 df: p= 7.2e-05

This output indicates there is a significant interaction between urbanicity and region (p-value &lt; 0.0001).

To examine the predictions, residuals, and more from the model, the function augment() from {broom} can be used. The augment() function returns a tibble with the independent and dependent variables and other fit statistics. The augment() function has not been specifically written for objects of class svyglm, and as such, a warning indicating this is displayed. As it was not written exactly for this class of objects, a little tweaking needs to be done after using augment(). To obtain the standard error of the predicted values (.se.fit), we need to use the attr() function on the predicted values (.fitted) created by augment().
Additionally, the predicted values are returned with a type of svrep. If we want to plot the predicted values, we need to use as.numeric() to get the predicted values into a numeric format to work with. However, it is important to note that this conversion must be completed after the standard error adjustment, because as.numeric() drops the attribute that holds the variance.

fitstats &lt;- augment(m_electric_multi) %&gt;%
  mutate(
    .se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
    .fitted = as.numeric(.fitted)
  )

fitstats

## # A tibble: 18,496 × 13
##    DOLLAREL Region    Urbanicity   TOTSQFT_EN ACUsed `(weights)` .fitted
##       &lt;dbl&gt; &lt;fct&gt;     &lt;fct&gt;             &lt;dbl&gt; &lt;lgl&gt;        &lt;dbl&gt;   &lt;dbl&gt;
##  1    1955. West      Urban Area         2100 TRUE         0.492   1397.
##  2     713. South     Urban Area          590 TRUE         1.35    1090.
##  3     335. West      Urban Area          900 TRUE         0.849   1043.
##  4    1425. South     Urban Area         2100 TRUE         0.793   1584.
##  5    1087  Northeast Urban Area          800 TRUE         1.49    1055.
##  6    1896. South     Urban Area         4520 TRUE         1.09    2375.
##  7    1418. South     Urban Area         2100 TRUE         0.851   1584.
##  8    1237. South     Urban Clust…        900 FALSE        1.45    1349.
##  9     538. South     Urban Area          750 TRUE         0.185   1142.
## 10     625. West      Urban Area          760 TRUE         1.06    1002.
## # ℹ 18,486 more rows
## # ℹ 6 more variables: .resid &lt;dbl&gt;, .hat &lt;dbl&gt;, .sigma &lt;dbl&gt;,
## #   .cooksd &lt;dbl&gt;, .std.resid &lt;dbl&gt;, .se.fit &lt;dbl&gt;

These results can then be used in a variety of ways, including examining residual plots as illustrated in the code below and Figure 7.2. In the residual plot, we look for any patterns in the data. If we do see patterns, this may indicate a violation of the homoscedasticity assumption, and the standard errors of the coefficients may be incorrect. In Figure 7.2, we do not see a strong pattern, indicating that our assumption of homoscedasticity may hold.
fitstats %&gt;%
  ggplot(aes(x = .fitted, y = .resid)) +
  geom_point(alpha = .1) +
  geom_hline(yintercept = 0, color = &quot;red&quot;) +
  theme_minimal() +
  xlab(&quot;Fitted value of electricity cost&quot;) +
  ylab(&quot;Residual of model&quot;) +
  scale_y_continuous(labels = scales::dollar_format()) +
  scale_x_continuous(labels = scales::dollar_format())

FIGURE 7.2: Residual plot of electric cost model with covariates Region, Urbanicity, TOTSQFT_EN, and ACUsed

Additionally, augment() can be used to predict outcomes for data not used in modeling. Perhaps we would like to predict the energy expenditure for a home in an urban area in the south that uses air-conditioning and is 2,500 square feet. To do this, we first make a tibble including that additional data and then use the newdata argument in the augment() function. As before, to obtain the standard error of the predicted values, we need to use the attr() function.

add_data &lt;- recs_2020 %&gt;%
  select(DOEID, Region, Urbanicity, TOTSQFT_EN, ACUsed, DOLLAREL) %&gt;%
  rbind(
    tibble(
      DOEID = NA, Region = &quot;South&quot;, Urbanicity = &quot;Urban Area&quot;,
      TOTSQFT_EN = 2500, ACUsed = TRUE, DOLLAREL = NA
    )
  ) %&gt;%
  tail(1)

pred_data &lt;- augment(m_electric_multi, newdata = add_data) %&gt;%
  mutate(
    .se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
    .fitted = as.numeric(.fitted)
  )

pred_data

## # A tibble: 1 × 8
##   DOEID Region Urbanicity TOTSQFT_EN ACUsed DOLLAREL .fitted .se.fit
##   &lt;dbl&gt; &lt;fct&gt;  &lt;fct&gt;           &lt;dbl&gt; &lt;lgl&gt;     &lt;dbl&gt;   &lt;dbl&gt;   &lt;dbl&gt;
## 1    NA South  Urban Area       2500 TRUE         NA   1715.    22.6

In the above example, it is predicted that the energy expenditure would be $1,715.

7.4 Logistic regression

Logistic regression is used to model binary outcomes, such as whether or not someone voted. There are several instances where an outcome may not be originally binary but is collapsed into being binary.
For example, given that gender is often asked in surveys with multiple response options and not a binary scale, many researchers now code gender in logistic modeling as cis-male compared to not cis-male. We could also convert a 4-point Likert scale that has levels of “Strongly Agree”, “Agree”, “Disagree”, and “Strongly Disagree” to group the agreement levels into one group and disagreement levels into a second group.

Logistic regression is a specific case of the generalized linear model (GLM). A GLM uses a link function to link the response variable to the linear model. If we tried to use a normal linear regression with a binary outcome, many assumptions would not hold; namely, the response would not be continuous. Logistic regression allows us to link a linear model between the covariates and the propensity of an outcome. In logistic regression, the link function is the logit function. Specifically, the model is specified as follows:

\\[ y_i \\sim \\text{Bernoulli}(\\pi_i)\\]

\\[\\begin{equation} \\log \\left(\\frac{\\pi_i}{1-\\pi_i} \\right)=\\beta_0 +\\sum_{j=1}^n \\beta_j x_{ij} \\end{equation}\\]

which can be re-expressed as

\\[ \\pi_i=\\frac{\\exp \\left(\\beta_0 +\\sum_{j=1}^n \\beta_j x_{ij} \\right)}{1+\\exp \\left(\\beta_0 +\\sum_{j=1}^n \\beta_j x_{ij} \\right)},\\]

where \\(y_i\\) is the outcome for observation \\(i\\), \\(\\beta_0\\) is an intercept, and \\(x_{i1}, \\cdots, x_{in}\\) are the predictors with \\(\\beta_1, \\cdots, \\beta_n\\) as the associated coefficients. The Bernoulli distribution has an outcome of 0 or 1 given some probability (here, \\(\\pi_i\\)), and we model \\(\\pi_i\\) as a function of the covariates using the logit link.
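The relationship between the linear predictor and the probability can be checked numerically. The following is a minimal sketch in base R, not part of the survey workflow: the helper names logit() and inv_logit() and the linear-predictor values are hypothetical and chosen only for illustration.

```r
# Logit link: maps a probability in (0, 1) to the log odds
logit <- function(p) log(p / (1 - p))

# Inverse logit: maps a linear predictor back to a probability
inv_logit <- function(eta) exp(eta) / (1 + exp(eta))

# Hypothetical linear-predictor values, chosen only for illustration
eta <- c(-2, 0, 2)

# Probabilities implied by each linear-predictor value
p <- inv_logit(eta)
p

# Applying the logit to those probabilities recovers the linear predictor
logit(p)

# Base R provides the same pair of functions as qlogis() and plogis()
all.equal(p, plogis(eta))
all.equal(logit(p), qlogis(p))
```

A linear predictor of 0 corresponds to a probability of 0.5, and values further from 0 in either direction push the probability toward 0 or 1 without ever leaving that range.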
Assumptions in logistic regression using survey data include:

- The outcome variable has two levels
- There is a linear relationship between the independent variables and the log odds (the equation for the logit function)
- The residuals are homoscedastic; that is, the error term is the same across all values of independent variables

7.4.1 Syntax

The syntax for logistic regression is as follows:

des_obj %&gt;%
  svyglm(
    formula = outcomevar ~ x1 + x2 + x3,
    design = .,
    na.action = na.omit,
    df.resid = NULL,
    family = quasibinomial
  )

The arguments are:

- formula: Formula in the form of y~x
- design: a tbl_svy object created by as_survey
- na.action: handling of missing data
- df.resid: degrees of freedom for Wald tests (optional) - defaults to using degf(design)-p where \\(p\\) is the rank of the design matrix
- family: the error distribution/link function to be used in the model

Note that svyglm() is the same function used in both ANOVA and normal linear regression. However, we’ve added the family argument and set it to quasibinomial. While we can use the binomial family, it is recommended to use the quasibinomial as our weights may not be integers, and the quasibinomial also allows for overdispersion (Lumley 2010; McCullagh and Nelder 1989; R Core Team 2023). The quasibinomial family has a default logit link, which is specified in the equations above. When specifying the outcome variable, it is likely specified in one of three ways with survey data:

- A two-level factor variable where the first level of the factor indicates a “failure” and the second level indicates a “success”
- A numeric variable which is 1 or 0 where 1 indicates a success
- A logical variable where TRUE indicates a success

7.4.2 Examples

Example 1: Logistic regression with a single variable

In the following example, the ANES data are used, and we are modeling whether someone usually has trust in the government[25] as a function of whom they voted for in the 2020 presidential election.
As a reminder, the leading candidates were Biden and Trump, though people could vote for someone else not in the Democratic or Republican parties. Those votes are all grouped into an “Other” category. We first create a binary outcome for trusting in the government by collapsing “Always” and “Most of the time” into a single factor level, and the other response options (“About half the time”, “Some of the time”, and “Never”) into a second factor level. A scatter plot of the raw data is not useful here, as every outcome is either 0 or 1, so instead, we plot a summary of the data.

anes_des_der &lt;- anes_des %&gt;%
  mutate(TrustGovernmentUsually = case_when(
    is.na(TrustGovernment) ~ NA,
    TRUE ~ TrustGovernment %in% c(&quot;Always&quot;, &quot;Most of the time&quot;)
  ))

anes_des_der %&gt;%
  group_by(VotedPres2020_selection) %&gt;%
  summarize(
    pct_trust = survey_mean(TrustGovernmentUsually, na.rm = TRUE, proportion = TRUE, vartype = &quot;ci&quot;),
    .groups = &quot;drop&quot;
  ) %&gt;%
  filter(complete.cases(.)) %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) +
  geom_bar(stat = &quot;identity&quot;) +
  geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp), width = .2) +
  scale_fill_manual(values = c(&quot;#0b3954&quot;, &quot;#bfd7ea&quot;, &quot;#8d6b94&quot;)) +
  xlab(&quot;Election choice (2020)&quot;) +
  ylab(&quot;Usually trust the government&quot;) +
  scale_y_continuous(labels = scales::percent) +
  guides(fill = &quot;none&quot;) +
  theme_minimal()

FIGURE 7.3: Relationship between candidate selection and trust in government, ANES 2020

Looking at Figure 7.3, it appears that people who voted for Trump are more likely to say that they usually have trust in the government compared to those who voted for Biden and Other candidates. To determine if this insight is accurate, we next fit the model.
logistic_trust_vote &lt;- anes_des_der %&gt;%
  svyglm(
    design = .,
    formula = TrustGovernmentUsually ~ VotedPres2020_selection,
    family = quasibinomial
  )

tidy(logistic_trust_vote) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()

TABLE 7.6: Logistic regression output predicting trust in government by presidential candidate selection, ANES 2020

term | estimate | std.error | statistic | p.value
(Intercept) | −1.96 | 0.07 | −27.45 | &lt;0.0001
VotedPres2020_selectionTrump | 0.43 | 0.09 | 4.72 | &lt;0.0001
VotedPres2020_selectionOther | −0.65 | 0.44 | −1.49 | 0.1429

In the output above, we can see the estimated coefficients (estimate), estimated standard errors of the coefficients (std.error), the t-statistic (statistic), and the p-value for each coefficient. This output indicates that respondents who voted for Trump are more likely to usually have trust in the government compared to those who voted for Biden (the reference level). The coefficient of 0.435 (shown rounded as 0.43 in Table 7.6) represents the increase in the log odds of usually trusting the government.

In most cases, it is easier to talk about the odds instead of the log odds. To do this, we need to exponentiate the coefficients. We can use the same tidy() function but include the argument exponentiate = TRUE to see the odds.
tidy(logistic_trust_vote, exponentiate = TRUE) %&gt;% select(term, estimate) %&gt;% gt() %&gt;% fmt_number() #sqbfoxcwgf table { font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji'; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } #sqbfoxcwgf thead, #sqbfoxcwgf tbody, #sqbfoxcwgf tfoot, #sqbfoxcwgf tr, #sqbfoxcwgf td, #sqbfoxcwgf th { border-style: none; } #sqbfoxcwgf p { margin: 0; padding: 0; } #sqbfoxcwgf .gt_table { display: table; border-collapse: collapse; line-height: normal; margin-left: auto; margin-right: auto; color: #333333; font-size: 16px; font-weight: normal; font-style: normal; background-color: #FFFFFF; width: auto; border-top-style: solid; border-top-width: 2px; border-top-color: #A8A8A8; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #A8A8A8; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; } #sqbfoxcwgf .gt_caption { padding-top: 4px; padding-bottom: 4px; } #sqbfoxcwgf .gt_title { color: #333333; font-size: 125%; font-weight: initial; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; border-bottom-color: #FFFFFF; border-bottom-width: 0; } #sqbfoxcwgf .gt_subtitle { color: #333333; font-size: 85%; font-weight: initial; padding-top: 3px; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; border-top-color: #FFFFFF; border-top-width: 0; } #sqbfoxcwgf .gt_heading { background-color: #FFFFFF; text-align: center; border-bottom-color: #FFFFFF; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; } #sqbfoxcwgf .gt_bottom_border { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #sqbfoxcwgf .gt_col_headings { 
TABLE 7.7: Logistic regression predicting trust in government by presidential candidate selection with exponentiated coefficients (odds), ANES 2020

term                           estimate
(Intercept)                        0.14
VotedPres2020_selectionTrump       1.54
VotedPres2020_selectionOther       0.52

We can interpret this as saying that the odds of usually trusting the government for someone who voted for Trump are 154% of the odds for a person who voted for Biden (the reference level). In comparison, the odds for a person who voted for neither Biden nor Trump are 52% of the odds for someone who voted for Biden.

As with linear regression, the augment() function can be used to predict values. By default, the prediction is on the scale of the link function (the log odds), not the probability scale.
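The relationship between the log odds, the odds, and the probability can be checked by hand. The following is a minimal sketch in base R that assumes only the rounded coefficients printed in Table 7.6; the object names (coefs, odds, link, prob) are made up for illustration, and because the inputs are rounded, the results only approximately reproduce the model output.

```r
# Rounded coefficients copied from Table 7.6 (log-odds scale).
# Assumption: these are rounded to two decimals, so results are approximate.
coefs <- c(intercept = -1.96, trump = 0.43, other = -0.65)

# Exponentiating gives the baseline odds (intercept) and odds ratios (other terms)
odds <- exp(coefs)
round(odds, 2)

# Linear predictor on the link (log-odds) scale for each group of voters
link <- c(
  Biden = coefs[["intercept"]],
  Trump = coefs[["intercept"]] + coefs[["trump"]],
  Other = coefs[["intercept"]] + coefs[["other"]]
)

# plogis() is the inverse logit, exp(x) / (1 + exp(x)),
# which converts log odds into probabilities
prob <- plogis(link)
round(prob, 3)
```

The exponentiated values line up with Table 7.7, and the probabilities come out at roughly 0.12 for Biden voters, 0.18 for Trump voters, and 0.07 for voters who chose another candidate.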
To predict the probability, add the argument type.predict = \"response\" as demonstrated below:

logistic_trust_vote %&gt;%
  augment(type.predict = &quot;response&quot;) %&gt;%
  mutate(
    .se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
    .fitted = as.numeric(.fitted)
  ) %&gt;%
  select(TrustGovernmentUsually, VotedPres2020_selection, .fitted, .se.fit)

## # A tibble: 6,212 × 4
##    TrustGovernmentUsually VotedPres2020_selection .fitted .se.fit
##    &lt;lgl&gt;                  &lt;fct&gt;                     &lt;dbl&gt;   &lt;dbl&gt;
##  1 FALSE                  Other                    0.0681 0.0279
##  2 FALSE                  Biden                    0.123  0.00772
##  3 FALSE                  Biden                    0.123  0.00772
##  4 FALSE                  Trump                    0.178  0.00919
##  5 FALSE                  Biden                    0.123  0.00772
##  6 FALSE                  Trump                    0.178  0.00919
##  7 FALSE                  Biden                    0.123  0.00772
##  8 FALSE                  Biden                    0.123  0.00772
##  9 TRUE                   Biden                    0.123  0.00772
## 10 FALSE                  Biden                    0.123  0.00772
## # ℹ 6,202 more rows

Example 2: Interaction Effects

Let’s look at another example with interaction effects. If we’re interested in understanding the demographics of people who voted for Biden among all voters in 2020, we could include an indicator of whether respondents voted early (EarlyVote2020) and their income group (Income7) in our model.

First, we need to subset the data to 2020 voters and then create an indicator for having voted for Biden.

anes_des_ind &lt;- anes_des %&gt;%
  filter(!is.na(VotedPres2020_selection)) %&gt;%
  mutate(VoteBiden = case_when(
    VotedPres2020_selection == &quot;Biden&quot; ~ 1,
    TRUE ~ 0
  ))

Let’s first look at the main effects of income grouping and early voting behavior.
log_biden_main &lt;- anes_des_ind %&gt;%
  mutate(EarlyVote2020 = fct_relevel(EarlyVote2020, &quot;No&quot;, after = 0)) %&gt;%
  svyglm(
    design = .,
    formula = VoteBiden ~ EarlyVote2020 + Income7,
    family = quasibinomial
  )

tidy(log_biden_main) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.8: Logistic regression output for predicting voting for Biden given early voting behavior and income - main effects only, ANES 2020

term                         estimate  std.error  statistic  p.value
(Intercept)                      1.28       0.43       2.99  0.0047
EarlyVote2020Yes                 0.44       0.34       1.29  0.2039
Income7$20k to &lt; 40k          −1.06       0.49      −2.18  0.0352
Income7$40k to &lt; 60k          −0.78       0.42      −1.86  0.0705
Income7$60k to &lt; 80k          −1.24       0.70      −1.77  0.0842
Income7$80k to &lt; 100k         −0.66       0.64      −1.02  0.3137
Income7$100k to &lt; 125k        −1.02       0.54      −1.89  0.0662
Income7$125k or more            −1.25       0.44      −2.87  0.0065

This main effect model (see Table 7.8) indicates that people with incomes of $125,000 or more have a significant negative coefficient of −1.25 (p-value = 0.0065). This indicates that people with incomes of $125,000 or more were less likely to vote for Biden in the 2020 election than people with incomes under $20,000 (the reference level).

Although early voting behavior was not significant, there may be an interaction between income and early voting behavior.
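In R's formula syntax, an interaction between two predictors can be requested in more than one way. As a small sketch (the object names f_int, f_star, labels_int, and labels_star are made up for illustration), base R's terms() can list what two common spellings expand to for the variables in this example. No survey data is needed, because only the formula is inspected.

```r
# Two ways of writing "main effects plus the two-way interaction"
f_int <- VoteBiden ~ (EarlyVote2020 + Income7)^2
f_star <- VoteBiden ~ EarlyVote2020 * Income7

# terms() expands a formula; "term.labels" holds the resulting model terms
labels_int <- attr(terms(f_int), "term.labels")
labels_star <- attr(terms(f_star), "term.labels")

labels_int
identical(labels_int, labels_star)
```

Both formulas expand to the two main effects plus the EarlyVote2020:Income7 interaction, so either spelling should specify the same model.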
To determine whether such an interaction exists, we can create a model that includes the interaction effects:

log_biden_int &lt;- anes_des_ind %&gt;%
  mutate(EarlyVote2020 = fct_relevel(EarlyVote2020, &quot;No&quot;, after = 0)) %&gt;%
  svyglm(
    design = .,
    formula = VoteBiden ~ (EarlyVote2020 + Income7)^2,
    family = quasibinomial
  )

tidy(log_biden_int) %&gt;%
  mutate(p.value = pretty_p_value(p.value)) %&gt;%
  gt() %&gt;%
  fmt_number()
TABLE 7.9: Logistic regression output for predicting voting for Biden given early voting behavior and income - with interaction, ANES 2020

term                                          estimate  std.error  statistic  p.value
(Intercept)                                       2.32       0.67       3.45  0.0015
EarlyVote2020Yes                                 −0.81       0.78      −1.03  0.3081
Income7$20k to &lt; 40k                           −2.33       0.87      −2.68  0.0113
Income7$40k to &lt; 60k                           −1.67       0.89      −1.87  0.0700
Income7$60k to &lt; 80k                           −2.05       1.05      −1.96  0.0580
Income7$80k to &lt; 100k                          −3.42       1.12      −3.06  0.0043
Income7$100k to &lt; 125k                         −2.33       1.07      −2.17  0.0368
Income7$125k or more                             −2.09       0.92      −2.28  0.0289
EarlyVote2020Yes:Income7$20k to &lt; 40k           1.60       0.95       1.69  0.1006
EarlyVote2020Yes:Income7$40k to &lt; 60k           0.99       1.00       0.99  0.3289
EarlyVote2020Yes:Income7$60k to &lt; 80k           0.90       1.14       0.79  0.4373
EarlyVote2020Yes:Income7$80k to &lt; 100k          3.22       1.16       2.78  0.0087
EarlyVote2020Yes:Income7$100k to &lt; 125k         1.64       1.11       1.48  0.1492
EarlyVote2020Yes:Income7$125k or more             1.00       1.14       0.88  0.3867

The results from the interaction model (see Table 7.9) show that one interaction between early voting behavior and income is significant (EarlyVote2020Yes:Income7$80k to &lt; 100k, p-value = 0.0087). To better understand what this interaction means, we can plot the predicted probabilities with an interaction plot.
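As a rough check on what the significant interaction term implies, the predicted probabilities for the $80k to < 100k income group can be worked out by hand. This is a minimal sketch in base R that assumes only the rounded coefficients printed in Table 7.9; the object names (b0, b_early, b_inc, b_int, p_not_early, p_early) are hypothetical, and the results are approximate because the inputs are rounded.

```r
# Rounded coefficients copied from Table 7.9 (log-odds scale)
b0 <- 2.32        # (Intercept): did not vote early, reference income group
b_early <- -0.81  # EarlyVote2020Yes
b_inc <- -3.42    # Income7$80k to < 100k
b_int <- 3.22     # EarlyVote2020Yes:Income7$80k to < 100k

# Did not vote early: intercept plus the income main effect
p_not_early <- plogis(b0 + b_inc)

# Voted early: add the early-voting main effect and the interaction term
p_early <- plogis(b0 + b_early + b_inc + b_int)

round(c(not_early = p_not_early, early = p_early), 2)
```

Under these assumptions, the predicted probability of voting for Biden in this income group rises from roughly 0.25 for those who did not vote early to roughly 0.79 for those who did, which is the kind of gap that is easier to see in a plot than in a coefficient table.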
Let’s first obtain the predicted probabilities for each possible combination of variables using the augment() function. log_biden_pred &lt;- log_biden_int %&gt;% augment(type.predict = &quot;response&quot;) %&gt;% mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)), .fitted = as.numeric(.fitted)) %&gt;% select(VoteBiden, EarlyVote2020, Income7, .fitted, .se.fit) The y-axis is the predicted probabilities, one of our x-variables is on the x-axis, and the other is represented by multiple lines. Figure 7.4 shows the interaction plot with early voting behavior on the x-axis and income represented by the lines. log_biden_pred %&gt;% filter(VoteBiden == 1) %&gt;% distinct() %&gt;% arrange(EarlyVote2020, Income7) %&gt;% ggplot(aes( x = EarlyVote2020, y = .fitted, group = Income7, color = Income7, linetype = Income7 )) + geom_line(linewidth = 1.1) + scale_color_manual(values = colorRampPalette(book_colors)(7)) + ylab(&quot;Predicted Probability of Voting for Biden&quot;) + labs(x = &quot;Voted Early&quot;, color = &quot;Income&quot;, linetype = &quot;Income&quot;) + coord_cartesian(ylim = c(0, 1)) + guides(fill = &quot;none&quot;) + theme_minimal() FIGURE 7.4: Interaction Plot of Early Voting and Income Predicting the Probability of Voting for Biden From Figure 7.4, we can see that people who have incomes in most groups (e.g., $40k to &lt;60k) have roughly the same probability of voting for Biden regardless of whether they voted early or not. However, those with income in the $100k to &lt; 125k group were more likely to vote for Biden if they voted early than if they did not vote early. Interactions in models can be difficult to understand from the coefficients alone. Using these interaction plots can help others understand the nuances of the results. 7.5 Exercises The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (HousingUnitType) and total energy expenditure (TOTALDOL)? 
First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common. Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer (U.S. Energy Information Administration 2023d). Each day in the year is summed up to indicate how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions. Continuing with our results from question 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures. Early voting expanded in 2020 (Sprunt 2020). Build a logistic model predicting early voting in 2020 (EarlyVote2020) using age (Age), education (Education), and party identification (PartyID). Include two-way interactions. Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican. References Bollen, Kenneth A., Paul P. Biemer, Alan F. Karr, Stephen Tueller, and Marcus E. Berzofsky. 2016. “Are Survey Weights Needed? A Review of Diagnostic Tests in Regression Analysis.” Annual Review of Statistics and Its Application 3 (1): 375–92. https://doi.org/10.1146/annurev-statistics-011516-012958. Gelman, Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (2): 153–64. 
https://doi.org/10.1214/088342306000000691.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables.
Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons.
McCullagh, Peter, and John Ashworth Nelder. 1989. “Binary Data.” In Generalized Linear Models, 98–148. Springer.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles.
Sprunt, Barbara. 2020. “93 Million and Counting: Americans Are Shattering Early Voting Records.” National Public Radio.
U.S. Energy Information Administration. 2023d. “Units and Calculators Explained: Degree Days.” https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php.

Use help(formula) or ?formula in R.
There is some debate about whether weights should be used in regression (Bollen et al. 2016; Gelman 2007). However, for the purposes of providing complete information on how to analyze complex survey data, this chapter includes weights.
Question: How often can you trust the federal government in Washington to do what is right?
"],["c08-communicating-results.html", "Chapter 8 Communication of results 8.1 Introduction 8.2 Describing results through text 8.3 Visualizing data", " Chapter 8 Communication of results

Prerequisites

For this chapter, load the following packages:

library(tidyverse)
library(survey)
library(srvyr)
library(srvyrexploR)
library(gt)
library(gtsummary)

We are using data from ANES as described in Chapter 4. As a reminder, here is the code to create the design object to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information).
targetpop &lt;- 231592693

anes_adjwgt &lt;- anes_2020 %&gt;%
  mutate(Weight = Weight / sum(Weight) * targetpop)

anes_des &lt;- anes_adjwgt %&gt;%
  as_survey_design(
    weights = Weight,
    strata = Stratum,
    ids = VarUnit,
    nest = TRUE
  )

8.1 Introduction

After finishing the analysis and modeling, we proceed to the important task of communicating the survey results. Our audience may range from seasoned researchers familiar with our survey data to newcomers encountering the information for the first time. We should aim to explain the methodology and analysis while presenting findings in an accessible way, and it is our responsibility to report information with care.

Before beginning any dissemination of results, consider questions such as:

How are we presenting results? Examples include a website, print, or other media. Based on the media type, we might limit or enhance the use of graphical representation.
What is the audience’s familiarity with the study and/or data? Audiences can range from the general public to data experts. If we anticipate limited knowledge about the study, we should provide detailed descriptions (we discuss recommendations later in the chapter).
What are we trying to communicate? It could be summary statistics, trends, patterns, or other insights. Tables may suit summary statistics, while plots are better at conveying trends and patterns.
Is the audience accustomed to interpreting plots? If not, include explanatory text to guide them on how to interpret the plots effectively.
What is the audience’s statistical knowledge? If the audience does not have a strong statistics background, provide text on standard errors, confidence intervals, and other estimate types to enhance understanding.

8.2 Describing results through text

As analysts, we often emphasize the data, and communicating results can sometimes be overlooked. To be effective communicators, we need to identify the appropriate information to share with our audience.
Chapters 2 and 3 provide insights into factors we need to consider during analysis, and they remain relevant when presenting results to others.

8.2.1 Methodology

If we are using existing data, methodologically sound surveys provide documentation about how the survey was fielded, the questionnaires, and other necessary information for analyses. For example, the survey’s methodology reports should include the population of interest, sampling procedures, response rates, questionnaire documentation, weighting, and a general overview of disclosure statements. Many American organizations follow the American Association for Public Opinion Research’s (AAPOR) Transparency Initiative. The AAPOR Transparency Initiative requires organizations to include specific details in their methodology, making it clear how we can and should analyze and interpret the results. Being transparent about these methods is vital for the scientific rigor of the field.

The details provided in Chapter 2 about the survey process should be shared with the audience when presenting the results. When using publicly available data, like the examples in this book, we can often link to the methodology report in our final output. We should also provide high-level information for the audience to quickly grasp the context around the findings. For example, we can mention when and where the study was conducted, the population’s age range, or other contextual details. This information helps the audience understand how generalizable the results are.

Providing this material is especially important when no methodology report is available for the analyzed data. For example, if we conducted a new survey for a specific purpose, we should document and present all the pertinent information during the analysis and reporting process. Adhering to the AAPOR Transparency Initiative guidelines is a reliable method to guarantee that all essential information is communicated to the audience.
8.2.2 Analysis

Along with the survey methodology and weight calculations, we should also share our approach to preparing, cleaning, and analyzing the data. For example, in Chapter 6, we compared education distributions from the ANES survey to the American Community Survey (ACS). To make the comparison, we had to collapse the education categories provided in the ANES data to match the ACS. The process for this particular example may seem straightforward (like combining Bachelor’s and Graduate Degrees into a single category), but there are multiple ways to deal with the data. Our choice is just one of many. We should document both the original ANES question and response options and the steps we took to match them with ACS data. This transparency helps clarify our analysis to our audience.

Missing data is another instance where we want to be unambiguous and upfront with our audience. In this book, numerous examples and exercises remove missing data, as this is often the easiest way to handle them. However, there are circumstances where missing data holds substantive importance, and excluding them could introduce bias (see Chapter 11). Being transparent about our handling of missing data is important to maintaining the integrity of our analysis and ensuring a comprehensive understanding of the results.

8.2.3 Results

While tables and graphs are commonly used to communicate results, there are instances where text can be more effective in sharing information. Narrative details, such as context around point estimates or model coefficients, can go a long way in improving our communication. We have several strategies to effectively convey the significance of the data to the audience through text.

First, we can highlight important data elements in a sentence using plain language. For example, if we were looking at election polling data conducted before an election, we could say:

As of [DATE], an estimated XX% of registered U.S.
voters say they will vote for [CANDIDATE NAME] for president in the [YEAR] general election.

This sentence provides key pieces of information in a straightforward way:

[DATE]: Given that polling data are time-specific, providing the date of reference lets the audience know when these data were valid.
Registered U.S. voters: This tells the audience who we surveyed, letting them know the population of interest.
XX%: This part provides the estimated percentage of people voting for a specific candidate for a specific office.
[YEAR] general election: As with the bullet above, adding this gives more context about the election type and year. The estimate would take on a different meaning if we changed it to a primary election instead of a general election.

We also included the word “estimated.” When presenting aggregate survey results, we have errors around each estimate. We want to convey this uncertainty rather than talk in absolutes. Words like “estimated,” “on average,” or “around” can help communicate this uncertainty to the audience. Instead of saying ‘XX%,’ we can also say ‘XX% (+/- Y%)’ to show the margin of error. Confidence intervals can also be incorporated into the text to assist readers.

Second, providing context and discussing the meaning behind a point estimate can help the audience glean some insight into why the data are important. For example, when comparing two values, it can be helpful to highlight if there are statistically significant differences and explain the impact and relevance of this information. This is where we should do our best to be mindful of biases and present the facts logically. Keep in mind that how we discuss these findings can greatly influence how the audience interprets them. If we include speculation, phrases like “the authors speculate” or “these findings may indicate” relay the uncertainty around the notion while still offering a plausible explanation.
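To illustrate the margin-of-error wording discussed above, here is a minimal sketch in base R. The helper describe_estimate() and the input values are hypothetical and not part of the book's code. The sketch also assumes a normal approximation for the critical value; survey functions may instead use a t-distribution based on the design degrees of freedom.

```r
# Hypothetical helper (not from the book): turn a proportion and its
# standard error into hedged, plain-language text with a margin of error.
describe_estimate <- function(p, se, conf_level = 0.95) {
  # Critical value from the normal distribution (a simplifying assumption)
  z <- qnorm(1 - (1 - conf_level) / 2)
  # Margin of error on the proportion scale
  moe <- z * se
  # Report both values as percentages with one decimal place
  sprintf("an estimated %.1f%% (+/- %.1f%%)", 100 * p, 100 * moe)
}

# Illustrative values only: a proportion of 0.52 with a standard error of 0.015
describe_estimate(p = 0.52, se = 0.015)
```

The returned string can be dropped into a sentence such as the polling example, which keeps the word "estimated" and the margin of error together so the uncertainty is not lost when the number is quoted.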
Additionally, we can present alternative viewpoints or competing discussion points to explain the uncertainty in the results.

8.3 Visualizing data

Although discussing key findings in the text is important, presenting large amounts of data in tables or visualizations is often more digestible for the audience. Effectively combining text, tables, and graphs can be powerful in communicating results. This section provides examples of using the {gt}, {gtsummary}, and {ggplot2} packages to enhance the dissemination of results (Iannone et al. 2023; Sjoberg et al. 2021; Wickham 2016).

8.3.1 Tables

Tables are a great way to provide a large amount of data when individual data points need to be examined. However, it is important to present tables in a reader-friendly format. Numbers should align, rows and columns should be easy to follow, and the table size should not compromise readability. Using key visualization techniques, we can create tables that are informative and nice to look at. Many packages create easy-to-read tables (e.g., {knitr} + {kableExtra}, {gt}, {gtsummary}, {DT}, {formattable}, {flextable}, {reactable}). We appreciate the flexibility, ability to use pipes (e.g., %&gt;%), and numerous extensions of the {gt} package. While we focus on {gt} here, we encourage learning about others as they may have additional helpful features. Please note, at this time, {gtsummary} needs additional features to be widely used for survey analysis, particularly due to its lack of ability to work with replicate designs. We provide one example using {gtsummary} and hope it evolves into a more comprehensive tool over time.

8.3.1.1 Transitioning {srvyr} output to a {gt} table

Let’s start by using some of the data we calculated earlier in this book.
In Chapter 6, we looked at data on trust in government with the proportions calculated below:

trust_gov &lt;- anes_des %&gt;%
  drop_na(TrustGovernment) %&gt;%
  group_by(TrustGovernment) %&gt;%
  summarize(trust_gov_p = survey_prop())

trust_gov

## # A tibble: 5 × 3
##   TrustGovernment     trust_gov_p trust_gov_p_se
##   &lt;fct&gt;                     &lt;dbl&gt;          &lt;dbl&gt;
## 1 Always                   0.0155        0.00204
## 2 Most of the time         0.132         0.00553
## 3 About half the time      0.309         0.00829
## 4 Some of the time         0.434         0.00855
## 5 Never                    0.110         0.00566

The default output generated by R may work for initial viewing inside our IDE or when creating basic output in an R Markdown or Quarto document. However, when presenting these results in other publications, such as the print version of this book or with other formal dissemination modes, modifying the display can improve our reader’s experience.

Looking at the output from trust_gov, a couple of improvements stand out: (1) switching to percentages instead of proportions and (2) removing the variable names as column headers. The {gt} package is a good tool for implementing better labeling and creating publishable tables. Let’s walk through some code as we make a few changes to improve the table’s usefulness.

First, we initiate the formatted table with the gt() function on the trust_gov tibble previously created. Next, we use the rowname_col argument of gt() to designate the TrustGovernment column as the label for each row (called the table “stub”). We apply the cols_label() function to create informative column labels instead of variable names and then the tab_spanner() function to add a label across multiple columns. In this case, we label all columns except the stub with “Trust in Government, 2020”. We then format the proportions into percentages with the fmt_percent() function and reduce the number of decimals shown to one with decimals = 1. Finally, the tab_caption() function adds a table title for the HTML version of the book.
We can use the caption for cross-referencing in R Markdown, Quarto, and bookdown, as well as adding it to the list of tables in the book. These changes are all seen in Table 8.1.

trust_gov_gt &lt;- trust_gov %&gt;%
  gt(rowname_col = &quot;TrustGovernment&quot;) %&gt;%
  cols_label(trust_gov_p = &quot;%&quot;, trust_gov_p_se = &quot;s.e. (%)&quot;) %&gt;%
  tab_spanner(
    label = &quot;Trust in Government, 2020&quot;,
    columns = c(trust_gov_p, trust_gov_p_se)
  ) %&gt;%
  fmt_percent(decimals = 1)

trust_gov_gt %&gt;%
  tab_caption(&quot;Example of gt table with trust in government estimate&quot;)

TABLE 8.1: Example of gt table with trust in government estimate

                     Trust in Government, 2020
                     %        s.e. (%)
Always               1.6%     0.2%
Most of the time     13.2%    0.6%
About half the time  30.9%    0.8%
Some of the time     43.4%    0.9%
Never                11.0%    0.6%

We can add a few more enhancements, such as a title (which is different from a caption[26]), a data source note, and a footnote with the question information, using the functions tab_header(), tab_source_note(), and tab_footnote(). If having the percentage sign in both the header and the cells seems redundant, we can opt for fmt_number() instead of fmt_percent() and scale the number by 100 with scale_by = 100. The resulting table is displayed in Table 8.2.
trust_gov_gt2 &lt;- trust_gov_gt %&gt;%
  tab_header(&quot;American voters&#39; trust in the federal government, 2020&quot;) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  ) %&gt;%
  fmt_number(scale_by = 100, decimals = 1)

trust_gov_gt2

TABLE 8.2: Example of gt table with trust in government estimates and additional context

American voters' trust in the federal government, 2020

                     Trust in Government, 2020
                     %        s.e. (%)
Always               1.6      0.2
Most of the time     13.2     0.6
About half the time  30.9     0.8
Some of the time     43.4     0.9
Never                11.0     0.6

American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

Expanding tables using {gtsummary}

The {gtsummary} package simultaneously summarizes data and creates publication-ready tables. Initially designed for clinical trial data, it has been extended to include survey analysis in certain capacities. At this time, it is only compatible with survey objects using Taylor series linearization and not replicate methods.
While it offers a restricted set of summary statistics, the following are available for categorical variables:

{n}: frequency
{N}: denominator, or respondent population
{p}: proportion (stylized as a percentage by default)
{p.std.error}: standard error of the sample proportion
{deff}: design effect of the sample proportion
{n_unweighted}: unweighted frequency
{N_unweighted}: unweighted denominator
{p_unweighted}: unweighted formatted proportion (stylized as a percentage by default)

The following summary statistics are available for continuous variables:

{median}: median
{mean}: mean
{mean.std.error}: standard error of the sample mean
{deff}: design effect of the sample mean
{sd}: standard deviation
{var}: variance
{min}: minimum
{max}: maximum
{p#}: any integer percentile, where # is an integer from 0 to 100
{sum}: sum

In the following example, we build a table using {gtsummary}, similar to the table in the {gt} example. The main function we use is tbl_svysummary(). In this function, we include the variables we want to analyze in the include argument and define the statistics we want to display in the statistic argument. To specify the statistics, we apply the syntax from the {glue} package, where we enclose the variables we want to insert within curly brackets. We must specify the desired statistics using the names listed above. For example, to specify that we want the proportion followed by the standard error of the proportion in parentheses, we use {p} ({p.std.error}). Table 8.3 displays the resulting table.
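To see how a {glue} template is filled in, the following minimal sketch calls glue() directly. The two values are made up for illustration and simply stand in for the statistics that tbl_svysummary() computes; in practice, {gtsummary} also formats each statistic before inserting it, which this sketch does not attempt to reproduce.

```r
library(glue)

# Hypothetical values standing in for statistics computed by tbl_svysummary()
p <- 13.2
p.std.error <- 0.6

# Each name inside curly brackets is replaced by the value of that variable
cell <- glue("{p} ({p.std.error})")
cell
```

Everything outside the curly brackets, such as the space and the parentheses, is kept as literal text, which is how the template controls the layout of each table cell.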
anes_des_gtsum &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;)
  )

anes_des_gtsum

TABLE 8.3: Example of {gtsummary} table with trust in government estimates

Characteristic    N = 231,034,125¹
PRE: How often trust government in Washington to do what is right [revised]
    Always    1.6 (0.00)
    Most of the time    13 (0.01)
    About half the time    31 (0.01)
    Some of the time    43 (0.01)
    Never    11 (0.01)
    Unknown    673,773
¹ % (SE(%))

The default table (shown in Table 8.3) includes the weighted number of missing (or Unknown) records. The standard error is reported as a proportion, while the proportion is styled as a percentage. In the next step, we remove the Unknown category by setting the missing argument to “no” and format the standard error as a percentage using the digits argument. To improve the table for publication, we provide a more polished label for the “TrustGovernment” variable using the label argument. The resulting table is displayed in Table 8.4.
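As a rough illustration of the formatting step, the sketch below applies style_percent() from {gtsummary} to a few standard errors on the proportion scale. The input values are hypothetical and are not taken from the ANES data; the point is only that the function rescales proportions to percentages and returns them as text.

```r
library(gtsummary)

# Hypothetical standard errors on the proportion scale (illustrative values only)
se_prop <- c(0.002, 0.006, 0.009)

# style_percent() multiplies by 100 and returns formatted character strings
se_pct <- style_percent(se_prop)
se_pct
```

Passing this function to the digits argument asks tbl_svysummary() to format the statistics for that variable in the same way.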
anes_des_gtsum2 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  )

anes_des_gtsum2

TABLE 8.4: Example of {gtsummary} table with trust in government estimates with labeling and digits options

Characteristic    N = 231,034,125¹
Trust in Government, 2020
    Always    1.6 (0.2)
    Most of the time    13 (0.6)
    About half the time    31 (0.8)
    Some of the time    43 (0.9)
    Never    11 (0.6)
¹ % (SE(%))

Table 8.4 is closer to our ideal output, but we still want to make a few changes. To exclude the term “Characteristic” and the estimated population size (N), we can modify the header using the modify_header() function to update the label. Further adjustments can be made based on personal preferences, organizational guidelines, or other style guides. If we prefer having the standard error in the header, similar to the {gt} table, instead of in the footnote (the {gtsummary} default), we can make these changes by specifying stat_0 in the modify_header() function. Additionally, using modify_footnote() with update = everything() ~ NA removes the standard error from the footnote. After transforming the object into a gt table using as_gt(), we can add footnotes and a title using the same methods explained in Section 8.3.1.1. This updated table is displayed in Table 8.5.
anes_des_gtsum3 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  ) %&gt;%
  modify_footnote(update = everything() ~ NA) %&gt;%
  modify_header(label = &quot; &quot;, stat_0 = &quot;% (s.e.)&quot;) %&gt;%
  as_gt() %&gt;%
  tab_header(&quot;American voter&#39;s trust in the federal government, 2020&quot;) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  )

anes_des_gtsum3

TABLE 8.5: Example of {gtsummary} table with trust in government estimates with more labeling options and context

American voter's trust in the federal government, 2020
    % (s.e.)
Trust in Government, 2020
    Always    1.6 (0.2)
    Most of the time    13 (0.6)
    About half the time    31 (0.8)
    Some of the time    43 (0.9)
    Never    11 (0.6)
American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

We can also include summaries of more than one variable in the table. These variables can be either categorical or continuous. In the following code and Table 8.6, we add the mean age by updating the include, statistic, and digits arguments.
anes_des_gtsum4 &lt;- anes_des %&gt;%
  tbl_svysummary(
    include = c(TrustGovernment, Age),
    statistic = list(
      all_categorical() ~ &quot;{p} ({p.std.error})&quot;,
      all_continuous() ~ &quot;{mean} ({mean.std.error})&quot;
    ),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent, Age ~ c(1, 2)),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;)
  ) %&gt;%
  modify_footnote(update = everything() ~ NA) %&gt;%
  modify_header(label = &quot; &quot;, stat_0 = &quot;% (s.e.)&quot;) %&gt;%
  as_gt() %&gt;%
  tab_header(
    &quot;American voter&#39;s trust in the federal government, 2020&quot;
  ) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  ) %&gt;%
  tab_caption(&quot;Example of gtsummary table with trust in government estimates and average age&quot;)

anes_des_gtsum4

TABLE 8.6: Example of {gtsummary} table with trust in government estimates and average age
American voter's trust in the federal government, 2020

                                  % (s.e.)
Trust in Government, 2020
    Always                        1.6 (0.2)
    Most of the time              13 (0.6)
    About half the time           31 (0.8)
    Some of the time              43 (0.9)
    Never                         11 (0.6)
PRE: SUMMARY: Respondent age      47.3 (0.36)

American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

With {gtsummary}, we can also calculate statistics by different groups. 
Let’s modify the previous example (displayed in Table 8.6) to break the estimates down by whether a respondent voted for president in 2020. We update the by argument and refine the header. The resulting table is displayed in Table 8.7.

anes_des_gtsum5 &lt;- anes_des %&gt;%
  drop_na(VotedPres2020) %&gt;%
  tbl_svysummary(
    include = TrustGovernment,
    statistic = list(all_categorical() ~ &quot;{p} ({p.std.error})&quot;),
    missing = &quot;no&quot;,
    digits = list(TrustGovernment ~ style_percent),
    label = list(TrustGovernment ~ &quot;Trust in Government, 2020&quot;),
    by = VotedPres2020
  ) %&gt;%
  modify_footnote(update = everything() ~ NA) %&gt;%
  modify_header(label = &quot; &quot;, stat_1 = &quot;Voted&quot;, stat_2 = &quot;Didn&#39;t vote&quot;) %&gt;%
  modify_spanning_header(all_stat_cols() ~ &quot;% (s.e.)&quot;) %&gt;%
  as_gt() %&gt;%
  tab_header(
    &quot;American voter&#39;s trust in the federal government by whether they voted in the 2020 presidential election&quot;
  ) %&gt;%
  tab_source_note(&quot;American National Election Studies, 2020&quot;) %&gt;%
  tab_footnote(
    &quot;Question text: How often can you trust the federal government in Washington to do what is right?&quot;
  )

anes_des_gtsum5

TABLE 8.7: Example of {gtsummary} table with trust in government estimates by voting status
American voter's trust in the federal government by whether they voted in the 2020 presidential election

                                         % (s.e.)
                                  Voted          Didn’t vote
Trust in Government, 2020
    Always                        1.1 (0.2)      0.9 (0.9)
    Most of the time              13 (0.6)       19 (5.3)
    About half the time           32 (0.8)       30 (8.6)
    Some of the time              45 (0.8)       45 (8.2)
    Never                         9.1 (0.7)      5.2 (2.2)

American National Election Studies, 2020
Question text: How often can you trust the federal government in Washington to do what is right?

8.3.2 Charts and plots

Survey analysis can yield an abundance of printed summary statistics and models. Even with the most careful analysis, interpreting the results can be overwhelming. This is where charts and plots play a key role in our work. By transforming complex data into a visual representation, we can recognize patterns, relationships, and trends with greater ease.

R has numerous packages for creating compelling and insightful charts. In this section, we focus on {ggplot2}, a member of the {tidyverse} collection of packages. Known for its power and flexibility, {ggplot2} is an invaluable tool for creating a wide range of data visualizations (Wickham 2016).

The {ggplot2} package follows the “grammar of graphics,” a framework that incrementally adds layers of chart components. This approach allows us to customize visual elements such as scales, colors, labels, and annotations to enhance the clarity of our results. After creating the survey design object, we can modify it to include additional outcomes and calculate estimates for our desired data points. Below, we create a binary variable TrustGovernmentUsually, which is TRUE when TrustGovernment is “Always” or “Most of the time” and FALSE otherwise. Then, we calculate the percentage of people who usually trust the government based on their vote in the 2020 presidential election (VotedPres2020_selection). We remove the cases where people did not vote or did not indicate their choice. 
anes_des_der &lt;- anes_des %&gt;%
  mutate(TrustGovernmentUsually = case_when(
    is.na(TrustGovernment) ~ NA,
    TRUE ~ TrustGovernment %in% c(&quot;Always&quot;, &quot;Most of the time&quot;)
  )) %&gt;%
  drop_na(VotedPres2020_selection) %&gt;%
  group_by(VotedPres2020_selection) %&gt;%
  summarize(
    pct_trust = survey_mean(
      TrustGovernmentUsually,
      na.rm = TRUE,
      proportion = TRUE,
      vartype = &quot;ci&quot;
    ),
    .groups = &quot;drop&quot;
  )

anes_des_der

## # A tibble: 3 × 4
##   VotedPres2020_selection pct_trust pct_trust_low pct_trust_upp
##   &lt;fct&gt;                       &lt;dbl&gt;         &lt;dbl&gt;         &lt;dbl&gt;
## 1 Biden                      0.123         0.109          0.140
## 2 Trump                      0.178         0.161          0.198
## 3 Other                      0.0681        0.0290         0.152

Now, we can begin creating our chart with {ggplot2}. First, we set up our plot with ggplot(). Next, we define the data points to be displayed using aesthetics, or aes. Aesthetics represent the visual properties of the objects in the plot. In the following example, we create a bar chart of the percentage of people who usually trust the government by who they voted for in the 2020 election. To do this, we put who they voted for on the x-axis (VotedPres2020_selection) and the percentage who usually trust the government on the y-axis (pct_trust). We specify these variables in ggplot() and then indicate we want a bar chart with geom_bar(). The resulting plot is displayed in Figure 8.1.

p &lt;- anes_des_der %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust)) +
  geom_bar(stat = &quot;identity&quot;)

p

FIGURE 8.1: Bar chart of trust in government by chosen 2020 presidential candidate

This is a great starting point: it appears that a higher percentage of people state they usually trust the government among those who voted for Trump compared to those who voted for Biden or other candidates. Now, what if we want to introduce color to better differentiate the three groups? We can add fill under aesthetics, indicating that we want to use distinct colors for each value of VotedPres2020_selection. 
In this instance, Biden and Trump are displayed in different colors (shades in the print version of this book) in Figure 8.2.

pcolor &lt;- anes_des_der %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) +
  geom_bar(stat = &quot;identity&quot;)

pcolor

FIGURE 8.2: Bar chart of trust in government by chosen 2020 presidential candidate with colors

To follow good statistical practice, we should also show the variability of our estimates in the plot. We can add another geom, geom_errorbar(), to display the confidence intervals on top of our existing geom_bar() layer. We add the layer using a plus sign +. The resulting graph is displayed in Figure 8.3.

pcol_error &lt;- anes_des_der %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) +
  geom_bar(stat = &quot;identity&quot;) +
  geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp), width = .2)

pcol_error

FIGURE 8.3: Bar chart of trust in government by chosen 2020 presidential candidate with colors and error bars

We can continue adding to our plot until we achieve our desired look. For example, we can remove the color legend with guides(fill = \"none\"), as it doesn’t contribute meaningful information. We can also set specific colors for fill using scale_fill_manual(). Inside this function, we provide a vector of values corresponding to the colors in our plot. These values are hexadecimal (hex) color codes, denoted by a leading pound sign # followed by six characters (digits 0-9 or letters a-f). The hex code #0b3954 used below is dark blue. There are many tools online that help pick hex codes, such as htmlcolorcodes.com. Additionally, Figure 8.4 incorporates better labels for the x and y axes (xlab(), ylab()), a title (labs(title=)), and a footnote with the data source (labs(caption=)). 
pfull &lt;- anes_des_der %&gt;%
  ggplot(aes(x = VotedPres2020_selection, y = pct_trust, fill = VotedPres2020_selection)) +
  geom_bar(stat = &quot;identity&quot;) +
  geom_errorbar(aes(ymin = pct_trust_low, ymax = pct_trust_upp), width = .2) +
  scale_fill_manual(values = c(&quot;#0b3954&quot;, &quot;#bfd7ea&quot;, &quot;#8d6b94&quot;)) +
  xlab(&quot;Election choice (2020)&quot;) +
  ylab(&quot;Usually trust the government&quot;) +
  scale_y_continuous(labels = scales::percent) +
  guides(fill = &quot;none&quot;) +
  labs(
    title = &quot;Percent of voters who usually trust the government by chosen 2020 presidential candidate&quot;,
    caption = &quot;Source: American National Election Studies, 2020&quot;
  )

pfull

FIGURE 8.4: Bar chart of trust in government by chosen 2020 presidential candidate with colors, labels, error bars, and title

What we’ve explored in this section are just the foundational aspects of {ggplot2}, and the capabilities of this package extend far beyond what we’ve covered. Advanced features such as annotation, faceting, and theming allow for more sophisticated and customized visualizations. The ggplot2 book by Wickham (2016) is a comprehensive guide to learning more about this powerful tool.

References

Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables.
Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. “Reproducible Summary Tables with the gtsummary Package.” The R Journal 13: 570–80. https://doi.org/10.32614/RJ-2021-053.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

The function tab_caption() is intended for use in R Markdown, Quarto, or bookdown, where it enables cross-referencing. In contrast, tab_header() adds a title or subtitle to a table in any context, including Shiny or GitHub Flavored Markdown, but without cross-referencing. A header is placed within the table object itself, whereas a caption is placed alongside the table according to the output type. "],["c09-reprex-data.html", "Chapter 9 Reproducible research 9.1 Introduction 9.2 Project-based workflows 9.3 Functions and packages 9.4 Version control with Git 9.5 Package management with {renv} 9.6 R environments with Docker 9.7 Workflow management with {targets} 9.8 Documentation with Quarto and R Markdown 9.9 Other tips for reproducibility 9.10 Additional resources", " Chapter 9 Reproducible research 9.1 Introduction Reproducing results is a crucial aspect of any research. First, reproducibility serves as a form of quality assurance. If we pass an analysis project to another person, they should be able to run the entire project from start to finish and obtain the same results. They can critically assess the methodology and code while detecting potential errors. Another goal of reproducibility is enabling the verification of our analysis. When someone else is able to check our results, it ensures the integrity of the analyses by determining that the conclusions are not dependent on a particular person running the code or workflow on a particular day or in a particular environment. Not only is reproducibility a key component in ethical and accurate research, but it is also a requirement for many scientific journals. For example, the Journal of Survey Statistics and Methodology (JSSAM) and Public Opinion Quarterly (POQ) require authors to make code, data, and methodology transparent and accessible to other researchers who wish to verify or build on existing work. 
Reproducible research requires that the key components of analysis are available, discoverable, documented, and shared with others. The four main components that we should consider are:

Code: source code used for data cleaning, analysis, modeling, and reporting
Data: raw data used in the workflow, or if data are sensitive or proprietary, as much data as possible that would allow others to run our workflow, or details on how to access the data (e.g., access to a restricted use file (RUF))
Environment: environment of the project, including the R version, packages, operating system, and other dependencies used in the analysis
Methodology: survey and analysis methodology, including rationale behind sample, questionnaire, and analysis decisions, interpretations, and assumptions

In Chapter 8, we briefly mention how each of these is important to include in the methodology report and when communicating the findings of a study. However, to be transparent and effective researchers, we need to ensure we not only discuss these through text but also provide files and additional information when requested.

Often, when starting a project, we may be eager to dive into the data and make decisions as we go without full documentation. This can be challenging if we need to go back and make changes or even understand what we did a few months ago. It benefits other analysts and potentially our future selves to document everything from the start. The good news is that many tools, practices, and project management techniques make survey analysis projects easy to reproduce. For best results, we should decide which techniques and tools to use before starting a project (or very early on). This chapter covers some of our suggestions for tools and techniques we can use in projects. This list is not comprehensive but aims to provide a starting point for those looking to create a reproducible workflow. 
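As a small illustration of the Environment component listed above, the following base R sketch records the R version, platform, operating system, and attached packages in a plain list. The object name env_record is a hypothetical choice for this example, and this is only one possible way to take such a snapshot; dedicated tools such as {renv} and Docker, covered later in this chapter, manage the environment more thoroughly.

```r
# Hypothetical example object: a snapshot of the analysis environment (base R only)
info <- sessionInfo()

env_record <- list(
  r_version = R.version.string,         # full R version label
  platform  = info$platform,            # platform string reported by sessionInfo()
  os        = Sys.info()[['sysname']],  # operating system name
  attached  = names(info$otherPkgs)     # attached non-base packages (NULL if none)
)

# Print the snapshot so it can be saved alongside the project files
str(env_record)
```

Saving a record like this next to the code gives a reader a first hint about the setup in which the results were produced, although it does not by itself restore that setup.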
9.2 Project-based workflows

We recommend a project-based workflow for analysis projects as described by Wickham, Çetinkaya-Rundel, and Grolemund (2023). A project-based workflow maintains a “source of truth” for our analyses. It helps with file system discipline by putting everything related to a project in a designated folder. Since all associated files are in a single location, they are easy to find and organize. When we reopen the project, we can recreate the environment in which we originally ran the code to reproduce our results.

The RStudio IDE has built-in support for projects. When we create a project in RStudio, it creates an .Rproj file that stores settings specific to that project. Once we have created a project, we can create folders that help us organize our workflow. For example, a project directory could look like this:

| anes_analysis/
  | anes_analysis.Rproj
  | README.md
  | codebooks
    | codebook2020.pdf
    | codebook2016.pdf
  | rawdata
    | anes2020_raw.csv
    | anes2016_raw.csv
  | scripts
    | data-prep.R
  | data
    | anes2020_clean.csv
    | anes2016_clean.csv
  | report
    | anes_report.Rmd
    | anes_report.html
    | anes_report.pdf

In a project-based workflow, all paths are relative and, by default, relative to the folder the .Rproj file is located in. By using relative paths, others can open and run our files even if their directory configuration differs from ours (e.g., Mac and Windows users have different directory path structures). The {here} package enables easy file referencing, and we can start by using the here::here() function to build the path for loading or saving data (Müller 2020). Below, we ask R to read the CSV file anes2020_clean.csv in the project directory’s data folder:

anes &lt;- read_csv(here::here(&quot;data&quot;, &quot;anes2020_clean.csv&quot;))

The combination of projects and the {here} package keeps all associated files organized. This workflow makes it more likely that our analyses can be reproduced by us or our colleagues. 
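The idea of building paths from project-relative pieces can be sketched in base R. The helper project_path() below is a hypothetical stand-in written for this illustration, not part of {here}; the real here::here() finds the project root on its own, whereas this sketch takes the root as an argument. The folder and file names are made up, and the project lives in a temporary directory so the example runs anywhere.

```r
# Hypothetical stand-in for here::here(): joins path components onto a project root.
# The real {here} package locates the root automatically; here it is passed explicitly.
project_path <- function(..., root = getwd()) {
  file.path(root, ...)
}

# Simulate a project folder in a temporary directory (illustrative only)
project_root <- file.path(tempdir(), 'anes_analysis')
dir.create(file.path(project_root, 'data'), recursive = TRUE, showWarnings = FALSE)

# Write a small made-up file, then read it back using only project-relative pieces
toy <- data.frame(id = 1:3, weight = c(1.2, 0.8, 1.0))
csv_path <- project_path('data', 'toy_clean.csv', root = project_root)
write.csv(toy, csv_path, row.names = FALSE)
toy_read <- read.csv(csv_path)
```

Because the code names only the folder and file inside the project, the same script works for anyone who has a copy of the project folder, wherever that folder sits on their machine.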
9.3 Functions and packages

We may find ourselves repeating the same code in a script, and the chance of errors increases whenever we copy and paste our code. By creating a function, we can create a consistent set of commands that reduce the likelihood of mistakes. Functions also organize our code, improve the code readability, and allow others to execute the same commands. For example, in Chapter 13, we create a function to run sequences of rename(), filter(), group_by(), and summarize() statements across different variables. Creating functions helps us avoid overlooking necessary steps.

A package is made up of a collection of functions. If we find ourselves sharing functions with others to replicate the same series of commands in a separate project, creating a package can be a useful tool for sharing the code along with data and documentation.

9.4 Version control with Git

Often, a survey analysis project produces a lot of code. Keeping track of the latest version can become challenging as files evolve throughout a project. If a team of analysts is working on the same script, someone may use an outdated version, resulting in incorrect results or redundant work.

Version control systems like Git can help alleviate these pains. Git is a system that tracks changes in files. We can use Git to follow code evolution and manage asynchronous work. With Git, it is easy to see any changes made in a script, revert changes, and resolve differences between code versions (called conflicts).

Services such as GitHub or GitLab provide hosting and sharing of files as well as version control with Git. For example, we can visit the GitHub repository for this book and see the files that build the book, when they were committed to the repository, and the history of modifications over time.

In addition to code scripts, platforms like GitHub can store data and documentation. They provide a way to maintain a history of data modifications through versioning and timestamps. 
By saving the data and documentation alongside the code, it becomes easier for others to refer to and access everything they need in one place. Using version control in analysis projects makes collaboration and maintenance more manageable. To connect Git with R, we recommend referencing the book Happy Git and GitHub for the useR (Bryan 2023). 9.5 Package management with {renv} Ensuring reproducibility involves not only using version control of code but also managing the versions of packages. If two people run the same code but use different package versions, the results might differ because of changes to those packages. For example, this book currently uses a version of the {srvyr} package from GitHub and not from CRAN. This is because the version of {srvyr} on CRAN has some bugs (errors) that result in incorrect calculations. The version on GitHub has corrected these errors, so we have asked readers to install the GitHub version to obtain the same results. One way to handle different package versions is with the {renv} package. This package allows researchers to set the versions for each package used and manage package dependencies. Specifically, {renv} creates isolated, project-specific environments that record the packages and their versions used in the code. When initiated by a new user, {renv} checks whether the installed packages are consistent with the recorded version for the project. If not, it installs the appropriate versions so that others can replicate the project’s environment to rerun the code and obtain consistent results (Ushey and Wickham 2023). 9.6 R environments with Docker Just as different versions of packages can introduce discrepancies or compatibility issues, the version of R can also prevent reproducibility. Tools such as Docker can help with this potential issue by creating isolated environments that define the version of R being used, along with other dependencies and configurations. The entire environment is bundled in a container. 
The container, defined by a Dockerfile, can be shared so anybody, regardless of their local setup, can run the R code in the same environment. 9.7 Workflow management with {targets} With complex studies involving multiple code files and dependencies, it is important to ensure each step is executed in the intended sequence. We can do this manually, e.g., by numbering files to indicate the order or providing detailed documentation on the order. Alternatively, we can automate the process so the code flows sequentially. Making sure that the code runs in the correct order helps ensure that the research is reproducible. Anyone should be able to pick up the set of scripts and get the same results by following the workflow. The {targets} package is growing as a popular workflow manager that documents, automates, and executes complex data workflows with multiple steps and dependencies. With this package, we first define the order of execution for our code, and then it consistently executes the code in that order each time it is run. One beneficial feature of {targets} is that if code changes later in the workflow, only the affected code and its downstream targets (i.e., the subsequent code files) are re-executed when we change a script. The {targets} package also provides interactive progress monitoring and reporting, allowing us to track the status and progress of our analysis pipeline (Landau 2021). 9.8 Documentation with Quarto and R Markdown Tools like Quarto and R Markdown aid in reproducibility by creating documents that weave together code, text, and results. We can present analysis results alongside the report’s narrative, so there’s no need to copy and paste code output into the final documentation. By eliminating manual steps, we can reduce the chances of errors in the final output. Quarto and R Markdown documents also allow users to re-execute the underlying code when needed. Another analyst can see the steps we took, follow the scripts, and recreate the report. 
We can include details about our work in one place thanks to the combination of text and code, making our work transparent and easier to verify (Xie, Dervieux, and Riederer 2020). 9.8.1 Parameterization Another useful feature of Quarto and R Markdown is the ability to reduce repetitive code by parameterizing the files. Parameters can control various aspects of the analysis, such as dates, geography, or other analysis variables. We can define and modify these parameters to explore different scenarios or inputs. For example, suppose we start by creating a document that provides survey analysis results for North Carolina but then later decide we want to look at another state. In that case, we can define a state parameter and rerun the same analysis for a state like Washington without having to edit the code throughout the document. Parameters can be defined in the header or code chunks of our Quarto or R Markdown documents and easily modified and documented. By avoiding manual edits to code throughout the script, we reduce the errors that may occur and offer a flexible way for others to replicate the analysis and explore variations. 9.9 Other tips for reproducibility 9.9.1 Random number seeds Some tasks in survey analysis require randomness, such as imputation, model training, or creating random samples. By default, the random numbers generated by R change each time we rerun the code, making it difficult to reproduce the same results. By “setting the seed,” we can control the randomness and ensure that the random numbers remain consistent whenever we rerun the code. Others can use the same seed value to reproduce our random numbers and achieve the same results. In R, we can use the set.seed() function to control the randomness in our code. We set a seed value by providing an integer in the function argument. The following code chunk sets a seed using 999, then runs a random number function (runif()) to get five random numbers from a uniform distribution. 
set.seed(999) runif(5) ## [1] 0.38907 0.58306 0.09467 0.85263 0.78675 Since the seed is set to 999, rerunning set.seed(999) followed by runif(5) always produces the same output: ## [1] 0.38907 0.58306 0.09467 0.85263 0.78675 The choice of the seed number is up to the analyst. For example, this could be the date (20240102) or time of day (1056) when the analysis was first conducted, a phone number (8675309), or the first few numbers that come to mind (369.) As long as the seed is set for a given analysis, the actual number is up to the analyst to decide. It is important to note that set.seed() should be used before random number generation. Run it once per program, and the seed is applied to the entire script. We recommend setting the seed at the beginning of a script, where libraries are loaded. 9.9.2 Descriptive names and labels Using descriptive variable names or labeling data can also assist with reproducible research. For example, in the ANES data, the variable names in the raw data all start with V20 and are a string of numbers. To make things easier to reproduce in this book, we opted to change the variable names to be more descriptive of what they contained (e.g., Age.) This can also be done with the data values themselves. One way to accomplish this is by creating factors for categorical data, which can ensure that we know that a value of 1 really means Female, for example. There are other ways of handling this, such as attaching labels to the data instead of recoding variables to be descriptive (see Chapter 11.) As with random number seeds, the exact method is up to the analyst, but providing this information can help ensure our research is reproducible. 9.10 Additional resources We can promote accuracy and verification of results by making our analysis reproducible. There are various tools and guides available to help achieve reproducibility in analysis work, a few of which were described in this chapter. 
Here are additional resources to explore: R for Data Science chapter on project-based workflows Building reproducible analytical pipelines with R Posit Solutions Site page on reproducible environments References Bryan, Jenny. 2023. Happy Git and GitHub for the useR. https://happygitwithr.com/. Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959. Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. Ushey, Kevin, and Hadley Wickham. 2023. renv: Project Environments. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. https://r4ds.hadley.nz/; O’Reilly Media. Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook. "],["c10-sample-designs-replicate-weights.html", "Chapter 10 Sample designs and replicate weights 10.1 Introduction 10.2 Common sampling designs 10.3 Combining sampling methods 10.4 Replicate weights 10.5 Exercises", " Chapter 10 Sample designs and replicate weights Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) To help explain the different types of sample designs, this chapter uses the api and scd data that are included in the {survey} package (Lumley 2010): data(api) data(scd) This chapter uses data from the Residential Energy Consumption Survey (RECS) - both 2015 and 2020, so we load the RECS data from the {srvyrexploR} package using their object names recs_2015 and recs_2020, respectively (Zimmer, Powell, and Velásquez 2024). 
10.1 Introduction The primary reason for using packages like {survey} and {srvyr} is to incorporate the sampling design or replicate weights into estimates (Freedman Ellis and Schneider 2023; Lumley 2010). By incorporating the sampling design or replicate weights, precision estimates (e.g., standard errors and confidence intervals) are appropriately calculated. In this chapter, we introduce common sampling designs and common types of replicate weights, the mathematical methods for calculating estimates and standard errors for a given sampling design, and the R syntax to specify the sampling design or replicate weights. While we show the math behind the estimates, the functions in these packages handle the calculation. To deeply understand the math and the derivation, refer to Penn State (2019), Särndal, Swensson, and Wretman (2003), Wolter (2007), or Fuller (2011) (these are listed in order of increasing statistical rigor.) The general process for estimation in the {srvyr} package is to: Create a tbl_svy object (a survey object) using: as_survey_design() or as_survey_rep() Subset data (if needed) using filter() (subpopulations) Specify domains of analysis using group_by() Within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more This chapter includes details on the first step - creating the survey object. Once this survey object is created, it can be used in the other steps (detailed in chapters 5 through 7) to account for the complex survey design. 10.2 Common sampling designs A sampling design is the method used to draw a sample. Both logistical and statistical elements are considered when developing a sampling design. When specifying a sampling design in R, we specify the levels of sampling along with the weights. The weight for each record is constructed so that the particular record represents that many units in the population. 
For example, in a survey of 6th-grade students in the United States, the weight associated with each responding student reflects how many 6th-grade students across the country that record represents. Generally, the weights represent the inverse of the probability of selection, such that the sum of the weights corresponds to the total population size, although some studies may have the sum of the weights equal to the number of respondent records. Some common terms across the designs are: sample size, generally denoted as \\(n\\), is the number of units selected to be sampled population size, generally denoted as \\(N\\), is the number of units in the population of interest sampling frame, the list of units from which the sample is drawn (see Chapter 2 for more information) 10.2.1 Simple random sample without replacement The simple random sample (SRS) without replacement is a sampling design in which a fixed sample size is selected from a sampling frame, and every possible sample of that size has an equal probability of selection. Without replacement refers to the fact that once a sampling unit has been selected, it is removed from the sampling frame and cannot be selected again. Requirements: The sampling frame must include the entire population. Advantages: SRS requires no information about the units apart from contact information. Disadvantages: The sampling frame may not be available for the entire population. Example: Randomly select students in a university from a roster provided by the registrar’s office. The math The estimate for the population mean of variable \\(y\\) is: \\[\\bar{y}=\\frac{1}{n}\\sum_{i=1}^n y_i\\] where \\(\\bar{y}\\) represents the sample mean, \\(n\\) is the total number of respondents (or observations), and \\(y_i\\) is each individual value of \\(y\\). 
The estimate of the standard error of the mean is: \\[se(\\bar{y})=\\sqrt{\\frac{s^2}{n}\\left( 1-\\frac{n}{N} \\right)}\\] where \\[s^2=\\frac{1}{n-1}\\sum_{i=1}^n\\left(y_i-\\bar{y}\\right)^2.\\] and \\(N\\) is the population size. This standard error estimate might look very similar to equations in other statistical applications except for the part on the right side of the equation: \\(1-\\frac{n}{N}\\). This is called the finite population correction (FPC) factor. If the size of the frame, \\(N\\), is very large in comparison to the sample, the FPC is negligible, so it is often ignored. A common guideline is if the sample is less than 10% of the population, the FPC is negligible. To estimate proportions, we define \\(x_i\\) as the indicator if the outcome is observed. That is, \\(x_i=1\\) if the outcome is observed, and \\(x_i=0\\) if the outcome is not observed for respondent \\(i\\). Then the estimated proportion from an SRS design is: \\[\\hat{p}=\\frac{1}{n}\\sum_{i=1}^n x_i \\] and the estimated standard error of the proportion is: \\[se(\\hat{p})=\\sqrt{\\frac{\\hat{p}(1-\\hat{p})}{n-1}\\left(1-\\frac{n}{N}\\right)} \\] The syntax If a sample was drawn through SRS and had no nonresponse or other weighting adjustments, in R, we specify this design as: srs1_des &lt;- dat %&gt;% as_survey_design(fpc = fpcvar) where dat is a tibble or data.frame with the survey data, and fpcvar is a variable in the data indicating the sampling frame’s size (this variable has the same value for all cases in an SRS design.) If the frame is very large, sometimes the frame size is not provided. In that case, the FPC is not needed, and we specify the design as: srs2_des &lt;- dat %&gt;% as_survey_design() If some post-survey adjustments were implemented and the weights are not all equal, we specify the design as: srs3_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, fpc = fpcvar) where wtvar is a variable in the data indicating the weight for each case. 
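The SRS formulas above can be checked by hand in base R. The sketch below uses simulated values (hypothetical data, not the API file), with a sample of 200 from an assumed frame of 6,194 units, and computes the mean, the standard error with and without the FPC, and the standard error of a proportion:

```r
# Hypothetical SRS: n = 200 simulated values from a frame of N = 6194 units
set.seed(999)
N <- 6194
y <- rnorm(200, mean = 650, sd = 120)
n <- length(y)

y_bar <- mean(y)
s2    <- var(y)        # var() uses the n - 1 denominator, matching s^2
fpc   <- 1 - n / N     # finite population correction factor

se_no_fpc <- sqrt(s2 / n)         # FPC ignored
se_fpc    <- sqrt(s2 / n * fpc)   # FPC applied

# Proportion: indicator that y exceeds an arbitrary cutoff of 700
x     <- as.integer(y > 700)
p_hat <- mean(x)
se_p  <- sqrt(p_hat * (1 - p_hat) / (n - 1) * fpc)

c(mean = y_bar, se_no_fpc = se_no_fpc, se_fpc = se_fpc,
  p_hat = p_hat, se_p = se_p)
```

With the sample at roughly 3% of the frame, applying the FPC shrinks the standard error by less than 2%, which is in line with the 10% guideline.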
Again, the FPC can be omitted if it is unnecessary because the frame is large compared to the sample size. Example The {survey} package in R provides some example datasets that we use throughout this chapter. The documentation provides detailed information about the variables. One of the example datasets we use is from the Academic Performance Index (API.) The API was a program administered by the California Department of Education, and the {survey} package includes a population file (sample frame) of all schools with at least 100 students and several different samples pulled from that data using different sampling methods. For this first example, we use the apisrs dataset, which contains an SRS of 200 schools. For printing purposes, we create a new dataset called apisrs_slim, which sorts the data by the school district and school ID and subsets the data to only a few columns. The SRS sample data are illustrated below: apisrs_slim &lt;- apisrs %&gt;% as_tibble() %&gt;% arrange(dnum, snum) %&gt;% select(cds, dnum, snum, dname, sname, fpc, pw) apisrs_slim ## # A tibble: 200 × 7 ## cds dnum snum dname sname fpc pw ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 19642126061220 1 1121 ABC Unified Haske… 6194 31.0 ## 2 19642126066716 1 1124 ABC Unified Stowe… 6194 31.0 ## 3 36675876035174 5 3895 Adelanto Elementary Adela… 6194 31.0 ## 4 33669776031512 19 3347 Alvord Unified Arlan… 6194 31.0 ## 5 33669776031595 19 3352 Alvord Unified Wells… 6194 31.0 ## 6 31667876031033 39 3271 Auburn Union Elementary Cain … 6194 31.0 ## 7 19642876011407 42 1169 Baldwin Park Unified Deanz… 6194 31.0 ## 8 19642876011464 42 1175 Baldwin Park Unified Heath… 6194 31.0 ## 9 19642956011589 48 1187 Bassett Unified Erwin… 6194 31.0 ## 10 41688586043392 49 4948 Bayshore Elementary Baysh… 6194 31.0 ## # ℹ 190 more rows Table 10.1 provides details on all the variables in this dataset. 
TABLE 10.1: Overview of Variables in api Data Variable Name Description cds Unique identifier for each school dnum School district identifier within county snum School identifier within district dname District Name sname School Name fpc Finite population correction factor (FPC) pw Weight To create the tbl_svy object for the SRS data, we specify the design as: apisrs_des &lt;- apisrs_slim %&gt;% as_survey_design(weights = pw, fpc = fpc) apisrs_des ## Independent Sampling design ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - fpc: fpc ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), fpc ## (dbl), pw (dbl) In the printed design object, the design is described as an “Independent Sampling design,” which is another term for SRS. The ids are specified as 1, which means there is no clustering (a topic described in Section 10.2.4), the FPC variable is indicated, and the weights are indicated. We can also look at the summary of the design object (summary()), and see the distribution of the probabilities (inverse of the weights) along with the population size and a list of the variables in the dataset. summary(apisrs_des) ## Independent Sampling design ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0323 0.0323 0.0323 0.0323 0.0323 0.0323 ## Population size (PSUs): 6194 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;fpc&quot; &quot;pw&quot; 10.2.2 Simple random sample with replacement Similar to the SRS design, the simple random sample with replacement (SRSWR) design randomly selects the sample from the entire sampling frame. However, while SRS removes sampled units before selecting again, the SRSWR instead replaces each sampled unit before drawing again, so units can be selected more than once. Requirements: The sampling frame must include the entire population. 
Advantages: SRSWR requires no information about the units apart from contact information. Disadvantages: The sampling frame may not be available for the entire population. Units can be selected more than once, resulting in a smaller realized sample size because receiving duplicate information from a single respondent does not provide additional information. For small populations, SRSWR has larger standard errors than SRS designs. Example: A professor puts all students’ names on paper slips and selects them randomly to ask students questions, but the professor replaces the paper after calling on the student so they can be selected again at any time. In general for surveys, using an SRS design (without replacement) is preferred as we do not want respondents to answer a survey more than once. The math The estimate for the population mean of variable \\(y\\) is: \\[\\bar{y}=\\frac{1}{n}\\sum_{i=1}^n y_i\\] and the estimate of the standard error of the mean is: \\[se(\\bar{y})=\\sqrt{\\frac{s^2}{n}}\\] where \\[s^2=\\frac{1}{n-1}\\sum_{i=1}^n\\left(y_i-\\bar{y}\\right)^2.\\] To calculate the estimated proportion, we define \\(x_i\\) as the indicator that the outcome is observed (as we did with SRS): \\[\\hat{p}=\\frac{1}{n}\\sum_{i=1}^n x_i \\] and the estimated standard error of the proportion is: \\[se(\\hat{p})=\\sqrt{\\frac{\\hat{p}(1-\\hat{p})}{n}} \\] The syntax If we had a sample that was drawn through SRSWR and had no nonresponse or other weighting adjustments, in R, we specify this design as: srswr1_des &lt;- dat %&gt;% as_survey_design() where dat is a tibble or data.frame containing our survey data. This syntax is the same as an SRS design, except a finite population correction (FPC) is not included. This is because when sampling with replacement, the pool to select from is effectively no longer finite, so a correction is not needed. 
Therefore, with large populations where the FPC is negligible, the underlying formulas for SRS and SRSWR designs are the same. If some post-survey adjustments were implemented and the weights are not all equal, we specify the design as: srswr2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar) where wtvar is the variable for the weight of the data. Example The {survey} package does not include an example of SRSWR, so to illustrate this design, we need to create an example. We use the api population data provided by the {survey} package apipop and select a sample of 200 cases using the slice_sample() function from the tidyverse. One of the arguments in the slice_sample() function is replace. If replace=TRUE, then we are conducting a SRSWR. We then calculate selection weights as the inverse of the probability of selection and call this new dataset apisrswr. set.seed(409963) apisrswr &lt;- apipop %&gt;% as_tibble() %&gt;% slice_sample(n = 200, replace = TRUE) %&gt;% select(cds, dnum, snum, dname, sname) %&gt;% mutate( weight = nrow(apipop)/200 ) head(apisrswr) ## # A tibble: 6 × 6 ## cds dnum snum dname sname weight ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 43696416060065 533 5348 Palo Alto Unified Jordan (Da… 31.0 ## 2 07618046005060 650 509 San Ramon Valley Unified Alamo Elem… 31.0 ## 3 19648086085674 457 2134 Montebello Unified La Merced … 31.0 ## 4 07617056003719 346 377 Knightsen Elementary Knightsen … 31.0 ## 5 19650606023022 744 2351 Torrance Unified Carr (Evel… 31.0 ## 6 01611196090120 6 13 Alameda City Unified Paden (Wil… 31.0 Because this is a SRS design with replacement, there may be duplicates in the data. It is important to keep the duplicates in the data for proper estimation, but for reference, we can view the duplicates in the example data we just created. 
apisrswr %&gt;% group_by(cds) %&gt;% filter(n()&gt;1) %&gt;% arrange(cds) ## # A tibble: 4 × 6 ## # Groups: cds [2] ## cds dnum snum dname sname weight ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; ## 1 15633216008841 41 869 Bakersfield City Elem Chipman Junio… 31.0 ## 2 15633216008841 41 869 Bakersfield City Elem Chipman Junio… 31.0 ## 3 39686766042782 716 4880 Stockton City Unified Tyler Skills … 31.0 ## 4 39686766042782 716 4880 Stockton City Unified Tyler Skills … 31.0 We created a weight variable in this example data, which is the inverse of the probability of selection. We specify the sampling design for apisrswr as: apisrswr_des &lt;- apisrswr %&gt;% as_survey_design(weights = weight) apisrswr_des ## Independent Sampling design (with replacement) ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - weights: weight ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), weight ## (dbl) summary(apisrswr_des) ## Independent Sampling design (with replacement) ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0323 0.0323 0.0323 0.0323 0.0323 0.0323 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;weight&quot; In the output above, the design object and the object summary are shown. Both note that the sampling is done “with replacement” because no FPC was specified. The probabilities, which are derived from the weights, are summarized in the summary function output. 10.2.3 Stratified sampling Stratified sampling occurs when a population is divided into mutually exclusive subpopulations (strata), and then samples are selected independently within each stratum. Requirements: The sampling frame must include the information to divide the population into strata for every unit. Advantages: This design ensures sample representation in all subpopulations. 
If the strata are correlated with survey outcomes, a stratified sample has smaller standard errors compared to a SRS sample of the same size. This results in a more efficient design. Disadvantages: Auxiliary data may not exist to divide the sampling frame into strata, or the data may be outdated. Examples: Example 1: A population of North Carolina residents could be stratified into urban and rural areas, and then an SRS of residents from both rural and urban areas is selected independently. This ensures there are residents from both areas in the sample. Example 2: Law enforcement agencies could be stratified into the three primary general-purpose categories in the U.S.: local police, sheriff’s departments, and state police. A SRS of agencies from each of the three types is then selected independently to ensure all three types of agencies are represented. The math Let \\(\\bar{y}_h\\) be the sample mean for stratum \\(h\\), \\(N_h\\) be the population size of stratum \\(h\\), \\(n_h\\) be the sample size of stratum \\(h\\), and \\(H\\) is the total number of strata. Then, the estimate for the population mean under stratified SRS sampling is: \\[\\bar{y}=\\frac{1}{N}\\sum_{h=1}^H N_h\\bar{y}_h\\] and the estimate of the standard error of \\(\\bar{y}\\) is: \\[se(\\bar{y})=\\sqrt{\\frac{1}{N^2} \\sum_{h=1}^H N_h^2 \\frac{s_h^2}{n_h}\\left(1-\\frac{n_h}{N_h}\\right)} \\] where \\[s_h^2=\\frac{1}{n_h-1}\\sum_{i=1}^{n_h}\\left(y_{i,h}-\\bar{y}_h\\right)^2\\] For estimates of proportions, let \\(\\hat{p}_h\\) be the estimated proportion in stratum \\(h\\). 
Then, the population proportion estimate is: \\[\\hat{p}= \\frac{1}{N}\\sum_{h=1}^H N_h \\hat{p}_h\\] The standard error of the proportion is: \\[se(\\hat{p}) = \\frac{1}{N} \\sqrt{ \\sum_{h=1}^H N_h^2 \\frac{\\hat{p}_h(1-\\hat{p}_h)}{n_h-1} \\left(1-\\frac{n_h}{N_h}\\right)}\\] The syntax In addition to the fpc and weights arguments discussed in the types above, stratified designs require the addition of the strata argument. For example, to specify a stratified SRS design in {srvyr} when using the FPC, that is, where the population sizes of the strata are not too large and are known, we specify the design as: stsrs1_des &lt;- dat %&gt;% as_survey_design(fpc = fpcvar, strata = stratvar) where fpcvar is a variable in our data that indicates \\(N_h\\) for each row, and stratvar is a variable indicating the stratum for each row. We can omit the FPC if it is not applicable. Additionally, we can indicate the weight variable if it is present where wtvar is a variable in our data with a numeric weight. stsrs2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, strata = stratvar) Example In the example API data, apistrat is a stratified random sample, stratified by school type (stype) with three levels: E for elementary school, M for middle school, and H for high school. As with the SRS example above, we sort and select specific variables for use in printing. The data are illustrated below, including a count of the number of cases per stratum: apistrat_slim &lt;- apistrat %&gt;% as_tibble() %&gt;% arrange(dnum, snum) %&gt;% select(cds, dnum, snum, dname, sname, stype, fpc, pw) apistrat_slim %&gt;% count(stype, fpc) ## # A tibble: 3 × 3 ## stype fpc n ## &lt;fct&gt; &lt;dbl&gt; &lt;int&gt; ## 1 E 4421 100 ## 2 H 755 50 ## 3 M 1018 50 The FPC is the same for each case within each stratum. This output also shows that 100 elementary schools, 50 middle schools, and 50 high schools were sampled. 
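The counts above are enough to work out the selection probability and base weight in each stratum. The sketch below is a hand calculation in base R, assuming equal-probability selection within each stratum and no further weighting adjustments; the data frame and its column names are illustrative and not part of the {survey} data:

```r
# Stratum population sizes (fpc) and sample sizes taken from the count() output
strata <- data.frame(
  stype = c("E", "H", "M"),
  N_h   = c(4421, 755, 1018),
  n_h   = c(100, 50, 50)
)

# Under SRS within each stratum, the selection probability is n_h / N_h,
# and the base weight is its inverse, N_h / n_h
strata$prob   <- strata$n_h / strata$N_h
strata$weight <- strata$N_h / strata$n_h
strata

# The weights of all sampled schools sum to the population size
sum(strata$n_h * strata$weight)
```

The probabilities differ by stratum (about 0.023 for elementary, 0.066 for high, and 0.049 for middle schools), so high schools are sampled at nearly three times the rate of elementary schools.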
It is common for the number of units sampled from each stratum to be different based on the goals of the project, or to mirror the size of each stratum in the population. We specify the design as: apistrat_des &lt;- apistrat_slim %&gt;% as_survey_design(strata = stype, weights = pw, fpc = fpc) apistrat_des ## Stratified Independent Sampling design ## Called via srvyr ## Sampling variables: ## - ids: `1` ## - strata: stype ## - fpc: fpc ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), dname (chr), sname (chr), stype ## (fct), fpc (dbl), pw (dbl) summary(apistrat_des) ## Stratified Independent Sampling design ## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.0226 0.0226 0.0359 0.0401 0.0534 0.0662 ## Stratum Sizes: ## E H M ## obs 100 50 50 ## design.PSU 100 50 50 ## actual.PSU 100 50 50 ## Population stratum sizes (PSUs): ## E H M ## 4421 755 1018 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;dname&quot; &quot;sname&quot; &quot;stype&quot; &quot;fpc&quot; &quot;pw&quot; When printing the object, it is specified as a “Stratified Independent Sampling design,” also known as a stratified SRS, and the strata variable is included. Printing the summary, we see a distribution of probabilities, as we saw with SRS, but we also see the sample and population sizes by stratum. 10.2.4 Clustered sampling Clustered sampling occurs when a population is divided into mutually exclusive subgroups called clusters or primary sampling units (PSUs.) A random selection of PSUs is sampled, and then another level of sampling is done within these clusters. There can be multiple levels of this selection. Clustered sampling is often used when a list of the entire population is not available or data collection involves interviewers needing direct contact with respondents. Requirements: There must be a way to divide the population into clusters. 
Clusters are commonly structural, such as institutions (e.g., schools, prisons) or geography (e.g., states, counties.) Advantages: Clustered sampling is advantageous when data collection is done in person, so interviewers are sent to specific sampled areas rather than completely at random across a country. With clustered sampling, a list of the entire population is not necessary. For example, if sampling students, we do not need a list of all students, but only a list of all schools. Once the schools are sampled, lists of students can be obtained within the sampled schools. Disadvantages: Compared to a simple random sample for the same sample size, clustered samples generally have larger standard errors of estimates. Examples: Example 1: Consider a study needing a sample of 6th-grade students in the United States. No list likely exists of all these students. However, it is more likely to obtain a list of schools that enroll 6th graders, so a study design could select a random sample of schools that enroll 6th graders. The selected schools can then provide a list of students to do a second stage of sampling where 6th-grade students are randomly sampled within each of the sampled schools. This is a one-stage sample design (the one representing the number of clusters) and is the type of design we discuss in the formulas below. Example 2: Consider a study sending interviewers to households for a survey. This is a more complicated example that requires two levels of clustering (two-stage sample design) to efficiently use interviewers in geographic clusters. First, in the U.S., counties could be selected as the PSU and then census block groups within counties could be selected as the secondary sampling unit (SSU.) Households could then be randomly sampled within the block groups. This type of design is popular for in-person surveys as it reduces the travel necessary for interviewers. 
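Example 1 can be sketched in base R to make the two levels of selection concrete. Everything below is hypothetical: a simulated frame of 50 schools, 10 schools drawn by SRS, and 15 students drawn by SRS within each sampled school. The weight is the inverse of the overall selection probability, (a/A) × (b/B_i):

```r
set.seed(999)

# Hypothetical frame: A = 50 schools, each enrolling 40 to 120 6th graders
schools <- data.frame(
  school_id = 1:50,
  B = sample(40:120, 50, replace = TRUE)
)
A <- nrow(schools)  # clusters in the population
a <- 10             # clusters to sample
b <- 15             # students to sample per sampled school

# Stage 1: SRS of schools (without replacement)
sampled_schools <- schools[sample(A, a), ]

# Stage 2: SRS of b students within each sampled school
students <- do.call(rbind, lapply(seq_len(a), function(i) {
  data.frame(
    school_id  = sampled_schools$school_id[i],
    student_id = sample(sampled_schools$B[i], b),
    A = A,
    B = sampled_schools$B[i]
  )
}))

# Weight = 1 / ((a / A) * (b / B_i))
students$weight <- (A / a) * (students$B / b)
head(students)
```

A list of students is only needed for the 10 sampled schools, which is the practical advantage described above; the resulting file also carries the cluster identifier and population counts that a design specification would need.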
The math Consider a survey where \\(a\\) clusters are sampled from a population of \\(A\\) clusters via SRS. Units within each sampled cluster are sampled via SRS as well. Within each sampled cluster, \\(i\\), there are \\(B_i\\) units in the population and \\(b_i\\) units are sampled via SRS. Let \\(\\bar{y}_{i}\\) be the sample mean of cluster \\(i\\). Then, a ratio estimator of the population mean is: \\[\\bar{y}=\\frac{\\sum_{i=1}^a B_i \\bar{y}_{i}}{ \\sum_{i=1}^a B_i}\\] Note this is a consistent but biased estimator. Often the population size is not known, so this is a method to estimate a mean without knowing the population size. The estimated standard error of the mean is: \\[se(\\bar{y})= \\frac{1}{\\hat{N}}\\sqrt{\\left(1-\\frac{a}{A}\\right)\\frac{s_a^2}{a} + \\frac{A}{a} \\sum_{i=1}^a \\left(1-\\frac{b_i}{B_i}\\right) \\frac{s_i^2}{b_i} }\\] where \\(\\hat{N}\\) is the estimated population size, \\(s_a^2\\) is the between-cluster variance and \\(s_i^2\\) is the within-cluster variance. The formula for the between-cluster variance (\\(s_a^2\\)) is: \\[s_a^2=\\frac{1}{a-1}\\sum_{i=1}^a \\left( \\hat{y}_i - \\frac{\\sum_{i=1}^a \\hat{y}_{i} }{a}\\right)^2\\] where \\(\\hat{y}_i =B_i\\bar{y}_{i}\\). The formula for the within-cluster variance (\\(s_i^2\\)) is: \\[s_i^2=\\frac{1}{a(b_i-1)} \\sum_{j=1}^{b_i} \\left(y_{ij}-\\bar{y}_i\\right)^2\\] where \\(y_{ij}\\) is the outcome for sampled unit \\(j\\) within cluster \\(i\\). The syntax Clustered sampling designs require the addition of the ids argument, which specifies the cluster-level variables. 
To specify a two-stage clustered design without replacement, we specify the design as: clus2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = c(PSU, SSU), fpc = c(A, B)) where PSU and SSU are the variables indicating the PSU and SSU identifiers, and A and B are the variables indicating the population sizes for each level (i.e., A is the number of clusters, and B is the number of units within each cluster.) Note that A is the same for all records, and B is the same for all records within the same cluster. If clusters were sampled with replacement or from a very large population, the FPC is unnecessary. Additionally, only the first stage of selection is necessary regardless of whether the units were selected with replacement at any stage. The subsequent stages of selection are ignored in computation as their contribution to the variance is overpowered by the first stage (see Särndal, Swensson, and Wretman (2003) or Wolter (2007) for a more in-depth discussion.) Therefore, the two design objects specified below yield the same estimates in the end: clus2ex1_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = c(PSU, SSU)) clus2ex2_des &lt;- dat %&gt;% as_survey_design(weights = wtvar, ids = PSU) Note that there is one additional argument that is sometimes necessary, which is nest = TRUE. This option relabels cluster IDs to enforce nesting within strata. For example, each stratum may contain a cluster labeled 1, but cluster 1 in stratum 1 is a different cluster than cluster 1 in stratum 2. This option indicates that repeated numbering does not mean it is the same cluster. If this option is not used and there are repeated cluster IDs across different strata, an error is generated. Example The survey package includes a two-stage cluster sample dataset, apiclus2, in which school districts were sampled, and then a random sample of five schools was selected within each district. 
For districts with fewer than five schools, all schools were sampled. School districts are identified by dnum, and schools are identified by snum. The variable fpc1 indicates how many districts there are in California (the total number of PSUs or A), and fpc2 indicates how many schools were in a given district with at least 100 students (the total number of SSUs or B.) The data include a row for each school. In the data printed below, there are 757 school districts, as indicated by fpc1, and nine schools in District 731, one school in District 742, two schools in District 768, and so on, as indicated by fpc2. For illustration purposes, the object apiclus2_slim has been created from apiclus2, which subsets the data to only the necessary columns and sorts the data. apiclus2_slim &lt;- apiclus2 %&gt;% as_tibble() %&gt;% arrange(desc(dnum), snum) %&gt;% select(cds, dnum, snum, fpc1, fpc2, pw) apiclus2_slim ## # A tibble: 126 × 6 ## cds dnum snum fpc1 fpc2 pw ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;int[1d]&gt; &lt;dbl&gt; ## 1 47704826050942 795 5552 757 1 18.9 ## 2 07618126005169 781 530 757 6 22.7 ## 3 07618126005177 781 531 757 6 22.7 ## 4 07618126005185 781 532 757 6 22.7 ## 5 07618126005193 781 533 757 6 22.7 ## 6 07618126005243 781 535 757 6 22.7 ## 7 19650786023337 768 2371 757 2 18.9 ## 8 19650786023345 768 2372 757 2 18.9 ## 9 54722076054423 742 5898 757 1 18.9 ## 10 50712906053086 731 5781 757 9 34.1 ## # ℹ 116 more rows To specify this design in R, we use the following: apiclus2_des &lt;- apiclus2_slim %&gt;% as_survey_design( ids = c(dnum, snum), fpc = c(fpc1, fpc2), weights = pw ) apiclus2_des ## 2 - level Cluster Sampling design ## With (40, 126) clusters. ## Called via srvyr ## Sampling variables: ## - ids: `dnum + snum` ## - fpc: `fpc1 + fpc2` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), snum (dbl), fpc1 (dbl), fpc2 (int[1d]), pw ## (dbl) summary(apiclus2_des) ## 2 - level Cluster Sampling design ## With (40, 126) clusters. 
## Called via srvyr ## Probabilities: ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.00367 0.03774 0.05284 0.04239 0.05284 0.05284 ## Population size (PSUs): 757 ## Data variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;snum&quot; &quot;fpc1&quot; &quot;fpc2&quot; &quot;pw&quot; The design object is described as a “2 - level Cluster Sampling design,” and includes the ids (cluster), FPC, and weight variables. The summary notes that the sample includes 40 first-level clusters (PSUs), which are school districts, and 126 second-level clusters (SSUs), which are schools. Additionally, the summary includes a numeric summary of the probabilities of selection and the population size (number of PSUs) as 757. 10.3 Combining sampling methods SRS, stratified, and clustered designs are the backbone of sampling designs, and the features are often combined in one design. Additionally, rather than using SRS for selection, other sampling mechanisms are commonly used, such as probability proportional to size (PPS), systematic sampling, or selection with unequal probabilities, which are briefly described here. In PPS sampling, a size measure is constructed for each unit (e.g., the population of the PSU or the number of occupied housing units), and units with larger size measures are more likely to be sampled. Systematic sampling is commonly used to ensure representation across a population. Units are sorted by a feature, and then every \\(k\\)th unit is selected from a random start point so the sample is spread across the population. In addition to PPS, other unequal probabilities of selection may be used. For example, in a study of establishments (e.g., businesses or public institutions) that conducts a survey every year, an establishment that recently participated (e.g., participated last year) may have a reduced chance of selection in a subsequent round to reduce the burden on the establishment. 
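To illustrate how these selection mechanisms differ from SRS, here is a minimal base R sketch. The frame of 1,000 establishments, the size measure, and the sample sizes are all hypothetical. The systematic sample sorts the frame by size and takes every 20th unit from a random start. The second selection uses sample() with a prob argument, which makes larger units more likely to be drawn but is only a rough stand-in for a formal PPS algorithm, since drawing without replacement in this way does not yield inclusion probabilities exactly proportional to size.

```r
set.seed(2024)

# Hypothetical frame of 1,000 establishments with a size measure
frame <- data.frame(
  id = 1:1000,
  size = round(runif(1000, min = 10, max = 500))
)

# Systematic sampling: sort by a feature, then take every k-th unit
# starting from a random position between 1 and k
k <- 20
frame_sorted <- frame[order(frame$size), ]
start <- sample(k, 1)
sys_rows <- seq(from = start, to = nrow(frame_sorted), by = k)
sys_sample <- frame_sorted[sys_rows, ]

# Size-weighted selection (approximation to PPS, see the caveat above):
# units with larger size measures are more likely to be drawn
pps_rows <- sample(nrow(frame), size = 50, prob = frame$size)
pps_sample <- frame[pps_rows, ]

# The systematic sample spans the full range of sizes
range(sys_sample$size)
```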
To learn more about sampling designs, refer to Valliant, Dever, and Kreuter (2013), Cox et al. (2011), Cochran (1977), and Deming (1991). A common method of sampling is to stratify PSUs, select PSUs within the stratum using PPS selection, and then select units within the PSUs either with SRS or PPS. Reading survey documentation is an important first step in survey analysis to understand the design of the survey we are using and variables necessary to specify the design. Good documentation highlights the variables necessary to specify the design. This is often found in the user guide, methodology report, analysis guide, or technical documentation (see Chapter 3 for more details.) Example For example, the 2017-2019 National Survey of Family Growth had a stratified multi-stage area probability sample: In the first stage, PSUs are counties or collections of counties and are stratified by Census region/division, size (population), and MSA status. Within each stratum, PSUs were selected via PPS. In the second stage, neighborhoods were selected within the sampled PSUs using PPS selection. In the third stage, housing units were selected within the sampled neighborhoods. In the fourth stage, a person was randomly chosen among eligible persons within the selected housing units using unequal probabilities based on the person’s age and sex. The public use file does not include all these levels of selection and instead has pseudo-strata and pseudo-clusters, which are the variables used in R to specify the design. As specified on page 4 of the documentation, the stratum variable is SEST, the cluster variable is SECU, and the weight variable is WGT2017_2019. Thus, to specify this design in R, we use the following syntax: nsfg_des &lt;- nsfgdata %&gt;% as_survey_design(ids = SECU, strata = SEST, weights = WGT2017_2019) 10.4 Replicate weights Replicate weights are often included on analysis files instead of, or in addition to, the design variables (strata and PSUs.) 
Replicate weights are used as another method to estimate variability. Often, researchers choose to use replicate weights to avoid publishing design variables (strata or clustering variables) as a measure to reduce the risk of disclosure. There are several types of replicate weights, including balanced repeated replication (BRR), Fay’s BRR, jackknife, and bootstrap methods. An overview of the process for using replicate weights is as follows: (1) Divide the sample into subsample replicates that mirror the design of the sample. (2) Calculate weights for each replicate using the same procedures for the full-sample weight (i.e., nonresponse and post-stratification.) (3) Calculate estimates for each replicate using the same method as the full-sample estimate. (4) Calculate the estimated variance, which is proportional to the variance of the replicate estimates. The different types of replicate weights differ largely in step 1 (how the sample is divided into subsamples) and step 4 (which multiplication factors (scales) are used to multiply the variance.) The general format for the standard error is: \\[ \\sqrt{\\alpha \\sum_{r=1}^R \\alpha_r (\\hat{\\theta}_r - \\hat{\\theta})^2 }\\] where \\(R\\) is the number of replicates, \\(\\alpha\\) is a constant that depends on the replication method, \\(\\alpha_r\\) is a factor associated with each replicate, \\(\\hat{\\theta}\\) is the weighted estimate based on the full sample, and \\(\\hat{\\theta}_r\\) is the weighted estimate of \\(\\theta\\) based on the \\(r^{\\text{th}}\\) replicate. To create the design object for surveys with replicate weights, we use as_survey_rep() instead of as_survey_design(), which we use for the common sampling designs in the sections above. 10.4.1 Balanced Repeated Replication (BRR) method The BRR method requires a stratified sample design with two PSUs in each stratum. Each replicate is constructed by deleting one PSU per stratum using a Hadamard matrix. 
For the PSU that is included, the weight is generally multiplied by two but may have other adjustments, such as post-stratification. A Hadamard matrix is a special square matrix with entries of +1 or -1 with mutually orthogonal rows. Hadamard matrices must have one row, two rows, or a multiple of four rows. The size of the Hadamard matrix is determined by the first multiple of 4 greater than or equal to the number of strata. For example, if a survey had seven strata, the Hadamard matrix would be an \\(8\\times8\\) matrix. Additionally, a survey with eight strata would also have an \\(8\\times8\\) Hadamard matrix. The columns in the matrix specify the strata and the rows specify the replicate. In each replicate (row), a +1 means to use the first PSU and a -1 means to use the second PSU in the estimate. For example, here is a \\(4\\times4\\) Hadamard matrix: \\[ \\begin{array}{rrrr} +1 &amp;+1 &amp;+1 &amp;+1\\\\ +1&amp;-1&amp;+1&amp;-1\\\\ +1&amp;+1&amp;-1&amp;-1\\\\ +1 &amp;-1&amp;-1&amp;+1 \\end{array} \\] In the first replicate (row), all the values are +1, so in each stratum, the first PSU would be used in the estimate. In the second replicate, the first PSU would be used in strata 1 and 3, while the second PSU would be used in strata 2 and 4. In the third replicate, the first PSU would be used in strata 1 and 2, while the second PSU would be used in strata 3 and 4. Finally, in the fourth replicate, the first PSU would be used in strata 1 and 4, while the second PSU would be used in strata 2 and 3. For more information about Hadamard matrices, see Wolter (2007). Note that supplied BRR weights from a data provider already incorporate this adjustment, and the {survey} package generates the Hadamard matrix, if necessary, for calculating BRR weights, so an analyst does not need to create or provide the matrix. 
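The properties of the \\(4\\times4\\) Hadamard matrix described above can be checked directly in base R. This sketch only re-enters the matrix from the text, confirms that its rows are mutually orthogonal, and translates each +1 or -1 into the PSU used in each stratum. The object names H and psu_used are ours, chosen for illustration.

```r
# The 4 x 4 Hadamard matrix from the text
# (rows = replicates, columns = strata)
H <- matrix(
  c(1,  1,  1,  1,
    1, -1,  1, -1,
    1,  1, -1, -1,
    1, -1, -1,  1),
  nrow = 4, byrow = TRUE
)

# Mutually orthogonal rows: H %*% t(H) equals 4 times the identity matrix
H %*% t(H)

# Translate the signs into the PSU used: +1 -> first PSU, -1 -> second PSU
psu_used <- ifelse(H == 1, 1, 2)
dimnames(psu_used) <- list(paste0("rep", 1:4), paste0("stratum", 1:4))
psu_used
```

Reading across the second row of psu_used reproduces the description of the second replicate: the first PSU in strata 1 and 3, and the second PSU in strata 2 and 4.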
The math A weighted estimate for the full sample is calculated as \\(\\hat{\\theta}\\), and then a weighted estimate for each replicate is calculated as \\(\\hat{\\theta}_r\\) for \\(R\\) replicates. Using the generic notation above, \\(\\alpha=\\frac{1}{R}\\) and \\(\\alpha_r=1\\) for each \\(r\\). The standard error of the estimate is calculated as follows: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] Specifying replicate weights in R requires specifying the type of replicate weights, the main weight variable, the replicate weight variables, and other options. One of the key options is for the mean squared error (MSE.) If mse=TRUE, variances are computed around the point estimate \\((\\hat{\\theta})\\), whereas if mse=FALSE, variances are computed around the mean of the replicates \\((\\bar{\\theta})\\) instead, which looks like this: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\bar{\\theta}\\right)^2}\\] where \\[\\bar{\\theta}=\\frac{1}{R}\\sum_{r=1}^R \\hat{\\theta}_r\\] The default option for mse is to use the global option of “survey.replicates.mse” which is set to FALSE initially unless a user changes it. To determine if mse should be set to TRUE or FALSE, read the survey documentation. If there is no indication in the survey documentation for BRR, we recommend setting mse to TRUE as this is the default in other software (e.g., SAS, SUDAAN.) The syntax Replicate weights generally come in groups and are sequentially numbered, such as PWGTP1, PWGTP2, …, PWGTP80 for the person weights in the American Community Survey (ACS) (U.S. Census Bureau 2021) or BRRWT1, BRRWT2, …, BRRWT96 in the 2015 Residential Energy Consumption Survey (RECS) (U.S. Energy Information Administration 2017). This makes it easy to use some of the tidy selection functions in R. 
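To make the effect of the mse option concrete, the two BRR standard error formulas above can be computed by hand. The full-sample estimate and the four replicate estimates below are made-up numbers used only for illustration, and this sketch does not call {srvyr} or {survey}.

```r
# Hypothetical full-sample estimate and R = 4 replicate estimates
theta_hat <- 10.2
theta_r <- c(9.8, 10.9, 10.4, 9.5)
R <- length(theta_r)

# mse = TRUE: deviations are taken around the full-sample estimate
se_mse_true <- sqrt(sum((theta_r - theta_hat)^2) / R)

# mse = FALSE: deviations are taken around the mean of the replicates
theta_bar <- mean(theta_r)
se_mse_false <- sqrt(sum((theta_r - theta_bar)^2) / R)

c(mse_true = se_mse_true, mse_false = se_mse_false)
```

Because the mean of the replicates minimizes the sum of squared deviations, the mse = TRUE version is never smaller than the mse = FALSE version, and the two agree only when the replicate mean equals the full-sample estimate.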
To specify a BRR design, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as BRR (type = &quot;BRR&quot;), and whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE.) For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, we can use the following syntax (both are equivalent): brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = all_of(str_c(&quot;WT&quot;, 1:20)), type = &quot;BRR&quot;, mse = TRUE) brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;BRR&quot;, mse = TRUE) If a dataset had WT for the main weight and had 20 BRR weights indicated as REPWT1, REPWT2, …, REPWT20, we can use the following syntax (both are equivalent): brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = all_of(str_c(&quot;REPWT&quot;, 1:20)), type = &quot;BRR&quot;, mse = TRUE) brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = starts_with(&quot;REPWT&quot;), type = &quot;BRR&quot;, mse = TRUE) If the replicate weight variables are in the file consecutively, we can also use the following syntax: brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = REPWT1:REPWT20, type = &quot;BRR&quot;, mse = TRUE) Typically, each replicate weight sums to a value similar to the main weight, as both the replicate weights and the main weight are supposed to provide population estimates. Rarely, an alternative method is used where the replicate weights have values of 0 or 2 in the case of BRR weights. This would be indicated in the documentation (see Chapter 3 for more information on reading documentation.) In this case, the replicate weights are not combined, and the option combined_weights = FALSE should be indicated, as the default value for this argument is TRUE. 
This specific syntax is shown below: brr_des &lt;- dat %&gt;% as_survey_rep(weights = WT, repweights = starts_with(&quot;REPWT&quot;), type = &quot;BRR&quot;, combined_weights = FALSE, mse = TRUE) Example The {survey} package includes a data example from Section 12.2 of Levy and Lemeshow (2013). In this fictional data, two out of five ambulance stations were sampled from each of three emergency service areas (ESAs), thus BRR weights are appropriate with 2 PSUs (stations) sampled in each stratum (ESA.) In the code below, we create BRR weights as was done by Levy and Lemeshow (2013). scdbrr &lt;- scd %&gt;% as_tibble() %&gt;% mutate(wt = 5 / 2, rep1 = 2 * c(1, 0, 1, 0, 1, 0), rep2 = 2 * c(1, 0, 0, 1, 0, 1), rep3 = 2 * c(0, 1, 1, 0, 0, 1), rep4 = 2 * c(0, 1, 0, 1, 1, 0)) scdbrr ## # A tibble: 6 × 9 ## ESA ambulance arrests alive wt rep1 rep2 rep3 rep4 ## &lt;int&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 1 120 25 2.5 2 2 0 0 ## 2 1 2 78 24 2.5 0 0 2 2 ## 3 2 1 185 30 2.5 2 0 2 0 ## 4 2 2 228 49 2.5 0 2 0 2 ## 5 3 1 670 80 2.5 2 0 0 2 ## 6 3 2 530 70 2.5 0 2 2 0 To specify the BRR weights, we use the following syntax: scdbrr_des &lt;- scdbrr %&gt;% as_survey_rep(type = &quot;BRR&quot;, repweights = starts_with(&quot;rep&quot;), combined_weights = FALSE, weight = wt) scdbrr_des ## Call: Called via srvyr ## Balanced Repeated Replicates with 4 replicates. ## Sampling variables: ## - repweights: `rep1 + rep2 + rep3 + rep4` ## - weights: wt ## Data variables: ## - ESA (int), ambulance (int), arrests (dbl), alive (dbl), wt (dbl), ## rep1 (dbl), rep2 (dbl), rep3 (dbl), rep4 (dbl) summary(scdbrr_des) ## Call: Called via srvyr ## Balanced Repeated Replicates with 4 replicates. 
## Sampling variables: ## - repweights: `rep1 + rep2 + rep3 + rep4` ## - weights: wt ## Data variables: ## - ESA (int), ambulance (int), arrests (dbl), alive (dbl), wt (dbl), ## rep1 (dbl), rep2 (dbl), rep3 (dbl), rep4 (dbl) ## Variables: ## [1] &quot;ESA&quot; &quot;ambulance&quot; &quot;arrests&quot; &quot;alive&quot; &quot;wt&quot; ## [6] &quot;rep1&quot; &quot;rep2&quot; &quot;rep3&quot; &quot;rep4&quot; Note that combined_weights was specified as FALSE because these weights are simply specified as 0 and 2 and do not incorporate the overall weight. When printing the object, the type of replication is noted as Balanced Repeated Replicates, and the replicate weights and the weight variable are specified. Additionally, the summary lists the variables included in the data and design object. 10.4.2 Fay’s BRR method Fay’s BRR method for replicate weights is similar to the BRR method in that it uses a Hadamard matrix to construct replicate weights. However, rather than deleting PSUs for each replicate, with Fay’s BRR, half of the PSUs have a replicate weight, which is the main weight multiplied by \\(\\rho\\), and the other half have the main weight multiplied by \\((2-\\rho)\\), where \\(0 \\le \\rho &lt; 1\\). Note that when \\(\\rho=0\\), this is equivalent to the standard BRR weights, and as \\(\\rho\\) becomes closer to 1, this method is more similar to jackknife discussed in Section 10.4.3. To obtain the value of \\(\\rho\\), it is necessary to read the survey documentation (see Chapter 3.) The math The standard error estimate for \\(\\hat{\\theta}\\) is slightly different than the BRR, due to the addition of the multiplier of \\(\\rho\\). Using the generic notation above, \\(\\alpha=\\frac{1}{R \\left(1-\\rho\\right)^2}\\) and \\(\\alpha_r=1 \\text{ for all } r\\). 
The standard error is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\frac{1}{R (1-\\rho)^2} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The syntax The syntax is very similar for BRR and Fay’s BRR. To specify a Fay’s BRR design, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as Fay’s BRR (type = &quot;Fay&quot;), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and Fay’s multiplier (rho.) For example, if a dataset had WT0 for the main weight and had 20 BRR weights indicated as WT1, WT2, …, WT20, and Fay’s multiplier is 0.3, we use the following syntax: fay_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;Fay&quot;, mse = TRUE, rho = 0.3) Example The 2015 RECS (U.S. Energy Information Administration 2017) uses Fay’s BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 - BRRWT96, and the documentation specifies a Fay’s multiplier of 0.5. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total energy cost, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We use the 2015 RECS data from the {srvyrexploR} package that provides data for this book (see the prerequisites box at the beginning of this chapter.) To specify the design for the recs_2015 data, we use the following syntax: recs_2015_des &lt;- recs_2015 %&gt;% as_survey_rep(weights = NWEIGHT, repweights = BRRWT1:BRRWT96, type = &quot;Fay&quot;, rho = 0.5, mse = TRUE, variables = c(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC)) recs_2015_des ## Call: Called via srvyr ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances. 
## Sampling variables: ## - repweights: `BRRWT1 + BRRWT2 + BRRWT3 + BRRWT4 + BRRWT5 + BRRWT6 + ## BRRWT7 + BRRWT8 + BRRWT9 + BRRWT10 + BRRWT11 + BRRWT12 + BRRWT13 + ## BRRWT14 + BRRWT15 + BRRWT16 + BRRWT17 + BRRWT18 + BRRWT19 + BRRWT20 ## + BRRWT21 + BRRWT22 + BRRWT23 + BRRWT24 + BRRWT25 + BRRWT26 + ## BRRWT27 + BRRWT28 + BRRWT29 + BRRWT30 + BRRWT31 + BRRWT32 + BRRWT33 ## + BRRWT34 + BRRWT35 + BRRWT36 + BRRWT37 + BRRWT38 + BRRWT39 + ## BRRWT40 + BRRWT41 + BRRWT42 + BRRWT43 + BRRWT44 + BRRWT45 + BRRWT46 ## + BRRWT47 + BRRWT48 + BRRWT49 + BRRWT50 + BRRWT51 + BRRWT52 + ## BRRWT53 + BRRWT54 + BRRWT55 + BRRWT56 + BRRWT57 + BRRWT58 + BRRWT59 ## + BRRWT60 + BRRWT61 + BRRWT62 + BRRWT63 + BRRWT64 + BRRWT65 + ## BRRWT66 + BRRWT67 + BRRWT68 + BRRWT69 + BRRWT70 + BRRWT71 + BRRWT72 ## + BRRWT73 + BRRWT74 + BRRWT75 + BRRWT76 + BRRWT77 + BRRWT78 + ## BRRWT79 + BRRWT80 + BRRWT81 + BRRWT82 + BRRWT83 + BRRWT84 + BRRWT85 ## + BRRWT86 + BRRWT87 + BRRWT88 + BRRWT89 + BRRWT90 + BRRWT91 + ## BRRWT92 + BRRWT93 + BRRWT94 + BRRWT95 + BRRWT96` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl) summary(recs_2015_des) ## Call: Called via srvyr ## Fay&#39;s variance method (rho= 0.5 ) with 96 replicates and MSE variances. 
## Sampling variables: ## - repweights: `BRRWT1 + BRRWT2 + BRRWT3 + BRRWT4 + BRRWT5 + BRRWT6 + ## BRRWT7 + BRRWT8 + BRRWT9 + BRRWT10 + BRRWT11 + BRRWT12 + BRRWT13 + ## BRRWT14 + BRRWT15 + BRRWT16 + BRRWT17 + BRRWT18 + BRRWT19 + BRRWT20 ## + BRRWT21 + BRRWT22 + BRRWT23 + BRRWT24 + BRRWT25 + BRRWT26 + ## BRRWT27 + BRRWT28 + BRRWT29 + BRRWT30 + BRRWT31 + BRRWT32 + BRRWT33 ## + BRRWT34 + BRRWT35 + BRRWT36 + BRRWT37 + BRRWT38 + BRRWT39 + ## BRRWT40 + BRRWT41 + BRRWT42 + BRRWT43 + BRRWT44 + BRRWT45 + BRRWT46 ## + BRRWT47 + BRRWT48 + BRRWT49 + BRRWT50 + BRRWT51 + BRRWT52 + ## BRRWT53 + BRRWT54 + BRRWT55 + BRRWT56 + BRRWT57 + BRRWT58 + BRRWT59 ## + BRRWT60 + BRRWT61 + BRRWT62 + BRRWT63 + BRRWT64 + BRRWT65 + ## BRRWT66 + BRRWT67 + BRRWT68 + BRRWT69 + BRRWT70 + BRRWT71 + BRRWT72 ## + BRRWT73 + BRRWT74 + BRRWT75 + BRRWT76 + BRRWT77 + BRRWT78 + ## BRRWT79 + BRRWT80 + BRRWT81 + BRRWT82 + BRRWT83 + BRRWT84 + BRRWT85 ## + BRRWT86 + BRRWT87 + BRRWT88 + BRRWT89 + BRRWT90 + BRRWT91 + ## BRRWT92 + BRRWT93 + BRRWT94 + BRRWT95 + BRRWT96` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (dbl) ## Variables: ## [1] &quot;DOEID&quot; &quot;TOTALDOL&quot; &quot;TOTSQFT_EN&quot; &quot;REGIONC&quot; In specifying the design, the variables option was also used to include which variables might be used in analyses. This is optional but can make our object smaller and easier to work with. When printing the design object or looking at the summary, the replicate weight type is re-iterated as Fay's variance method (rho= 0.5) with 96 replicates and MSE variances, and the variables are included. No weight or probability summary is included in this output, as we have seen in some other design objects. 10.4.3 Jackknife method There are three jackknife estimators implemented in {srvyr} - jackknife 1 (JK1), jackknife n (JKn), and jackknife 2 (JK2.) 
The JK1 method can be used for unstratified designs, and replicates are created by removing one PSU at a time so the number of replicates is the same as the number of PSUs. If there is no clustering, then the PSU is the ultimate sampling unit (e.g., students.) The JKn method is used for stratified designs and requires two or more PSUs per stratum. In this case, each replicate is created by deleting one PSU from a single stratum, so the number of replicates is the number of total PSUs across all strata. The JK2 method is a special case of JKn when there are exactly 2 PSUs sampled per stratum. For variance estimation, we also need to specify the scaling constants. The math Using the generic notation above, \\(\\alpha=\\frac{R-1}{R}\\) and \\(\\alpha_r=1 \\text{ for all } r\\). For the JK1 method, the standard error estimate for \\(\\hat{\\theta}\\) is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\frac{R-1}{R} \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The JKn method is a bit more complex, but the coefficients are generally provided with restricted and public-use files. For each replicate, one stratum has a PSU removed, and the weights are adjusted by \\(n_h/(n_h-1)\\) where \\(n_h\\) is the number of PSUs in stratum \\(h\\). The coefficients in other strata are set to 1. Denote the coefficient that results from this process for replicate \\(r\\) as \\(\\alpha_r\\), then the standard error estimate for \\(\\hat{\\theta}\\) is calculated as: \\[se(\\hat{\\theta})=\\sqrt{\\sum_{r=1}^R \\alpha_r \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] The syntax To specify the jackknife method, we use the survey documentation to understand the type of jackknife (1, n, or 2) and the multiplier. 
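As a check on the JK1 formula above, the following base R sketch builds the replicates by hand for a small unclustered sample, where each unit is its own PSU. The data are simulated and purely hypothetical. For the sample mean, the JK1 standard error computed this way matches the familiar standard error of a mean, the sample standard deviation divided by the square root of the sample size (ignoring any finite population correction.)

```r
set.seed(42)

# Hypothetical simple random sample of 20 units
y <- rnorm(20, mean = 50, sd = 10)
R <- length(y)  # one replicate per PSU (here, per unit)

# Full-sample estimate
theta_hat <- mean(y)

# Replicate r drops unit r and re-estimates the mean from the rest
theta_r <- sapply(seq_len(R), function(r) mean(y[-r]))

# JK1 standard error with scaling constant (R - 1) / R
se_jk1 <- sqrt((R - 1) / R * sum((theta_r - theta_hat)^2))

# Textbook standard error of the mean, without a finite population correction
se_srs <- sd(y) / sqrt(R)

all.equal(se_jk1, se_srs)
```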
In the syntax, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as jackknife 1 (type = \"JK1\"), n (type = \"JKN\"), or 2 (type = \"JK2\"), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and the multiplier (scale.) For example, if the survey is a jackknife 1 method with a multiplier of \\(\\alpha=(R-1)/R=19/20=0.95\\), and the dataset has WT0 for the main weight and 20 replicate weights indicated as WT1, WT2, …, WT20, we use the following syntax: jk1_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights= num_range(&quot;WT&quot;, 1:20), type=&quot;JK1&quot;, mse=TRUE, scale=0.95) For a jackknife n method, we need to specify the multiplier for all replicates. In this case, we use the rscales argument to specify each one. The documentation provides details on what the multipliers (\\(\\alpha_r\\)) are, and they may be the same for all replicates. For example, consider a case where \\(\\alpha_r=0.1\\) for all replicates, and the dataset had WT0 for the main weight and had 20 replicate weights indicated as WT1, WT2, …, WT20. We specify the type as type = \"JKN\", and the multiplier as rscales=rep(0.1,20): jkn_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights= num_range(&quot;WT&quot;, 1:20), type=&quot;JKN&quot;, mse=TRUE, rscales=rep(0.1, 20)) Example The 2020 RECS (U.S. Energy Information Administration 2023c) uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 - NWEIGHT60 with a scale of \\((R-1)/R=59/60\\). On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We use the 2020 RECS data from the {srvyrexploR} package that provides data for this book (see the prerequisites box at the beginning of this chapter.) 
To specify this design, we use the following syntax: recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE, variables = c(DOEID, TOTALDOL, TOTSQFT_EN, REGIONC) ) recs_des ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. ## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (chr) summary(recs_des) ## Call: Called via srvyr ## Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances. 
## Sampling variables: ## - repweights: `NWEIGHT1 + NWEIGHT2 + NWEIGHT3 + NWEIGHT4 + NWEIGHT5 + ## NWEIGHT6 + NWEIGHT7 + NWEIGHT8 + NWEIGHT9 + NWEIGHT10 + NWEIGHT11 + ## NWEIGHT12 + NWEIGHT13 + NWEIGHT14 + NWEIGHT15 + NWEIGHT16 + ## NWEIGHT17 + NWEIGHT18 + NWEIGHT19 + NWEIGHT20 + NWEIGHT21 + ## NWEIGHT22 + NWEIGHT23 + NWEIGHT24 + NWEIGHT25 + NWEIGHT26 + ## NWEIGHT27 + NWEIGHT28 + NWEIGHT29 + NWEIGHT30 + NWEIGHT31 + ## NWEIGHT32 + NWEIGHT33 + NWEIGHT34 + NWEIGHT35 + NWEIGHT36 + ## NWEIGHT37 + NWEIGHT38 + NWEIGHT39 + NWEIGHT40 + NWEIGHT41 + ## NWEIGHT42 + NWEIGHT43 + NWEIGHT44 + NWEIGHT45 + NWEIGHT46 + ## NWEIGHT47 + NWEIGHT48 + NWEIGHT49 + NWEIGHT50 + NWEIGHT51 + ## NWEIGHT52 + NWEIGHT53 + NWEIGHT54 + NWEIGHT55 + NWEIGHT56 + ## NWEIGHT57 + NWEIGHT58 + NWEIGHT59 + NWEIGHT60` ## - weights: NWEIGHT ## Data variables: ## - DOEID (dbl), TOTALDOL (dbl), TOTSQFT_EN (dbl), REGIONC (chr) ## Variables: ## [1] &quot;DOEID&quot; &quot;TOTALDOL&quot; &quot;TOTSQFT_EN&quot; &quot;REGIONC&quot; When printing the design object or looking at the summary, the replicate weight type is re-iterated as Unstratified cluster jacknife (JK1) with 60 replicates and MSE variances, and the variables are included. No weight or probability summary is included. 10.4.4 Bootstrap method In bootstrap resampling, replicates are created by selecting random samples of the PSUs with replacement (SRSWR.) If there are \\(A\\) PSUs in the sample, then each replicate is created by selecting a random sample of \\(A\\) PSUs with replacement. Each replicate is created independently, and the weights for each replicate are adjusted to reflect the population, generally using the same method as how the analysis weight was adjusted. The math A weighted estimate for the full sample is calculated as \\(\\hat{\\theta}\\), and then a weighted estimate for each replicate is calculated as \\(\\hat{\\theta}_r\\) for \\(R\\) replicates. 
Then the standard error of the estimate is calculated as follows: \\[se(\\hat{\\theta})=\\sqrt{\\alpha \\sum_{r=1}^R \\left( \\hat{\\theta}_r-\\hat{\\theta}\\right)^2}\\] where \\(\\alpha\\) is the scaling constant. Note that the scaling constant (\\(\\alpha\\)) is provided in the survey documentation, as there are many types of bootstrap methods that generate custom scaling constants. The syntax To specify a bootstrap method, we need to specify the weight variable (weights), the replicate weight variables (repweights), the type of replicate weights as bootstrap (type = \"bootstrap\"), whether the mean squared error should be used (mse = TRUE) or not (mse = FALSE), and the multiplier (scale). For example, if a dataset had WT0 for the main weight, 20 bootstrap weights named WT1, WT2, …, WT20, and a multiplier of \\(\\alpha=0.02\\), we use the following syntax: bs_des &lt;- dat %&gt;% as_survey_rep(weights = WT0, repweights = num_range(&quot;WT&quot;, 1:20), type = &quot;bootstrap&quot;, mse = TRUE, scale = 0.02) Example Returning to the api example, we create a dataset with bootstrap weights to use for illustration. 
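Before building the example, the standard-error formula from the math above can be checked by hand. The following is a minimal sketch with made-up numbers: theta_hat, theta_reps, and alpha are hypothetical values chosen only for illustration and do not come from any survey.

```r
# Hypothetical full-sample estimate and R = 5 replicate estimates
theta_hat <- 10
theta_reps <- c(9.5, 10.4, 10.1, 9.8, 10.6)

# Hypothetical scaling constant; in practice this is taken from the
# survey documentation
alpha <- 0.25

# se = sqrt(alpha * sum((theta_r - theta_hat)^2)), with deviations taken
# around the full-sample estimate as in the formula above
se_theta <- sqrt(alpha * sum((theta_reps - theta_hat)^2))
se_theta
```

With real data, as_survey_rep() and the estimation functions carry out this calculation internally once scale is supplied, so the sketch is only meant to make the role of the scaling constant concrete.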
In this example, we construct a one-cluster design with fifty replicate weights.27 apiclus1_slim &lt;- apiclus1 %&gt;% as_tibble() %&gt;% arrange(dnum) %&gt;% select(cds, dnum, fpc, pw) set.seed(662152) apibw &lt;- bootweights(psu = apiclus1_slim$dnum, strata = rep(1, nrow(apiclus1_slim)), fpc = apiclus1_slim$fpc, replicates = 50) bwmata &lt;- apibw$repweights$weights[apibw$repweights$index,] * apiclus1_slim$pw apiclus1_slim &lt;- bwmata %&gt;% as.data.frame() %&gt;% set_names(str_c(&quot;pw&quot;, 1:50)) %&gt;% cbind(apiclus1_slim) %&gt;% as_tibble() %&gt;% select(cds, dnum, fpc, pw, everything()) apiclus1_slim ## # A tibble: 183 × 54 ## cds dnum fpc pw pw1 pw2 pw3 pw4 pw5 pw6 pw7 ## &lt;chr&gt; &lt;int&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 2 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 3 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 4 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 5 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 6 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 7 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 8 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 9 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## 10 43693776… 61 757 33.8 33.8 0 0 33.8 0 33.8 0 ## # ℹ 173 more rows ## # ℹ 43 more variables: pw8 &lt;dbl&gt;, pw9 &lt;dbl&gt;, pw10 &lt;dbl&gt;, pw11 &lt;dbl&gt;, ## # pw12 &lt;dbl&gt;, pw13 &lt;dbl&gt;, pw14 &lt;dbl&gt;, pw15 &lt;dbl&gt;, pw16 &lt;dbl&gt;, ## # pw17 &lt;dbl&gt;, pw18 &lt;dbl&gt;, pw19 &lt;dbl&gt;, pw20 &lt;dbl&gt;, pw21 &lt;dbl&gt;, ## # pw22 &lt;dbl&gt;, pw23 &lt;dbl&gt;, pw24 &lt;dbl&gt;, pw25 &lt;dbl&gt;, pw26 &lt;dbl&gt;, ## # pw27 &lt;dbl&gt;, pw28 &lt;dbl&gt;, pw29 &lt;dbl&gt;, pw30 &lt;dbl&gt;, pw31 &lt;dbl&gt;, ## # pw32 &lt;dbl&gt;, pw33 &lt;dbl&gt;, pw34 &lt;dbl&gt;, pw35 &lt;dbl&gt;, pw36 &lt;dbl&gt;, … The output of apiclus1_slim includes the same variables we have seen in other 
api examples (see Table 10.1), but now also includes bootstrap weights pw1, …, pw50. When creating the survey design object, we use the bootstrap weights as the replicate weights. Additionally, with replicate weights we need to include the scale (\\(\\alpha\\)). For this example, we calculated: \\[\\alpha=\\frac{A}{(A-1)(R-1)}=\\frac{15}{(15-1)*(50-1)}=0.02186589\\] where \\(A\\) is the average number of PSUs per stratum and \\(R\\) is the number of replicates. There is only 1 stratum and the number of clusters/PSUs is 15, so \\(A=15\\). Using this information, we specify the design object as: api1_bs_des &lt;- apiclus1_slim %&gt;% as_survey_rep(weights = pw, repweights = pw1:pw50, type = &quot;bootstrap&quot;, scale = 0.02186589, mse = TRUE) api1_bs_des ## Call: Called via srvyr ## Survey bootstrap with 50 replicates and MSE variances. ## Sampling variables: ## - repweights: `pw1 + pw2 + pw3 + pw4 + pw5 + pw6 + pw7 + pw8 + pw9 + ## pw10 + pw11 + pw12 + pw13 + pw14 + pw15 + pw16 + pw17 + pw18 + pw19 ## + pw20 + pw21 + pw22 + pw23 + pw24 + pw25 + pw26 + pw27 + pw28 + ## pw29 + pw30 + pw31 + pw32 + pw33 + pw34 + pw35 + pw36 + pw37 + pw38 ## + pw39 + pw40 + pw41 + pw42 + pw43 + pw44 + pw45 + pw46 + pw47 + ## pw48 + pw49 + pw50` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), fpc (dbl), pw (dbl), pw1 (dbl), pw2 (dbl), ## pw3 (dbl), pw4 (dbl), pw5 (dbl), pw6 (dbl), pw7 (dbl), pw8 (dbl), ## pw9 (dbl), pw10 (dbl), pw11 (dbl), pw12 (dbl), pw13 (dbl), pw14 ## (dbl), pw15 (dbl), pw16 (dbl), pw17 (dbl), pw18 (dbl), pw19 (dbl), ## pw20 (dbl), pw21 (dbl), pw22 (dbl), pw23 (dbl), pw24 (dbl), pw25 ## (dbl), pw26 (dbl), pw27 (dbl), pw28 (dbl), pw29 (dbl), pw30 (dbl), ## pw31 (dbl), pw32 (dbl), pw33 (dbl), pw34 (dbl), pw35 (dbl), pw36 ## (dbl), pw37 (dbl), pw38 (dbl), pw39 (dbl), pw40 (dbl), pw41 (dbl), ## pw42 (dbl), pw43 (dbl), pw44 (dbl), pw45 (dbl), pw46 (dbl), pw47 ## (dbl), pw48 (dbl), pw49 (dbl), pw50 (dbl) summary(api1_bs_des) ## Call: Called via srvyr ## 
Survey bootstrap with 50 replicates and MSE variances. ## Sampling variables: ## - repweights: `pw1 + pw2 + pw3 + pw4 + pw5 + pw6 + pw7 + pw8 + pw9 + ## pw10 + pw11 + pw12 + pw13 + pw14 + pw15 + pw16 + pw17 + pw18 + pw19 ## + pw20 + pw21 + pw22 + pw23 + pw24 + pw25 + pw26 + pw27 + pw28 + ## pw29 + pw30 + pw31 + pw32 + pw33 + pw34 + pw35 + pw36 + pw37 + pw38 ## + pw39 + pw40 + pw41 + pw42 + pw43 + pw44 + pw45 + pw46 + pw47 + ## pw48 + pw49 + pw50` ## - weights: pw ## Data variables: ## - cds (chr), dnum (int), fpc (dbl), pw (dbl), pw1 (dbl), pw2 (dbl), ## pw3 (dbl), pw4 (dbl), pw5 (dbl), pw6 (dbl), pw7 (dbl), pw8 (dbl), ## pw9 (dbl), pw10 (dbl), pw11 (dbl), pw12 (dbl), pw13 (dbl), pw14 ## (dbl), pw15 (dbl), pw16 (dbl), pw17 (dbl), pw18 (dbl), pw19 (dbl), ## pw20 (dbl), pw21 (dbl), pw22 (dbl), pw23 (dbl), pw24 (dbl), pw25 ## (dbl), pw26 (dbl), pw27 (dbl), pw28 (dbl), pw29 (dbl), pw30 (dbl), ## pw31 (dbl), pw32 (dbl), pw33 (dbl), pw34 (dbl), pw35 (dbl), pw36 ## (dbl), pw37 (dbl), pw38 (dbl), pw39 (dbl), pw40 (dbl), pw41 (dbl), ## pw42 (dbl), pw43 (dbl), pw44 (dbl), pw45 (dbl), pw46 (dbl), pw47 ## (dbl), pw48 (dbl), pw49 (dbl), pw50 (dbl) ## Variables: ## [1] &quot;cds&quot; &quot;dnum&quot; &quot;fpc&quot; &quot;pw&quot; &quot;pw1&quot; &quot;pw2&quot; &quot;pw3&quot; &quot;pw4&quot; &quot;pw5&quot; ## [10] &quot;pw6&quot; &quot;pw7&quot; &quot;pw8&quot; &quot;pw9&quot; &quot;pw10&quot; &quot;pw11&quot; &quot;pw12&quot; &quot;pw13&quot; &quot;pw14&quot; ## [19] &quot;pw15&quot; &quot;pw16&quot; &quot;pw17&quot; &quot;pw18&quot; &quot;pw19&quot; &quot;pw20&quot; &quot;pw21&quot; &quot;pw22&quot; &quot;pw23&quot; ## [28] &quot;pw24&quot; &quot;pw25&quot; &quot;pw26&quot; &quot;pw27&quot; &quot;pw28&quot; &quot;pw29&quot; &quot;pw30&quot; &quot;pw31&quot; &quot;pw32&quot; ## [37] &quot;pw33&quot; &quot;pw34&quot; &quot;pw35&quot; &quot;pw36&quot; &quot;pw37&quot; &quot;pw38&quot; &quot;pw39&quot; &quot;pw40&quot; &quot;pw41&quot; ## [46] &quot;pw42&quot; &quot;pw43&quot; 
&quot;pw44&quot; &quot;pw45&quot; &quot;pw46&quot; &quot;pw47&quot; &quot;pw48&quot; &quot;pw49&quot; &quot;pw50&quot; As with other replicate design objects, when printing the object or looking at the summary, the replicate weights are provided along with the data variables. 10.5 Exercises For this chapter, the exercises entail reading public documentation to determine how to specify the survey design. While reading the documentation, be on the lookout for descriptions of the weights and the survey design variables or replicate weights. The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults, including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description (National Center for Health Statistics 2023). The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). We have imported the data and stored it in the variable nhis_adult_data. How would we specify the design using either as_survey_design() or as_survey_rep()? The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R (Davern et al. 2021). We have imported the data and stored it in the variable gss_data. How would we specify the design in R using either as_survey_design() or as_survey_rep()? References Cochran, William G. 1977. Sampling Techniques. John Wiley &amp; Sons. 
Cox, Brenda G, David A Binder, B Nanjamma Chinnappa, Anders Christianson, Michael J Colledge, and Phillip S Kott. 2011. Business Survey Methods. John Wiley &amp; Sons. Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf. Deming, W Edwards. 1991. Sample Design in Business Research. Vol. 23. John Wiley &amp; Sons. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ’dplyr’-Like Syntax for Summary Statistics of Survey Data. Fuller, Wayne A. 2011. Sampling Statistics. John Wiley &amp; Sons. Levy, Paul S, and Stanley Lemeshow. 2013. Sampling of Populations: Methods and Applications. John Wiley &amp; Sons. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf. Penn State. 2019. “STAT 506: Sampling Theory and Methods [Online Course].” https://online.stat.psu.edu/stat506/. Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 2003. Model Assisted Survey Sampling. Springer Science &amp; Business Media. U.S. Census Bureau. 2021. “Understanding and Using the American Community Survey Public Use Microdata Sample Files What Data Users Need to Know.” U.S. Government Printing Office; https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf. U.S. Energy Information Administration. 2017. “Residential Energy Consumption Survey (RECS): Using the 2015 microdata file to compute estimates and standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2015/pdf/microdata_v3.pdf. ———. 2023c. 
“2020 Residential Energy Consumption Survey: Using the microdata file to compute estimates and relative standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2020/pdf/microdata-guide.pdf. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Wolter, Kirk M. 2007. Introduction to Variance Estimation. Vol. 53. Springer. Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. srvyrexploR: Data Supplement for Exploring Complex Survey Data Analysis in R. We provide the code here to replicate this example, but are not focusing on the creation of the weights as that is outside the scope of this book. We recommend referencing Wolter (2007) for more information on creating bootstrap weights.↩︎ "],["c11-missing-data.html", "Chapter 11 Missing data 11.1 Introduction 11.2 Missing data mechanisms 11.3 Assessing missing data 11.4 Analysis with missing data", " Chapter 11 Missing data Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(naniar) library(haven) library(gt) We are using data from ANES and RECS described in Chapter 4. As a reminder, here is the code to create the design objects for each to use throughout this chapter. For ANES, we need to adjust the weight so it sums to the population instead of the sample (see the ANES documentation and Chapter 4 for more information.) targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) For RECS, details are included in the RECS documentation and Chapter 10. 
recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) 11.1 Introduction Missing data in surveys refer to situations where participants do not provide complete responses to survey questions. Respondents may not have seen a question by design. Or, they may not respond to a question for various other reasons, such as not wanting to answer a particular question, not understanding the question, or simply forgetting to answer. Missing data are important to consider and account for, as they can introduce bias and reduce the representativeness of the data. This chapter provides an overview of the types of missing data, how to assess missing data in surveys, and how to conduct analysis when missing data are present. Understanding this complex topic can help ensure accurate reporting of survey results and provide insight into potential changes to the survey design for the future. 11.2 Missing data mechanisms There are two main categories that missing data typically fall into: missing by design and unintentional missing data. Missing by design is part of the survey plan and can be more easily incorporated into weights and analyses. Unintentional missing data, on the other hand, can lead to bias in survey estimates if not correctly accounted for. Below we provide more information on the types of missing data. Missing by design/questionnaire skip logic: This type of missingness occurs when certain respondents are intentionally directed to skip specific questions based on their previous responses or characteristics. For example, in a survey about employment, if a respondent indicates that they are not employed, they may be directed to skip questions related to their job responsibilities. Additionally, some surveys randomize questions or modules so that not all participants respond to all questions. 
In these instances, respondents would have missing data for the modules not randomly assigned to them. Unintentional missing data: This type of missingness occurs when researchers do not intend for there to be missing data on a particular question, for example, if respondents did not finish the survey or refused to answer individual questions. There are three main types of unintentional missing data that each should be considered and handled differently (Mack, Su, and Westreich 2018; Schafer and Graham 2002): Missing completely at random (MCAR): The missing data are unrelated to both observed and unobserved data, and the probability of being missing is the same across all cases. For example, a respondent missed a question because they had to leave the survey early due to an emergency. Missing at random (MAR): The missing data are related to observed data but not unobserved data, and the probability of being missing is the same within groups. For example, we know the respondents’ ages, and older respondents choose not to answer specific questions but younger respondents do answer them. Missing not at random (MNAR): The missing data are related to unobserved data, and the probability of being missing varies for reasons we are not measuring. For example, respondents with depression do not answer a question about depression severity. 11.3 Assessing missing data Before beginning an analysis, we should explore the data to determine whether there are missing data and what types of missing data are present. Conducting this descriptive analysis can help with the analysis and reporting of survey data (see Section 12) and can inform the survey design in future studies. For example, large amounts of unexpected missing data may indicate the questions were unclear or difficult to recall. There are several ways to explore missing data, which we walk through below. 
When assessing the missing data, we recommend using a data.frame object and not the survey object, as most of the analysis is about patterns of records, and weights are not necessary. 11.3.1 Summarize data A very rudimentary first exploration is to use the summary() function to summarize the data, which illuminates NA values in the data. Let’s look at a few analytic variables on the ANES 2020 data using summary(): anes_2020 %&gt;% select(V202051:EarlyVote2020) %&gt;% summary() ## V202051 Income7 Income ## Min. :-9.000 $125k or more:1468 Under $9,999 : 647 ## 1st Qu.:-1.000 Under $20k :1076 $50,000-59,999 : 485 ## Median :-1.000 $20k to &lt; 40k:1051 $100,000-109,999: 451 ## Mean :-0.726 $40k to &lt; 60k: 984 $250,000 or more: 405 ## 3rd Qu.:-1.000 $60k to &lt; 80k: 920 $80,000-89,999 : 383 ## Max. : 3.000 (Other) :1437 (Other) :4565 ## NA&#39;s : 517 NA&#39;s : 517 ## V201617x V201616 V201615 V201613 V201611 ## Min. :-9.0 Min. :-3 Min. :-3 Min. :-3 Min. :-3 ## 1st Qu.: 4.0 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 ## Median :11.0 Median :-3 Median :-3 Median :-3 Median :-3 ## Mean :10.4 Mean :-3 Mean :-3 Mean :-3 Mean :-3 ## 3rd Qu.:17.0 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 ## Max. :22.0 Max. :-3 Max. :-3 Max. :-3 Max. :-3 ## ## V201610 V201607 Gender V201600 ## Min. :-3 Min. :-3 Male :3375 Min. :-9.00 ## 1st Qu.:-3 1st Qu.:-3 Female:4027 1st Qu.: 1.00 ## Median :-3 Median :-3 NA&#39;s : 51 Median : 2.00 ## Mean :-3 Mean :-3 Mean : 1.47 ## 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.: 2.00 ## Max. :-3 Max. :-3 Max. : 2.00 ## ## RaceEth V201549x V201547z V201547e ## White :5420 Min. :-9.0 Min. :-3 Min. :-3 ## Black : 650 1st Qu.: 1.0 1st Qu.:-3 1st Qu.:-3 ## Hispanic : 662 Median : 1.0 Median :-3 Median :-3 ## Asian, NH/PI : 248 Mean : 1.5 Mean :-3 Mean :-3 ## AI/AN : 155 3rd Qu.: 2.0 3rd Qu.:-3 3rd Qu.:-3 ## Other/multiple race: 237 Max. : 6.0 Max. :-3 Max. :-3 ## NA&#39;s : 81 ## V201547d V201547c V201547b V201547a V201546 ## Min. :-3 Min. :-3 Min. :-3 Min. :-3 Min. 
:-9.00 ## 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.:-3 1st Qu.: 2.00 ## Median :-3 Median :-3 Median :-3 Median :-3 Median : 2.00 ## Mean :-3 Mean :-3 Mean :-3 Mean :-3 Mean : 1.84 ## 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.:-3 3rd Qu.: 2.00 ## Max. :-3 Max. :-3 Max. :-3 Max. :-3 Max. : 2.00 ## ## Education V201510 AgeGroup Age ## Less than HS: 312 Min. :-9.00 18-29 : 871 Min. :18.0 ## High school :1160 1st Qu.: 3.00 30-39 :1241 1st Qu.:37.0 ## Post HS :2514 Median : 5.00 40-49 :1081 Median :53.0 ## Bachelor&#39;s :1877 Mean : 5.62 50-59 :1200 Mean :51.8 ## Graduate :1474 3rd Qu.: 6.00 60-69 :1436 3rd Qu.:66.0 ## NA&#39;s : 116 Max. :95.00 70 or older:1330 Max. :80.0 ## NA&#39;s : 294 NA&#39;s :294 ## V201507x TrustPeople V201237 ## Min. :-9.0 Always : 48 Min. :-9.00 ## 1st Qu.:35.0 Most of the time :3511 1st Qu.: 2.00 ## Median :51.0 About half the time:2020 Median : 3.00 ## Mean :49.4 Some of the time :1597 Mean : 2.78 ## 3rd Qu.:66.0 Never : 264 3rd Qu.: 3.00 ## Max. :80.0 NA&#39;s : 13 Max. : 5.00 ## ## TrustGovernment V201233 ## Always : 80 Min. :-9.00 ## Most of the time :1016 1st Qu.: 3.00 ## About half the time:2313 Median : 4.00 ## Some of the time :3313 Mean : 3.43 ## Never : 702 3rd Qu.: 4.00 ## NA&#39;s : 29 Max. : 5.00 ## ## PartyID V201231x V201230 ## Strong democrat :1796 Min. :-9.00 Min. :-9.000 ## Strong republican :1545 1st Qu.: 2.00 1st Qu.:-1.000 ## Independent-democrat : 881 Median : 4.00 Median :-1.000 ## Independent : 876 Mean : 3.83 Mean : 0.013 ## Not very strong democrat: 790 3rd Qu.: 6.00 3rd Qu.: 1.000 ## (Other) :1540 Max. : 7.00 Max. : 3.000 ## NA&#39;s : 25 ## V201229 V201228 VotedPres2016_selection ## Min. :-9.000 Min. :-9.00 Clinton:2911 ## 1st Qu.:-1.000 1st Qu.: 1.00 Trump :2466 ## Median : 1.000 Median : 2.00 Other : 390 ## Mean : 0.515 Mean : 1.99 NA&#39;s :1686 ## 3rd Qu.: 1.000 3rd Qu.: 3.00 ## Max. : 2.000 Max. : 5.00 ## ## V201103 VotedPres2016 V201102 V201101 ## Min. :-9.00 Yes :5810 Min. :-9.000 Min. 
:-9.000 ## 1st Qu.: 1.00 No :1622 1st Qu.:-1.000 1st Qu.:-1.000 ## Median : 1.00 NA&#39;s: 21 Median : 1.000 Median :-1.000 ## Mean : 1.04 Mean : 0.105 Mean : 0.085 ## 3rd Qu.: 2.00 3rd Qu.: 1.000 3rd Qu.: 1.000 ## Max. : 5.00 Max. : 2.000 Max. : 2.000 ## ## V201029 V201028 V201025x V201024 ## Min. :-9.000 Min. :-9.0 Min. :-4.00 Min. :-9.00 ## 1st Qu.:-1.000 1st Qu.:-1.0 1st Qu.: 3.00 1st Qu.:-1.00 ## Median :-1.000 Median :-1.0 Median : 3.00 Median :-1.00 ## Mean :-0.897 Mean :-0.9 Mean : 2.92 Mean :-0.86 ## 3rd Qu.:-1.000 3rd Qu.:-1.0 3rd Qu.: 3.00 3rd Qu.:-1.00 ## Max. :12.000 Max. : 2.0 Max. : 4.00 Max. : 4.00 ## ## EarlyVote2020 ## Yes : 375 ## No : 115 ## NA&#39;s:6963 ## ## ## ## We see that there are NA values in several of the derived variables (those not beginning with “V”) and negative values in the original variables (those beginning with “V”.) We can also use the count() function to get an understanding of the different types of missing data on the original variables. For example, let’s look at the count of data for V202072, which corresponds to our VotedPres2020 variable. anes_2020 %&gt;% count(VotedPres2020,V202072) ## # A tibble: 7 × 3 ## VotedPres2020 V202072 n ## &lt;fct&gt; &lt;dbl+lbl&gt; &lt;int&gt; ## 1 Yes -1 [-1. Inapplicable] 361 ## 2 Yes 1 [1. Yes, voted for President] 5952 ## 3 No -1 [-1. Inapplicable] 10 ## 4 No 2 [2. No, didn&#39;t vote for President] 77 ## 5 &lt;NA&gt; -9 [-9. Refused] 2 ## 6 &lt;NA&gt; -6 [-6. No post-election interview] 4 ## 7 &lt;NA&gt; -1 [-1. Inapplicable] 1047 Here, we can see that there are three types of missing data, and the majority of them fall under the “Inapplicable” category. This is usually a term associated with data missing due to skip patterns and is considered to be missing data by design. Based on the documentation from ANES (DeBell 2010), we can see that this question was only asked to respondents who voted in the election. 
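The same idea can be sketched on a small scale. The example below is a minimal illustration that assumes a made-up data frame (toy, with hypothetical columns raw and derived) rather than the ANES data; it groups negative codes into "missing by design" (the inapplicable code of -1) and "other missing" before counting them.

```r
library(dplyr)

# Hypothetical toy data: `raw` mimics an ANES-style coded variable and
# `derived` mimics a recoded variable where negative codes became NA
toy <- tibble(
  raw     = c(1, 1, 2, -1, -1, -1, -9, -6),
  derived = c("Yes", "Yes", "No", NA, NA, NA, NA, NA)
)

# Label each record by the kind of missingness implied by its raw code
missing_summary <- toy %>%
  mutate(
    missing_type = case_when(
      raw == -1 ~ "Missing by design (inapplicable)",
      raw < 0   ~ "Other missing",
      TRUE      ~ "Observed"
    )
  ) %>%
  count(missing_type)

missing_summary
```

The cut points used here are assumptions for the toy data; with a real survey, the meaning of each code should be confirmed in the documentation before grouping.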
11.3.2 Visualization of missing data It can be challenging to look at tables for every variable and instead may be more efficient to view missing data in a graphical format to help narrow in on patterns or unique variables. The {naniar} package is very useful in exploring missing data visually. We can use the vis_miss() function available in both {visdat} and {naniar} packages to view the amount of missing data by variable (see Figure 11.1) (Tierney 2017; Tierney and Cook 2023). anes_2020_derived&lt;-anes_2020 %&gt;% select(!starts_with(&quot;V2&quot;),-CaseID,-InterviewMode,-Weight,-Stratum,-VarUnit) anes_2020_derived %&gt;% vis_miss(cluster= TRUE, show_perc = FALSE) + scale_fill_manual(values = book_colors[c(3,1)], labels = c(&quot;Present&quot;,&quot;Missing&quot;), name = &quot;&quot;) FIGURE 11.1: Visual depiction of missing data in the ANES 2020 data From the visualization in Figure 11.1, we can start to get a picture of what questions may be connected to each other in terms of missing data. Even if we did not have the informative variable names, we could deduce that VotedPres2020, VotedPres2020_selection, and EarlyVote2020 are likely connected since their missing data patterns are similar. Additionally, we can also look at VotedPres2016_selection and see that there is a lot of missing data in that variable. The missing data are likely due to a skip pattern, and we can look at other graphics to see how they relate to other variables. The {naniar} package has multiple visualization functions that can help dive deeper, such as the gg_miss_fct() function, which looks at missing data for all variables by levels of another variable (see Figure 11.2.) anes_2020_derived %&gt;% gg_miss_fct(VotedPres2016) + scale_fill_gradientn( guide = &quot;colorbar&quot;, name = &quot;% Miss&quot;, colors = book_colors[c(3, 2, 1)] ) + ylab(&quot;Variable&quot;) + xlab(&quot;Voted for President in 2016&quot;) ## Scale for fill is already present. 
## Adding another scale for fill, which will replace the existing scale. FIGURE 11.2: Missingness in variables for each level of VotedPres2016 in the ANES 2020 data In Figure 11.2, we can see that if respondents did not vote for president in 2016 or did not answer that question, then they were not asked about who they voted for in 2016 (the percentage of missing data is 100%.) Additionally, we can see with Figure 11.2, that there is more missing data across all questions if they did not provide an answer to VotedPres2016. There are other visualizations that work well with numeric data. For example, in the RECS 2020 data, we can plot two continuous variables and the missing data associated with them to see if there are any patterns in the missingness. To do this, we can use the bind_shadow() function from the {naniar} package. This creates a nabular (combination of “na” with “tabular”), which features the original columns followed by the same number of columns with a specific NA format. These NA columns are indicators of whether the value in the original data is missing or not. The example printed below shows how most levels of HeatingBehavior are not missing (!NA) in the NA variable of HeatingBehavior_NA, but those missing in HeatingBehavior are also missing in HeatingBehavior_NA. recs_2020_shadow &lt;- recs_2020 %&gt;% bind_shadow() ncol(recs_2020) ## [1] 118 ncol(recs_2020_shadow) ## [1] 236 recs_2020_shadow %&gt;% count(HeatingBehavior,HeatingBehavior_NA) ## # A tibble: 7 × 3 ## HeatingBehavior HeatingBehavior_NA n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Set one temp and leave it !NA 7806 ## 2 Manually adjust at night/no one home !NA 4654 ## 3 Programmable or smart thermostat automatical… !NA 3310 ## 4 Turn on or off as needed !NA 1491 ## 5 No control !NA 438 ## 6 Other !NA 46 ## 7 &lt;NA&gt; NA 751 We can then use these new variables to plot the missing data alongside the actual data. 
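As a minimal sketch of what the shadow columns encode, the toy example below binds a shadow matrix to a small made-up data frame and summarizes one variable by the missingness of another. The names toy, cost, and behavior are hypothetical and are not RECS variables.

```r
library(dplyr)
library(naniar)

# Hypothetical toy data with one missing value in each column
toy <- tibble(
  cost     = c(1200, NA, 950, 2100),
  behavior = c("Set one temp", "No control", NA, "Other")
)

# bind_shadow() appends one indicator column per original column
toy_shadow <- toy %>% bind_shadow()

# Shadow columns reuse the original names with an "_NA" suffix
names(toy_shadow)

# Summarize cost by whether behavior is missing
toy_shadow %>%
  group_by(behavior_NA) %>%
  summarize(n = n(), mean_cost = mean(cost, na.rm = TRUE))
```

Grouping or filling by a shadow column in this way is what allows observed values of one variable to be compared across the missingness of another.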
For example, let’s plot a histogram of the total energy cost grouped by those missing and not missing on heating behavior (see Figure 11.3). recs_2020_shadow %&gt;% filter(TOTALDOL &lt; 5000) %&gt;% ggplot(aes(x = TOTALDOL, fill = HeatingBehavior_NA)) + geom_histogram() + scale_fill_manual( values = book_colors[c(3, 1)], labels = c(&quot;Present&quot;, &quot;Missing&quot;), name = &quot;Heating Behavior&quot; ) + theme_minimal() + xlab(&quot;Total Energy Cost (Truncated at $5000)&quot;) + ylab(&quot;Number of Households&quot;) + ggtitle(&quot;Histogram of Energy Cost by Heating Behavior Missing Data&quot;) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. FIGURE 11.3: Histogram of Energy Cost by Heating Behavior Missing Data Figure 11.3 indicates that respondents who did not provide a response for the heating behavior question may have a different distribution of total energy cost compared to respondents who did provide a response. This view of the raw data and missingness could indicate some bias in the data. Researchers take these different bias aspects into account when calculating weights, and we need to make sure that we incorporate the weights when analyzing the data. There are many other visualizations that can be helpful in reviewing the data, and we recommend reviewing the {naniar} documentation for more information (Tierney and Cook 2023). 11.4 Analysis with missing data Once we understand the types of missingness, we can begin the analysis of the data. Different missingness types may be handled in different ways. In most publicly available datasets, researchers have already calculated weights and imputed missing values if necessary. For those interested in learning more about how to calculate weights and impute data for different missing data mechanisms, we recommend Kim and Shao (2021) and Valliant and Dever (2018). 
Even with weights and imputation, missing data are most likely still present and need to be accounted for in analysis. This section provides an overview on how to recode missing data in R, and how to account for skip patterns in analysis. 11.4.1 Recoding missing data Even within a variable, there can be different reasons for missing data. In publicly released data, negative values are often present to provide different meanings for values. For example, in the ANES 2020 data, they have the following negative values to represent different types of missing data: -9: Refused -8: Don’t Know -7: No post-election data, deleted due to incomplete interview -6: No post-election interview -5: Interview breakoff (sufficient partial IW) -4: Technical error -3: Restricted -2: Other missing reason (question specific) -1: Inapplicable When we created the derived variables for use in this book, we coded all negative values as NA and proceeded to analyze the data. For most cases, this is an appropriate approach as long as we filter the data appropriately to account for skip patterns (see Section 11.4.2). However, the {naniar} package does have the option to code special missing values. For example, if we wanted to have two NA values, one that indicated the question was missing by design (e.g., due to skip patterns) and one for the other missing categories, we can use the nabular format to incorporate these with the recode_shadow() function. anes_2020_shadow&lt;-anes_2020 %&gt;% select(starts_with(&quot;V2&quot;)) %&gt;% mutate(across(everything(),~case_when(.x &lt; -1 ~ NA, TRUE~.x))) %&gt;% bind_shadow() %&gt;% recode_shadow(V201103 = .where(V201103==-1~&quot;skip&quot;)) anes_2020_shadow %&gt;% count(V201103,V201103_NA) ## # A tibble: 5 × 3 ## V201103 V201103_NA n ## &lt;dbl+lbl&gt; &lt;fct&gt; &lt;int&gt; ## 1 -1 [-1. Inapplicable] NA_skip 1643 ## 2 1 [1. Hillary Clinton] !NA 2911 ## 3 2 [2. Donald Trump] !NA 2466 ## 4 5 [5. 
Other {SPECIFY}] !NA 390 ## 5 NA NA 43 However, it is important to note that at the time of publication, there is no easy way to apply recode_shadow() to multiple variables at once (e.g., we cannot use the tidyverse feature of across()). The example code above only implements this for a single variable, so this would have to be done manually or in a loop for all variables of interest. 11.4.2 Accounting for skip patterns When questions are skipped by design in a survey, it is meaningful that the data are later missing. For example, RECS asks people how they control the heat in their homes in the winter (HeatingBehavior). This question is asked only of those who have heat in their home (SpaceHeatingUsed). If no heating equipment was used, the value of HeatingBehavior is missing. We have several choices when analyzing these data, which include 1) only including those with a valid value of HeatingBehavior and specifying the universe as those with heat, and 2) including those who do not have heat. It is important to specify what population an analysis generalizes to. Here is an example where we only include those with a valid value of HeatingBehavior (choice 1). Note that we use the design object (recs_des) and then filter to those that are not missing on HeatingBehavior. heat_cntl_1 &lt;- recs_des %&gt;% filter(!is.na(HeatingBehavior)) %&gt;% group_by(HeatingBehavior) %&gt;% summarize( p=survey_prop() ) heat_cntl_1 ## # A tibble: 6 × 3 ## HeatingBehavior p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Set one temp and leave it 0.430 4.69e-3 ## 2 Manually adjust at night/no one home 0.264 4.54e-3 ## 3 Programmable or smart thermostat automatically adjust… 0.168 3.12e-3 ## 4 Turn on or off as needed 0.102 2.89e-3 ## 5 No control 0.0333 1.70e-3 ## 6 Other 0.00208 3.59e-4 Here is an example where we include those that do not have heat (choice 2). 
To help understand what we are looking at, we have included the output to show both variables, SpaceHeatingUsed and HeatingBehavior. heat_cntl_2 &lt;- recs_des %&gt;% group_by(interact(SpaceHeatingUsed, HeatingBehavior)) %&gt;% summarize( p=survey_prop() ) heat_cntl_2 ## # A tibble: 7 × 4 ## SpaceHeatingUsed HeatingBehavior p p_se ## &lt;lgl&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 FALSE &lt;NA&gt; 0.0469 2.07e-3 ## 2 TRUE Set one temp and leave it 0.410 4.60e-3 ## 3 TRUE Manually adjust at night/no one home 0.251 4.36e-3 ## 4 TRUE Programmable or smart thermostat aut… 0.160 2.95e-3 ## 5 TRUE Turn on or off as needed 0.0976 2.79e-3 ## 6 TRUE No control 0.0317 1.62e-3 ## 7 TRUE Other 0.00198 3.41e-4 If we ran the first analysis, we would say that 16.8% of households with heat use a programmable or smart thermostat for heating of their home. If we used the results from the second analysis, we would say that 16% of households use a programmable or smart thermostat for heating of their home. The distinction between the two statements is made bold for emphasis. Skip patterns often change the universe we are talking about and need to be carefully examined. Filtering to the correct universe is important when handling these types of missing data. The nabular we created above can also help with this. If we have NA_skip values in the shadow, we can make sure that we filter out all of these values and only include relevant missing values. To do this with survey data, we could first create the nabular, then create the design object on that data, and then use the shadow variables to assist with filtering the data. Let’s use the nabular we created above for ANES 2020 (anes_2020_shadow) to create the design object. 
anes_adjwgt_shadow &lt;- anes_2020_shadow %&gt;% mutate(V200010b = V200010b/sum(V200010b)*targetpop) anes_des_shadow &lt;- anes_adjwgt_shadow %&gt;% as_survey_design( weights = V200010b, strata = V200010d, ids = V200010c, nest = TRUE ) Then, we can use this design object to look at the percentage of the population that voted for each candidate in 2016 (V201103.) First, let’s look at the percentages without removing any cases: pres16_select1&lt;-anes_des_shadow %&gt;% group_by(V201103) %&gt;% summarize( All_Missing=survey_prop() ) pres16_select1 ## # A tibble: 5 × 3 ## V201103 All_Missing All_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 -1 [-1. Inapplicable] 0.324 0.00933 ## 2 1 [1. Hillary Clinton] 0.330 0.00728 ## 3 2 [2. Donald Trump] 0.299 0.00728 ## 4 5 [5. Other {SPECIFY}] 0.0409 0.00230 ## 5 NA 0.00627 0.00121 Next, we look at the percentages, removing only those missing due to skip patterns (i.e., they did not receive this question.) pres16_select2&lt;-anes_des_shadow %&gt;% filter(V201103_NA!=&quot;NA_skip&quot;) %&gt;% group_by(V201103) %&gt;% summarize( No_Skip_Missing=survey_prop() ) pres16_select2 ## # A tibble: 4 × 3 ## V201103 No_Skip_Missing No_Skip_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 [1. Hillary Clinton] 0.488 0.00870 ## 2 2 [2. Donald Trump] 0.443 0.00856 ## 3 5 [5. Other {SPECIFY}] 0.0606 0.00330 ## 4 NA 0.00928 0.00178 Finally, we look at the percentages, removing all missing values both due to skip patterns and due to those who refused to answer the question. pres16_select3&lt;-anes_des_shadow %&gt;% filter(V201103_NA==&quot;!NA&quot;) %&gt;% group_by(V201103) %&gt;% summarize( No_Missing=survey_prop() ) pres16_select3 ## # A tibble: 3 × 3 ## V201103 No_Missing No_Missing_se ## &lt;dbl+lbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 [1. Hillary Clinton] 0.492 0.00875 ## 2 2 [2. Donald Trump] 0.447 0.00861 ## 3 5 [5. 
Other {SPECIFY}] 0.0611 0.00332 TABLE 11.1: Percentage of Votes by Candidate for Different Missing Data Inclusions Candidate Including All Missing Data Removing Skip Patterns Only Removing All Missing Data % s.e. (%) % s.e. (%) % s.e. (%) Did not Vote for President in 2016 32.4% 0.9% NA NA NA NA Hillary Clinton 33.0% 0.7% 48.8% 0.9% 49.2% 0.9% Donald Trump 29.9% 0.7% 44.3% 0.9% 44.7% 0.9% Other Candidate 4.1% 0.2% 6.1% 0.3% 6.1% 0.3% Missing 0.6% 0.1% 0.9% 0.2% NA NA As Table 11.1 shows, the results can vary greatly depending on which type of missing data are removed. If we remove only the skip patterns the margin between Clinton and Trump is 4.5 percentage points, but if we include all data, even including those that did not vote in 2016, the margin is 3.1 percentage points. How we handle the different types of missing values is important for interpreting the data. References DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. Kim, Jae Kwang, and Jun Shao. 2021. Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Mack, Christina, Zhaohui Su, and Daniel Westreich. 2018. “Types of Missing Data.” In Managing Missing Data in Patient Registries: Addendum to Registries for Evaluating Patient Outcomes: A User’s Guide, Third Edition [Internet]. 
Rockville (MD): Agency for Healthcare Research; Quality (US); https://www.ncbi.nlm.nih.gov/books/NBK493614/. Schafer, Joseph L, and John W Graham. 2002. “Missing Data: Our View of the State of the Art.” Psychological Methods 7: 147–77. https://doi.org/10.1037//1082-989X.7.2.147. Tierney, Nicholas. 2017. “visdat: Visualising Whole Data Frames.” Journal of Open Source Software 2 (16): 355. https://doi.org/10.21105/joss.00355. Tierney, Nicholas, and Dianne Cook. 2023. “Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.” Journal of Statistical Software 105 (7): 1–31. https://doi.org/10.18637/jss.v105.i07. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. "],["c12-recommendations.html", "Chapter 12 Successful survey analysis recommendations 12.1 Introduction 12.2 Follow the survey analysis process 12.3 Begin with descriptive analysis 12.4 Check variable types 12.5 Improve debugging skills 12.6 Think critically about conclusions", " Chapter 12 Successful survey analysis recommendations Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) To illustrate the importance of data visualization, we discuss Anscombe’s Quartet. The dataset can be replicated by running the code below: anscombe_tidy &lt;- anscombe %&gt;% mutate(obs = row_number()) %&gt;% pivot_longer(-obs, names_to = &quot;key&quot;, values_to = &quot;value&quot;) %&gt;% separate(key, c(&quot;variable&quot;, &quot;set&quot;), 1, convert = TRUE) %&gt;% mutate(set = c(&quot;I&quot;, &quot;II&quot;, &quot;III&quot;, &quot;IV&quot;)[set]) %&gt;% pivot_wider(names_from = variable, values_from = value) We create an example survey dataset to explain potential pitfalls and how to overcome them in survey analysis. 
To recreate the dataset, run the code below: example_srvy &lt;- tribble( ~id, ~region, ~q_d1, ~q_d2_1, ~gender, ~weight, 1L, 1L, 1L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1740, 2L, 1L, 1L, &quot;Not at all interested&quot;, &quot;female&quot;, 1428, 3L, 2L, NA, &quot;Somewhat interested&quot;, &quot;female&quot;, 496, 4L, 2L, 1L, &quot;Not at all interested&quot;, &quot;female&quot;, 550, 5L, 3L, 1L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1762, 6L, 4L, NA, &quot;Very interested&quot;, &quot;female&quot;, 1004, 7L, 4L, NA, &quot;Somewhat interested&quot;, &quot;female&quot;, 522, 8L, 3L, 2L, &quot;Not at all interested&quot;, &quot;female&quot;, 1099, 9L, 4L, 2L, &quot;Somewhat interested&quot;, &quot;female&quot;, 1295, 10L, 2L, 2L, &quot;Somewhat interested&quot;, &quot;male&quot;, 983 ) example_des &lt;- example_srvy %&gt;% as_survey_design(weights = weight) 12.1 Introduction The previous chapters in this book aimed to provide the technical skills and knowledge required for running survey analyses. This chapter builds upon the previously mentioned best practices to present a curated set of recommendations for running a successful survey analysis. We hope this list provides practical insights that assist in producing meaningful and reliable results. 12.2 Follow the survey analysis process As we first introduced in Chapter 4, there are four main steps to successfully analyze survey data: Create a tbl_svy object (a survey object) using: as_survey_design() or as_survey_rep() Subset data (if needed) using filter() (to create subpopulations) Specify domains of analysis using group_by() Within summarize(), specify variables to calculate, including means, totals, proportions, quantiles, and more The order of these steps matters in survey analysis. For example, if we need to subset the data, we must use filter() on our data after creating the survey design. 
If we do this before the survey design is created, we may not be correctly accounting for the study design, resulting in incorrect findings. Additionally, correctly identifying the survey design is one of the most important steps in survey analysis. Knowing the type of sample design (e.g., clustered, stratified) helps ensure the underlying error structure is correctly calculated and weights are correctly used. Reviewing the documentation (see Chapter 3) helps us understand what variables to use from the data. Learning about complex design factors such as clustering, stratification, and weighting is foundational to complex survey analysis, and we recommend that all analysts review Chapter 10 before creating their first design object. Making sure to use the survey analysis functions from the {srvyr} and {survey} packages is also important in survey analysis. For example, using mean() and survey_mean() on the same data results in different findings and outputs. Each of the survey functions from {srvyr} and {survey} impacts standard errors and variance, and we cannot treat complex surveys as unweighted simple random samples if we want to produce unbiased estimates (Freedman Ellis and Schneider 2023; Lumley 2010). 12.3 Begin with descriptive analysis When receiving a fresh batch of data, it is tempting to jump right into running models to find significant results. However, a successful data analyst begins by exploring the dataset. Chapter 11 talks about the importance of reviewing data when examining missing data patterns. In this chapter, we illustrate the value of reviewing all types of data. This involves running descriptive analysis on the dataset as a whole, as well as individual variables and combinations of variables. As described in Chapter 5, descriptive analyses should always precede statistical analysis to prevent avoidable (and potentially embarrassing) mistakes. 
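As a rough illustration of what such a first descriptive pass might look like, the sketch below uses only base R on a small hypothetical data frame (toy_srvy and its columns are invented for this illustration and are not part of the book's data). It looks at the dataset as a whole, then at one variable, and then at a combination of two variables, all unweighted.

```r
# Hypothetical toy data for illustration only (not one of the book's datasets)
toy_srvy <- data.frame(
  region = c(1L, 1L, 2L, 2L, 3L, 4L),
  gender = c("female", "female", "female", "male", "female", "female"),
  q1     = c(1L, 2L, NA, 1L, 2L, NA)
)

# Whole dataset: variable types and basic summaries
str(toy_srvy)
summary(toy_srvy)

# Individual variable: frequencies, keeping missing values visible
q1_counts <- table(toy_srvy$q1, useNA = "ifany")
q1_counts

# Combination of variables: unweighted cross-tabulation
gender_by_region <- table(toy_srvy$gender, toy_srvy$region)
gender_by_region
```

Keeping missing values visible with useNA = "ifany" is a deliberate choice here, since silently dropped NA values are one of the avoidable mistakes a descriptive pass is meant to catch.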
12.3.1 Table review Even before applying weights, consider running cross-tabulations on the raw data. Cross-tabs can help us see if any patterns stand out that may be alarming or something worth further investigating. For example, let’s explore the example survey dataset introduced in the Prerequisites box, example_srvy. We run the code below on the unweighted data to inspect the gender variable: example_srvy %&gt;% group_by(gender) %&gt;% summarise(n = n()) ## # A tibble: 2 × 2 ## gender n ## &lt;chr&gt; &lt;int&gt; ## 1 female 9 ## 2 male 1 The data show that males comprise 1 out of 10, or 10%, of the sample. Generally, we assume something close to a 50/50 split between male and female respondents in a population. The sizable female proportion could indicate either a unique sample or a potential error in the data. If we review the survey documentation and see this was a deliberate part of the design, we can continue our analysis using the appropriate methods. If this was not an intentional choice by the researchers, the results alert us that something may be incorrect in the data or our code, and we can verify if there’s an issue by comparing the results with the weighted means. 12.3.2 Graphical review Tables provide a quick check of our assumptions, but there is no substitute for graphs and plots to visualize the distribution of data. We might miss outliers or nuances if we scan only summary statistics. For example, Anscombe’s Quartet demonstrates the importance of visualization in analysis. Let’s say we have a dataset with x- and y- variables in an object called anscombe_tidy. Let’s take a look at how the dataset is structured: head(anscombe_tidy) ## # A tibble: 6 × 4 ## obs set x y ## &lt;int&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 1 I 10 8.04 ## 2 1 II 10 9.14 ## 3 1 III 10 7.46 ## 4 1 IV 8 6.58 ## 5 2 I 8 6.95 ## 6 2 II 8 8.14 We can begin by checking one set of variables. 
For Set I, x has an average of 9 with a standard deviation of 3.3; y has an average of 7.5 with a standard deviation of 2.03. The two variables have a correlation of 0.82. anscombe_tidy %&gt;% filter(set == &quot;I&quot;) %&gt;% summarize( x_mean = mean(x), x_sd = sd(x), y_mean = mean(y), y_sd = sd(y), correlation = cor(x, y) ) ## # A tibble: 1 × 5 ## x_mean x_sd y_mean y_sd correlation ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 9 3.32 7.50 2.03 0.816 These are useful statistics. We can note that the data do not have high variability, and the two variables are strongly correlated. Now, let’s check all the sets (I-IV) in the Anscombe data. Notice anything interesting? anscombe_tidy %&gt;% group_by(set) %&gt;% summarize( x_mean = mean(x), x_sd = sd(x, na.rm = TRUE), y_mean = mean(y), y_sd = sd(y, na.rm = TRUE), correlation = cor(x, y) ) ## # A tibble: 4 × 6 ## set x_mean x_sd y_mean y_sd correlation ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 I 9 3.32 7.50 2.03 0.816 ## 2 II 9 3.32 7.50 2.03 0.816 ## 3 III 9 3.32 7.5 2.03 0.816 ## 4 IV 9 3.32 7.50 2.03 0.817 The summary results for these four sets are nearly identical! Based on this, we might assume that each distribution is similar. Let’s look at a graphical visualization to see if our assumption is correct (see Figure 12.1.) ggplot(anscombe_tidy, aes(x, y)) + geom_point() + facet_wrap( ~ set) + geom_smooth(method = &quot;lm&quot;, se = FALSE, alpha = 0.5) + theme_minimal() FIGURE 12.1: Plot of Anscombe’s Quartet data and the importance of reviewing data graphically Although each of the four sets has nearly the same summary statistics and regression line, when reviewing the plots (see Figure 12.1), it becomes apparent that the distributions of the data are not the same at all. Each set of points results in different shapes and distributions. Imagine sharing each set (I-IV) and the corresponding plot with a different colleague. 
The interpretations and descriptions of the data would be very different even though the statistics are similar. Plotting data can also ensure that we are using the correct analysis method on the data, so understanding the underlying distributions is an important first step. 12.4 Check variable types When we pull the data from surveys into R, the data may be listed as character, factor, numeric, or logical/Boolean. The tidyverse functions that read in data (e.g., read_csv(), read_excel()) default to have all strings load as character variables. This is important when dealing with survey data, as many strings may be better suited for factors than character variables. For example, let’s revisit the example_srvy data. Taking a glimpse() of the data gives us insight into what it contains: example_srvy %&gt;% glimpse() ## Rows: 10 ## Columns: 6 ## $ id &lt;int&gt; 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ## $ region &lt;int&gt; 1, 1, 2, 2, 3, 4, 4, 3, 4, 2 ## $ q_d1 &lt;int&gt; 1, 1, NA, 1, 1, NA, NA, 2, 2, 2 ## $ q_d2_1 &lt;chr&gt; &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;, &quot;Somewh… ## $ gender &lt;chr&gt; &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;fema… ## $ weight &lt;dbl&gt; 1740, 1428, 496, 550, 1762, 1004, 522, 1099, 1295, 983 The output shows that q_d2_1 is a character variable, but the values of that variable show three options (Very interested / Somewhat interested / Not at all interested.) In this case, we most likely want to change q_d2_1 to be a factor variable and order the factor levels to indicate that this is an ordinal variable. 
Here is some code on how we might approach this task using the {forcats} package (Wickham 2023a): example_srvy_fct &lt;- example_srvy %&gt;% mutate(q_d2_1_fct = factor( q_d2_1, levels = c(&quot;Very interested&quot;, &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;) )) example_srvy_fct %&gt;% glimpse() ## Rows: 10 ## Columns: 7 ## $ id &lt;int&gt; 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ## $ region &lt;int&gt; 1, 1, 2, 2, 3, 4, 4, 3, 4, 2 ## $ q_d1 &lt;int&gt; 1, 1, NA, 1, 1, NA, NA, 2, 2, 2 ## $ q_d2_1 &lt;chr&gt; &quot;Somewhat interested&quot;, &quot;Not at all interested&quot;, &quot;So… ## $ gender &lt;chr&gt; &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;female&quot;, &quot;… ## $ weight &lt;dbl&gt; 1740, 1428, 496, 550, 1762, 1004, 522, 1099, 1295, … ## $ q_d2_1_fct &lt;fct&gt; Somewhat interested, Not at all interested, Somewha… example_srvy_fct %&gt;% count(q_d2_1_fct, q_d2_1) ## # A tibble: 3 × 3 ## q_d2_1_fct q_d2_1 n ## &lt;fct&gt; &lt;chr&gt; &lt;int&gt; ## 1 Very interested Very interested 1 ## 2 Somewhat interested Somewhat interested 6 ## 3 Not at all interested Not at all interested 3 This example dataset also includes a column called region, which is imported as a number (&lt;int&gt;.) This is a good reminder to use the questionnaire and codebook along with the data to find out if the values actually reflect a number or are perhaps a coded categorical variable (see Chapter 3 for more details.) R calculates the mean even if it is not appropriate, leading to the common mistake of applying an average to categorical values instead of a proportion function. 
For example, for ease of coding, we may use the across() function to calculate the mean across all numeric variables: example_des %&gt;% select(-weight) %&gt;% summarize(across(where(is.numeric), ~ survey_mean(.x, na.rm = TRUE))) ## # A tibble: 1 × 6 ## id id_se region region_se q_d1 q_d1_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 5.24 1.12 2.49 0.428 1.38 0.196 In this example, if we do not adjust region to be a factor variable type, we might accidentally report an average region of 2.49 in our findings, which is meaningless. Checking that our variables are appropriate avoids this pitfall and ensures the measures and models are suitable for the variable type. 12.5 Improve debugging skills It is common for analysts working in R to come across warning or error messages, and learning how to debug these messages (i.e., find and fix issues) ensures we can proceed with our work and avoid potential mistakes. We’ve discussed a few examples in this book. For example, if we calculate an average with survey_mean() and get NA instead of a number, it may be because our column has missing values. example_des %&gt;% summarize(mean = survey_mean(q_d1)) ## # A tibble: 1 × 2 ## mean mean_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 NA NaN Including the na.rm = TRUE would resolve the issue: example_des %&gt;% summarize(mean = survey_mean(q_d1, na.rm = TRUE)) ## # A tibble: 1 × 2 ## mean mean_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 1.38 0.196 Another common error message that we may see with survey analysis may look something like the following: example_des %&gt;% svyttest(q_d1~gender) ## Error in UseMethod(&quot;svymean&quot;, design): no applicable method for &#39;svymean&#39; applied to an object of class &quot;formula&quot; In this case, we need to remember that with functions from the {survey} packages like svyttest(), the design object is not the first argument, and we have to use the dot (.) notation (see Chapter 6.) 
Adding in the named argument of design=. fixes this error. example_des %&gt;% svyttest(q_d1 ~ gender, design = .) ## ## Design-based t-test ## ## data: q_d1 ~ gender ## t = 3.5, df = 5, p-value = 0.02 ## alternative hypothesis: true difference in mean is not equal to 0 ## 95 percent confidence interval: ## 0.1878 1.2041 ## sample estimates: ## difference in mean ## 0.696 Often, debugging involves interpreting the message from R. For example, if our code results in this error: Error in `contrasts&lt;-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels We can see that the error has to do with a function requiring a factor with two or more levels and that it has been applied to something else. This ties back to our section on using appropriate variable types. We can check the variable of interest to examine whether it’s the correct type. The internet also offers many resources for debugging. Searching for a specific error message can often lead to a solution. In addition, we can post on community forums like Posit Community for direct help from others. 12.6 Think critically about conclusions Once we have our findings, we need to think critically about them. As mentioned in Chapter 2, many aspects of the study design can impact our interpretation of the results, for example, the number and types of response options provided to the respondent or who was asked the question (both thinking about the full sample and any skip patterns.) Knowing the overall study design can help us accurately think through what the findings may mean and identify any issues with our analyses. Additionally, we should make sure that our survey design object is correctly defined (see Chapter 10), carefully consider how we are managing missing data (see Chapter 11), and follow statistical analysis procedures such as avoiding the model overfitting that comes from using too many variables in our formulas. 
These considerations allow us to conduct our analyses and review findings for statistically significant results. It’s important to note that statistically significant results are not necessarily meaningful or important. With a large enough sample, even very small differences can be statistically significant. Therefore, we want to look at our results in context, such as comparing them with results from other studies or analyzing them in conjunction with confidence intervals and other measures. Communicating the results (see Chapter 8) in an unbiased manner is also a critical step in any analysis project. If we present results without error measures or only present results that support our initial hypotheses, we are not thinking critically and may incorrectly represent the data. As survey data analysts, we often interpret the survey data for the public. We must ensure that we are the best stewards of the data and work to bring light to meaningful and interesting findings that the public wants and needs to know about. References Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ’dplyr’-Like Syntax for Summary Statistics of Survey Data. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Wickham, Hadley. 2023a. forcats: Tools for Working with Categorical Variables (Factors). "],["c13-ncvs-vignette.html", "Chapter 13 National Crime Victimization Survey Vignette 13.1 Introduction 13.2 Data structure 13.3 Survey notation 13.4 Data file preparation 13.5 Survey design objects 13.6 Calculating estimates 13.7 Statistical testing 13.8 Exercises", " Chapter 13 National Crime Victimization Survey Vignette Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(gt) We use data from the United States National Crime Victimization Survey (NCVS.) These data are available in the {srvyrexploR} package as ncvs_2021_incident, ncvs_2021_household, and ncvs_2021_person. 
13.1 Introduction The National Crime Victimization Survey (NCVS) is a household survey sponsored by the Bureau of Justice Statistics (BJS), which collects data on criminal victimization, including characteristics of the crimes, offenders, and victims. Crime types include both household and personal crimes, as well as violent and non-violent crimes. The population of interest of this survey is all people in the United States age 12 and older living in housing units and noninstitutional group quarters. The NCVS has been ongoing since 1992. An earlier survey, the National Crime Survey, was run from 1972 to 1991 (Bureau of Justice Statistics 2017). The survey is administered using a rotating panel. When an address enters the sample, the residents of that address are interviewed every six months for a total of seven interviews. If the initial residents move away from the address during the period and new residents move in, the new residents are included in the survey, as people are not followed when they move. NCVS data are publicly available and distributed by Inter-university Consortium for Political and Social Research (ICPSR), with data going back to 1992. The vignette in this book includes data from 2021 (United States. Bureau of Justice Statistics 2022). The NCVS data structure is complicated, and the User’s Guide contains examples for analysis in SAS, SUDAAN, SPSS, and Stata, but not R (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). This vignette adapts those examples for R. 13.2 Data structure The data from ICPSR are distributed with five files, each having its unique identifier indicated: Address Record - YEARQ, IDHH Household Record - YEARQ, IDHH Person Record - YEARQ, IDHH, IDPER Incident Record - YEARQ, IDHH, IDPER 2021 Collection Year Incident - YEARQ, IDHH, IDPER In this vignette, we focus on the household, person, and incident files and have selected a subset of columns for use in the examples. 
We have included data in the {srvyrexploR} package with this subset of columns, but the complete data files can be downloaded from ICPSR. 13.3 Survey notation The NCVS User’s Guide (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015) uses the following notation: \\(i\\) represents NCVS households, identified on the household-level file with the household identification number IDHH. \\(j\\) represents NCVS individual respondents within household \\(i\\), identified on the person-level file with the person identification number IDPER. \\(k\\) represents reporting periods (i.e., YEARQ) for household \\(i\\) and individual respondent \\(j\\). \\(l\\) represents victimization records for respondent \\(j\\) in household \\(i\\) and reporting period \\(k\\). Each record on the NCVS incident-level file is associated with a victimization record \\(l\\). \\(D\\) represents one or more domain characteristics of interest in the calculation of NCVS estimates. For victimization totals and proportions, domains can be defined on the basis of crime types (e.g., violent crimes, property crimes), characteristics of victims (e.g., age, sex, household income), or characteristics of the victimizations (e.g., victimizations reported to police, victimizations committed with a weapon present). Domains could also be a combination of all of these types of characteristics. For example, in the calculation of victimization rates, domains are defined on the basis of the characteristics of the victims. \\(A_a\\) represents the level \\(a\\) of covariate \\(A\\). Covariate \\(A\\) is defined in the calculation of victimization proportions and represents the characteristic across which we want to obtain the distribution of victimizations in domain \\(D\\). \\(C\\) represents the personal or property crime for which we want to obtain a victimization rate. In this vignette, we discuss four estimates: Victimization totals estimate the number of criminal victimizations with a given characteristic. 
As demonstrated below, these can be calculated from any of the data files. The estimated victimization total, \\(\\hat{t}_D\\) for domain \\(D\\) is estimated as \\[ \\hat{t}_D = \\sum_{ijkl \\in D} v_{ijkl}\\] where \\(v_{ijkl}\\) is the series-adjusted victimization weight for household \\(i\\), respondent \\(j\\), reporting period \\(k\\), and victimization \\(l\\), represented in the data as WGTVICCY. Victimization proportions estimate characteristics among victimizations or victims. Victimization proportions are calculated using the incident data file. The estimated victimization proportion for domain \\(D\\) across level \\(a\\) of covariate \\(A\\), \\(\\hat{p}_{A_a,D}\\) is \\[ \\hat{p}_{A_a,D} =\\frac{\\sum_{ijkl \\in A_a, D} v_{ijkl}}{\\sum_{ijkl \\in D} v_{ijkl}}.\\] The numerator is the number of incidents with a particular characteristic in a domain, and the denominator is the number of incidents in a domain. Victimization rates are estimates of the number of victimizations per 1,000 persons or households in the population.28 Victimization rates are calculated using the household or person-level data files. The estimated victimization rate for crime \\(C\\) in domain \\(D\\) is \\[\\hat{VR}_{C,D}= \\frac{\\sum_{ijkl \\in C,D} v_{ijkl}}{\\sum_{ijk \\in D} w_{ijk}}\\times 1000\\] where \\(w_{ijk}\\) is the person weight (WGTPERCY) for personal crimes or household weight (WGTHHCY) for household crimes. The numerator is the number of incidents in a domain, and the denominator is the number of persons or households in a domain. Notice that the weights in the numerator and denominator are different - this is important, and in the syntax and examples below, we discuss how to make an estimate that involves two weights. Prevalence rates are estimates of the percentage of the population (persons or households) who are victims of a crime. These are estimated using the household or person-level data files. 
The estimated prevalence rate for crime \\(C\\) in domain \\(D\\) is \\[ \\hat{PR}_{C, D}= \\frac{\\sum_{ijk \\in {C,D}} I_{ij}w_{ijk}}{\\sum_{ijk \\in D} w_{ijk}} \\times 100\\] where \\(I_{ij}\\) is an indicator that a person or household in domain \\(D\\) was a victim of crime \\(C\\) at any time in the year. The numerator is the number of victims in domain \\(D\\) for crime \\(C\\), and the denominator is the number of people or households in the population. 13.4 Data file preparation Some work is necessary to prepare the files before analysis. The design variables indicating pseudostratum (V2117) and half-sample code (V2118) are only included on the household file, so they must be added to the person and incident files for any analysis. For victimization rates, we need to know the victimization status for both victims and non-victims. Therefore, the incident file must be summarized and merged onto the household or person files for household-level and person-level crimes, respectively. We begin this vignette by discussing how to create these incident summary files. This is following Section 2.2 of the NCVS User’s Guide (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). 13.4.1 Preparing files for estimation of victimization rates Each record on the incident file represents one victimization, which is not the same as one incident. Some victimizations have several instances that make it difficult for the victim to differentiate the details of these incidents, labeled as “series crimes”. Appendix A of the User’s Guide indicates how to calculate the series weight in other statistical languages. Here, we adapt that code for R. Essentially, if a victimization is a series crime, its series weight is top-coded at 10 based on the number of actual victimizations, that is, even if the crime occurred more than 10 times, it is counted as 10 times to reduce the influence of extreme outliers. 
If an incident is a series crime, but the number of occurrences is unknown, the series weight is set to 6. A description of the variables used to create indicators of series and the associated weights is included in Table 13.1. TABLE 13.1: Codebook for incident variables - related to series weight Variable Description Value Label V4016 How many times incident occur last 6 mos 1-996 Number of times 997 Don’t know V4017 How many incidents 1 1-5 incidents (not a “series”) 2 6 or more incidents 8 Residue (invalid data) V4018 Incidents similar in detail 1 Similar 2 Different (not in a “series”) 8 Residue (invalid data) V4019 Enough detail to distinguish incidents 1 Yes (not a “series”) 2 No (is a “series”) 8 Residue (invalid data) WGTVICCY Adjusted victimization weight Numeric We create four variables to identify series crimes and build the series-adjusted weight. First, we create a variable called series using V4017, V4018, and V4019, where an incident is considered a series crime if there are 6 or more incidents (V4017), the incidents are similar in detail (V4018), and there is not enough detail to distinguish the incidents (V4019). Second, we top-code the number of incidents (V4016) by creating a variable n10v4016, which is set to 10 if V4016 &gt; 10. Third, we create serieswgt from the two new variables series and n10v4016: it equals n10v4016 for series crimes, 6 for series crimes with an unknown number of incidents, and 1 otherwise. Finally, we create the new weight (NEWWGT) by multiplying serieswgt by the existing weight (WGTVICCY). 
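To make these rules concrete, here is a minimal sketch on a toy tibble before we apply them to the NCVS incident file. The four records in toy_inc and all of their values are hypothetical (they are not NCVS data); the recoding mirrors the rules described above, with series equal to 2 for a series crime and 1 otherwise.

```r
library(dplyr)

# Hypothetical toy records, made up for illustration only
toy_inc <- tibble(
  V4017 = c(1, 2, 2, 2),       # 1 = 1-5 incidents, 2 = 6 or more
  V4018 = c(NA, 1, 1, 2),      # 1 = similar, 2 = different
  V4019 = c(NA, 2, 2, NA),     # 1 = enough detail, 2 = not enough detail
  V4016 = c(1, 25, 997, 8),    # number of incidents, 997 = don't know
  WGTVICCY = c(1000, 1000, 1000, 1000)
)

toy_inc_wt <- toy_inc %>%
  mutate(
    # series = 2 only when no "not a series" condition is met
    series = case_when(
      V4017 %in% c(1, 8) ~ 1,
      V4018 %in% c(2, 8) ~ 1,
      V4019 %in% c(1, 8) ~ 1,
      TRUE ~ 2
    ),
    # top-code the incident count at 10; unknown counts become NA
    n10v4016 = case_when(
      V4016 %in% c(997, 998) ~ NA_real_,
      V4016 > 10 ~ 10,
      TRUE ~ V4016
    ),
    # series crimes get the top-coded count (6 if unknown); all others get 1
    serieswgt = case_when(
      series == 2 & is.na(n10v4016) ~ 6,
      series == 2 ~ n10v4016,
      TRUE ~ 1
    ),
    NEWWGT = WGTVICCY * serieswgt
  )

toy_inc_wt$serieswgt
# 1 10 6 1
```

The second record is a series crime with 25 incidents, so it is counted 10 times; the third is a series crime with an unknown count, so it is counted 6 times; the fourth reports 8 incidents, but they differ in detail, so it is not a series crime and keeps a series weight of 1.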
inc_series &lt;- ncvs_2021_incident %&gt;% mutate( series = case_when(V4017 %in% c(1, 8) ~ 1, V4018 %in% c(2, 8) ~ 1, V4019 %in% c(1, 8) ~ 1, TRUE ~ 2 ), n10v4016 = case_when(V4016 %in% c(997, 998) ~ NA_real_, V4016 &gt; 10 ~ 10, TRUE ~ V4016), serieswgt = case_when(series == 2 &amp; is.na(n10v4016) ~ 6, series == 2 ~ n10v4016, TRUE ~ 1), NEWWGT = WGTVICCY * serieswgt ) The next step in preparing the files for estimation is to create indicators on the victimization file for characteristics of interest. Almost all BJS publications limit the analysis to records where the victimization occurred in the United States (where V4022 is not equal to 1). We do this for all estimates as well. A brief codebook of variables for this task is located in Table 13.2 TABLE 13.2: Codebook for incident variables - crime type indicators and characteristics Variable Description Value Label V4022 In what city/town/village 1 Outside U.S. 2 Not inside a city/town/village 3 Same city/town/village as present residence 4 Different city/town/village as present residence 5 Don’t know 6 Don’t know if 2, 4, or 5 V4049 Did offender have weapon 1 Yes 2 No 3 Don’t know V4050 What was weapon 1 At least one good entry 3 Indicates “Yes-Type Weapon-NA” 7 Indicates “Gun Type Unknown” 8 No good entry V4051 Hand gun 0 No 1 Yes V4052 Other gun 0 No 1 Yes V4053 Knife 0 No 1 Yes V4399 Reported to police 1 Yes 2 No 3 Don’t know V4529 Type of crime code 01 Completed rape 02 Attempted rape 03 Sexual attack with serious assault 04 Sexual attack with minor assault 05 Completed robbery with injury from serious assault 06 Completed robbery with injury from minor assault 07 Completed robbery without injury from minor assault 08 Attempted robbery with injury from serious assault 09 Attempted robbery with injury from minor assault 10 Attempted robbery without injury 11 Completed aggravated assault with injury 12 Attempted aggravated assault with weapon 13 Threatened assault with weapon 14 Simple assault completed with 
injury 15 Sexual assault without injury 16 Unwanted sexual contact without force 17 Assault without weapon without injury 18 Verbal threat of rape 19 Verbal threat of sexual assault 20 Verbal threat of assault 21 Completed purse snatching 22 Attempted purse snatching 23 Pocket picking (completed only) 31 Completed burglary, forcible entry 32 Completed burglary, unlawful entry without force 33 Attempted forcible entry 40 Completed motor vehicle theft 41 Attempted motor vehicle theft 54 Completed theft less than $10 55 Completed theft $10 to $49 56 Completed theft $50 to $249 57 Completed theft $250 or greater 58 Completed theft value NA 59 Attempted theft Using these variables, we create the following indicators: Property crime V4529 &gt;= 31 Variable: Property Violent crime V4529 &lt;= 20 Variable: Violent Property crime reported to the police V4529 &gt;= 31 and V4399=1 Variable: Property_ReportPolice Violent crime reported to the police V4529 &lt;= 20 and V4399=1 Variable: Violent_ReportPolice Aggravated assault without a weapon V4529 in 11:13 and V4049=2 Variable: AAST_NoWeap Aggravated assault with a firearm V4529 in 11:13 and V4049=1 and (V4051=1 or V4052=1 or V4050=7) Variable: AAST_Firearm Aggravated assault with a knife or sharp object V4529 in 11:13 and V4049=1 and (V4053=1 or V4054=1) Variable: AAST_Knife Aggravated assault with another type of weapon V4529 in 11:13 and V4049=1 and V4050=1 and not firearm or knife Variable: AAST_Other inc_ind &lt;- inc_series %&gt;% filter(V4022 != 1) %&gt;% mutate( WeapCat = case_when( is.na(V4049) ~ NA_character_, V4049 == 2 ~ &quot;NoWeap&quot;, V4049 == 3 ~ &quot;UnkWeapUse&quot;, V4050 == 3 ~ &quot;Other&quot;, V4051 == 1 | V4052 == 1 | V4050 == 7 ~ &quot;Firearm&quot;, V4053 == 1 | V4054 == 1 ~ &quot;Knife&quot;, TRUE ~ &quot;Other&quot; ), V4529_num = parse_number(as.character(V4529)), ReportPolice = V4399 == 1, Property = V4529_num &gt;= 31, Violent = V4529_num &lt;= 20, Property_ReportPolice = Property &amp; 
ReportPolice, Violent_ReportPolice = Violent &amp; ReportPolice, AAST = V4529_num %in% 11:13, AAST_NoWeap = AAST &amp; WeapCat == &quot;NoWeap&quot;, AAST_Firearm = AAST &amp; WeapCat == &quot;Firearm&quot;, AAST_Knife = AAST &amp; WeapCat == &quot;Knife&quot;, AAST_Other = AAST &amp; WeapCat == &quot;Other&quot; ) This is a good point to pause to look at the output of crosswalks between an original variable and a derived one to check that the logic was programmed correctly and that everything ends up in the expected category. inc_series %&gt;% count(V4022) ## # A tibble: 6 × 2 ## V4022 n ## &lt;fct&gt; &lt;int&gt; ## 1 1 34 ## 2 2 65 ## 3 3 7697 ## 4 4 1143 ## 5 5 39 ## 6 8 4 inc_ind %&gt;% count(V4022) ## # A tibble: 5 × 2 ## V4022 n ## &lt;fct&gt; &lt;int&gt; ## 1 2 65 ## 2 3 7697 ## 3 4 1143 ## 4 5 39 ## 5 8 4 inc_ind %&gt;% count(WeapCat, V4049, V4050, V4051, V4052, V4052, V4053, V4054) ## # A tibble: 13 × 8 ## WeapCat V4049 V4050 V4051 V4052 V4053 V4054 n ## &lt;chr&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Firearm 1 1 0 1 0 0 15 ## 2 Firearm 1 1 0 1 1 1 1 ## 3 Firearm 1 1 1 0 0 0 125 ## 4 Firearm 1 1 1 0 1 0 2 ## 5 Firearm 1 1 1 1 0 0 3 ## 6 Firearm 1 7 0 0 0 0 3 ## 7 Knife 1 1 0 0 0 1 14 ## 8 Knife 1 1 0 0 1 0 71 ## 9 NoWeap 2 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 1794 ## 10 Other 1 1 0 0 0 0 147 ## 11 Other 1 3 0 0 0 0 26 ## 12 UnkWeapUse 3 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 519 ## 13 &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; &lt;NA&gt; 6228 inc_ind %&gt;% count(V4529, Property, Violent, AAST) %&gt;% print(n = 40) ## # A tibble: 34 × 5 ## V4529 Property Violent AAST n ## &lt;fct&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;int&gt; ## 1 1 FALSE TRUE FALSE 45 ## 2 2 FALSE TRUE FALSE 20 ## 3 3 FALSE TRUE FALSE 11 ## 4 4 FALSE TRUE FALSE 3 ## 5 5 FALSE TRUE FALSE 24 ## 6 6 FALSE TRUE FALSE 26 ## 7 7 FALSE TRUE FALSE 59 ## 8 8 FALSE TRUE FALSE 5 ## 9 9 
FALSE TRUE FALSE 7 ## 10 10 FALSE TRUE FALSE 57 ## 11 11 FALSE TRUE TRUE 97 ## 12 12 FALSE TRUE TRUE 91 ## 13 13 FALSE TRUE TRUE 163 ## 14 14 FALSE TRUE FALSE 165 ## 15 15 FALSE TRUE FALSE 24 ## 16 16 FALSE TRUE FALSE 12 ## 17 17 FALSE TRUE FALSE 357 ## 18 18 FALSE TRUE FALSE 14 ## 19 19 FALSE TRUE FALSE 3 ## 20 20 FALSE TRUE FALSE 607 ## 21 21 FALSE FALSE FALSE 2 ## 22 22 FALSE FALSE FALSE 2 ## 23 23 FALSE FALSE FALSE 19 ## 24 31 TRUE FALSE FALSE 248 ## 25 32 TRUE FALSE FALSE 634 ## 26 33 TRUE FALSE FALSE 188 ## 27 40 TRUE FALSE FALSE 256 ## 28 41 TRUE FALSE FALSE 97 ## 29 54 TRUE FALSE FALSE 407 ## 30 55 TRUE FALSE FALSE 1006 ## 31 56 TRUE FALSE FALSE 1686 ## 32 57 TRUE FALSE FALSE 1420 ## 33 58 TRUE FALSE FALSE 798 ## 34 59 TRUE FALSE FALSE 395 inc_ind %&gt;% count(ReportPolice, V4399) ## # A tibble: 4 × 3 ## ReportPolice V4399 n ## &lt;lgl&gt; &lt;fct&gt; &lt;int&gt; ## 1 FALSE 2 5670 ## 2 FALSE 3 103 ## 3 FALSE 8 12 ## 4 TRUE 1 3163 inc_ind %&gt;% count(AAST, WeapCat, AAST_NoWeap, AAST_Firearm, AAST_Knife, AAST_Other) ## # A tibble: 11 × 7 ## AAST WeapCat AAST_NoWeap AAST_Firearm AAST_Knife AAST_Other n ## &lt;lgl&gt; &lt;chr&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;lgl&gt; &lt;int&gt; ## 1 FALSE Firearm FALSE FALSE FALSE FALSE 34 ## 2 FALSE Knife FALSE FALSE FALSE FALSE 23 ## 3 FALSE NoWeap FALSE FALSE FALSE FALSE 1769 ## 4 FALSE Other FALSE FALSE FALSE FALSE 27 ## 5 FALSE UnkWeapUse FALSE FALSE FALSE FALSE 516 ## 6 FALSE &lt;NA&gt; FALSE FALSE FALSE FALSE 6228 ## 7 TRUE Firearm FALSE TRUE FALSE FALSE 115 ## 8 TRUE Knife FALSE FALSE TRUE FALSE 62 ## 9 TRUE NoWeap TRUE FALSE FALSE FALSE 25 ## 10 TRUE Other FALSE FALSE FALSE TRUE 146 ## 11 TRUE UnkWeapUse FALSE FALSE FALSE FALSE 3 After creating indicators of victimization types and characteristics, the file is summarized, and crimes are summed across persons or households by YEARQ. 
Property crimes (i.e., crimes committed against households, such as household burglary or motor vehicle theft) are summed across households, and personal crimes (i.e., crimes committed against an individual, such as assault, robbery, and personal theft) are summed across persons. The indicators are summed using our created series weight variable (serieswgt.) Additionally, the existing weight variable (WGTVICCY) needs to be retained for later analysis. inc_hh_sums &lt;- inc_ind %&gt;% filter(V4529_num &gt; 23) %&gt;% # restrict to household crimes group_by(YEARQ, IDHH) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(starts_with(&quot;Property&quot;), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) inc_pers_sums &lt;- inc_ind %&gt;% filter(V4529_num &lt;= 23) %&gt;% # restrict to person crimes group_by(YEARQ, IDHH, IDPER) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(c(starts_with(&quot;Violent&quot;), starts_with(&quot;AAST&quot;)), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) Now, we merge the victimization summary files into the appropriate files. For any record on the household or person file that is not on the victimization file, the victimization counts are set to 0 after merging. In this step, we also create the victimization adjustment factor. See Section 2.2.4 in the User’s Guide for details of why this adjustment is created (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). It is calculated as follows: \\[ A_{ijk}=\\frac{v_{ijk}}{w_{ijk}}\\] where \\(w_{ijk}\\) is the person weight (WGTPERCY) for personal crimes or the household weight (WGTHHCY) for household crimes, and \\(v_{ijk}\\) is the victimization weight (WGTVICCY) for household \\(i\\), respondent \\(j\\), in reporting period \\(k\\). The adjustment factor is set to 0 if no incidents are reported. 
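As a minimal sketch of this merge, consider a toy person file and a toy incident summary. Both tibbles (toy_pers and toy_inc_sums) and their values are hypothetical, and we use coalesce() here instead of replace_na() only to keep the example dependent on {dplyr} alone. Respondents who do not appear in the incident summary end up with a victimization count of 0 and an adjustment factor of 0.

```r
library(dplyr)

# Hypothetical person file: three respondents in one reporting period
toy_pers <- tibble(
  YEARQ = 2021.1,
  IDHH = c("A", "A", "B"),
  IDPER = c("A1", "A2", "B1"),
  WGTPERCY = c(2000, 2000, 1500)
)

# Hypothetical incident summary: only respondent A1 reported victimizations
toy_inc_sums <- tibble(
  YEARQ = 2021.1,
  IDHH = "A",
  IDPER = "A1",
  WGTVICCY = 2200,
  Violent = 3
)

toy_vsum <- toy_pers %>%
  full_join(toy_inc_sums, by = c("YEARQ", "IDHH", "IDPER")) %>%
  mutate(
    # non-victims have no incident record, so their count is 0
    Violent = coalesce(Violent, 0),
    # A = v / w for victims, 0 for everyone else
    ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY)
  )

toy_vsum %>% select(IDPER, Violent, ADJINC_WT)
```

In this toy example, respondent A1 has an adjustment factor of 2200 / 2000 = 1.1, while A2 and B1 have a factor of 0.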
hh_z_list &lt;- rep(0, ncol(inc_hh_sums) - 3) %&gt;% as.list() %&gt;% setNames(names(inc_hh_sums)[-(1:3)]) pers_z_list &lt;- rep(0, ncol(inc_pers_sums) - 4) %&gt;% as.list() %&gt;% setNames(names(inc_pers_sums)[-(1:4)]) hh_vsum &lt;- ncvs_2021_household %&gt;% full_join(inc_hh_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) %&gt;% replace_na(hh_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTHHCY)) pers_vsum &lt;- ncvs_2021_person %&gt;% full_join(inc_pers_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% replace_na(pers_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY)) 13.4.2 Derived demographic variables A final step in file preparation for the household and person files is creating any derived variables on the household and person files, such as income categories or age categories, for subgroup analysis. We can do this step before or after merging the victimization counts. 13.4.2.1 Household variables For the household file, we create categories for tenure (rental status), urbanicity, income, place size, and region. A codebook of the household variables is located in Table 13.3. 
TABLE 13.3: Codebook for household variables Variable Description Value Label V2015 Tenure 1 Owned or being bought 2 Rented for cash 3 No cash rent SC214A Household Income 01 Less than $5,000 02 $5,000 to $7,499 03 $7,500 to $9,999 04 $10,000 to $12,499 05 $12,500 to $14,999 06 $15,000 to $17,499 07 $17,500 to $19,999 08 $20,000 to $24,999 09 $25,000 to $29,999 10 $30,000 to $34,999 11 $35,000 to $39,999 12 $40,000 to $49,999 13 $50,000 to $74,999 15 $75,000 to $99,999 16 $100,000-$149,999 17 $150,000-$199,999 18 $200,000 or more V2126B Place Size Code 00 Not in a place 13 Under 10,000 16 10,000-49,999 17 50,000-99,999 18 100,000-249,999 19 250,000-499,999 20 500,000-999,999 21 1,000,000-2,499,999 22 2,500,000-4,999,999 23 5,000,000 or more V2127B Region 1 Northeast 2 Midwest 3 South 4 West V2143 Urbanicity 1 Urban 2 Suburban 3 Rural hh_vsum_der &lt;- hh_vsum %&gt;% mutate( Tenure = factor(case_when(V2015 == 1 ~ &quot;Owned&quot;, !is.na(V2015) ~ &quot;Rented&quot;), levels = c(&quot;Owned&quot;, &quot;Rented&quot;)), Urbanicity = factor(case_when(V2143 == 1 ~ &quot;Urban&quot;, V2143 == 2 ~ &quot;Suburban&quot;, V2143 == 3 ~ &quot;Rural&quot;), levels = c(&quot;Urban&quot;, &quot;Suburban&quot;, &quot;Rural&quot;)), SC214A_num = as.numeric(as.character(SC214A)), Income = case_when(SC214A_num &lt;= 8 ~ &quot;Less than $25,000&quot;, SC214A_num &lt;= 12 ~ &quot;$25,000-49,999&quot;, SC214A_num &lt;= 15 ~ &quot;$50,000-99,999&quot;, SC214A_num &lt;= 17 ~ &quot;$100,000-199,999&quot;, SC214A_num &lt;= 18 ~ &quot;$200,000 or more&quot;), Income = fct_reorder(Income, SC214A_num, .na_rm = FALSE), PlaceSize = case_match(as.numeric(as.character(V2126B)), 0 ~ &quot;Not in a place&quot;, 13 ~ &quot;Under 10,000&quot;, 16 ~ &quot;10,000-49,999&quot;, 17 ~ &quot;50,000-99,999&quot;, 18 ~ &quot;100,000-249,999&quot;, 19 ~ &quot;250,000-499,999&quot;, 20 ~ &quot;500,000-999,999&quot;, c(21, 22, 23) ~ &quot;1,000,000 or more&quot;), PlaceSize = fct_reorder(PlaceSize, 
as.numeric(V2126B)), Region = case_match(as.numeric(V2127B), 1 ~ &quot;Northeast&quot;, 2 ~ &quot;Midwest&quot;, 3 ~ &quot;South&quot;, 4 ~ &quot;West&quot;), Region = fct_reorder(Region, as.numeric(V2127B)) ) As before, we want to check to make sure the recoded variables we create match the existing data as expected. hh_vsum_der %&gt;% count(Tenure, V2015) ## # A tibble: 4 × 3 ## Tenure V2015 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Owned 1 101944 ## 2 Rented 2 46269 ## 3 Rented 3 1925 ## 4 &lt;NA&gt; &lt;NA&gt; 106322 hh_vsum_der %&gt;% count(Urbanicity, V2143) ## # A tibble: 3 × 3 ## Urbanicity V2143 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Urban 1 26878 ## 2 Suburban 2 173491 ## 3 Rural 3 56091 hh_vsum_der %&gt;% count(Income, SC214A) ## # A tibble: 18 × 3 ## Income SC214A n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Less than $25,000 1 7841 ## 2 Less than $25,000 2 2626 ## 3 Less than $25,000 3 3949 ## 4 Less than $25,000 4 5546 ## 5 Less than $25,000 5 5445 ## 6 Less than $25,000 6 4821 ## 7 Less than $25,000 7 5038 ## 8 Less than $25,000 8 11887 ## 9 $25,000-49,999 9 11550 ## 10 $25,000-49,999 10 13689 ## 11 $25,000-49,999 11 13655 ## 12 $25,000-49,999 12 23282 ## 13 $50,000-99,999 13 44601 ## 14 $50,000-99,999 15 33353 ## 15 $100,000-199,999 16 34287 ## 16 $100,000-199,999 17 15317 ## 17 $200,000 or more 18 16892 ## 18 &lt;NA&gt; &lt;NA&gt; 2681 hh_vsum_der %&gt;% count(PlaceSize, V2126B) ## # A tibble: 10 × 3 ## PlaceSize V2126B n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Not in a place 0 69484 ## 2 Under 10,000 13 39873 ## 3 10,000-49,999 16 53002 ## 4 50,000-99,999 17 27205 ## 5 100,000-249,999 18 24461 ## 6 250,000-499,999 19 13111 ## 7 500,000-999,999 20 15194 ## 8 1,000,000 or more 21 6167 ## 9 1,000,000 or more 22 3857 ## 10 1,000,000 or more 23 4106 hh_vsum_der %&gt;% count(Region, V2127B) ## # A tibble: 4 × 3 ## Region V2127B n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Northeast 1 41585 ## 2 Midwest 2 74666 ## 3 South 3 87783 ## 4 
West 4 52426 13.4.2.2 Person variables For the person file, we create categories for sex, race/Hispanic origin, age categories, and marital status. A codebook of the person variables is located in Table 13.4. We also merge the household demographics to the person file as well as the design variables (V2117 and V2118). TABLE 13.4: Codebook for person variables Variable Description Value Label V3014 Age 12 through 90 V3015 Current Marital Status 1 Married 2 Widowed 3 Divorced 4 Separated 5 Never married V3018 Sex 1 Male 2 Female V3023A Race 01 White only 02 Black only 03 American Indian, Alaska native only 04 Asian only 05 Hawaiian/Pacific Islander only 06 White-Black 07 White-American Indian 08 White-Asian 09 White-Hawaiian 10 Black-American Indian 11 Black-Asian 12 Black-Hawaiian/Pacific Islander 13 American Indian-Asian 14 Asian-Hawaiian/Pacific Islander 15 White-Black-American Indian 16 White-Black-Asian 17 White-American Indian-Asian 18 White-Asian-Hawaiian 19 2 or 3 races 20 4 or 5 races V3024 Hispanic Origin 1 Yes 2 No NHOPI &lt;- &quot;Native Hawaiian or Other Pacific Islander&quot; pers_vsum_der &lt;- pers_vsum %&gt;% mutate( Sex = factor(case_when(V3018 == 1 ~ &quot;Male&quot;, V3018 == 2 ~ &quot;Female&quot;)), RaceHispOrigin = factor(case_when(V3024 == 1 ~ &quot;Hispanic&quot;, V3023A == 1 ~ &quot;White&quot;, V3023A == 2 ~ &quot;Black&quot;, V3023A == 4 ~ &quot;Asian&quot;, V3023A == 5 ~ NHOPI, TRUE ~ &quot;Other&quot;), levels = c(&quot;White&quot;, &quot;Black&quot;, &quot;Hispanic&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;)), V3014_num = as.numeric(as.character(V3014)), AgeGroup = case_when(V3014_num &lt;= 17 ~ &quot;12-17&quot;, V3014_num &lt;= 24 ~ &quot;18-24&quot;, V3014_num &lt;= 34 ~ &quot;25-34&quot;, V3014_num &lt;= 49 ~ &quot;35-49&quot;, V3014_num &lt;= 64 ~ &quot;50-64&quot;, V3014_num &lt;= 90 ~ &quot;65 or older&quot;), AgeGroup = fct_reorder(AgeGroup, V3014_num), MaritalStatus = factor(case_when(V3015 == 1 ~ 
&quot;Married&quot;, V3015 == 2 ~ &quot;Widowed&quot;, V3015 == 3 ~ &quot;Divorced&quot;, V3015 == 4 ~ &quot;Separated&quot;, V3015 == 5 ~ &quot;Never married&quot;), levels = c(&quot;Never married&quot;, &quot;Married&quot;, &quot;Widowed&quot;,&quot;Divorced&quot;, &quot;Separated&quot;)) ) %&gt;% left_join(hh_vsum_der %&gt;% select(YEARQ, IDHH, V2117, V2118, Tenure:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) As before, we want to check to make sure the recoded variables we create match the existing data as expected. pers_vsum_der %&gt;% count(Sex, V3018) ## # A tibble: 2 × 3 ## Sex V3018 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Female 2 150956 ## 2 Male 1 140922 pers_vsum_der %&gt;% count(RaceHispOrigin, V3024) ## # A tibble: 11 × 3 ## RaceHispOrigin V3024 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 White 2 197292 ## 2 White 8 883 ## 3 Black 2 29947 ## 4 Black 8 120 ## 5 Hispanic 1 41450 ## 6 Asian 2 16015 ## 7 Asian 8 61 ## 8 Native Hawaiian or Other Pacific Islander 2 891 ## 9 Native Hawaiian or Other Pacific Islander 8 9 ## 10 Other 2 5161 ## 11 Other 8 49 pers_vsum_der %&gt;% filter(RaceHispOrigin != &quot;Hispanic&quot; | is.na(RaceHispOrigin)) %&gt;% count(RaceHispOrigin, V3023A) ## # A tibble: 20 × 3 ## RaceHispOrigin V3023A n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 White 1 198175 ## 2 Black 2 30067 ## 3 Asian 4 16076 ## 4 Native Hawaiian or Other Pacific Islander 5 900 ## 5 Other 3 1319 ## 6 Other 6 1217 ## 7 Other 7 1025 ## 8 Other 8 837 ## 9 Other 9 184 ## 10 Other 10 178 ## 11 Other 11 87 ## 12 Other 12 27 ## 13 Other 13 13 ## 14 Other 14 53 ## 15 Other 15 136 ## 16 Other 16 45 ## 17 Other 17 11 ## 18 Other 18 33 ## 19 Other 19 22 ## 20 Other 20 23 pers_vsum_der %&gt;% group_by(AgeGroup) %&gt;% summarize(minAge = min(V3014), maxAge = max(V3014), .groups = &quot;drop&quot;) ## # A tibble: 6 × 3 ## AgeGroup minAge maxAge ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 12-17 12 17 ## 2 18-24 18 24 ## 3 25-34 25 34 ## 4 35-49 35 49 ## 5 
50-64 50 64 ## 6 65 or older 65 90 pers_vsum_der %&gt;% count(MaritalStatus, V3015) ## # A tibble: 6 × 3 ## MaritalStatus V3015 n ## &lt;fct&gt; &lt;fct&gt; &lt;int&gt; ## 1 Never married 5 90425 ## 2 Married 1 148131 ## 3 Widowed 2 17668 ## 4 Divorced 3 28596 ## 5 Separated 4 4524 ## 6 &lt;NA&gt; 8 2534 We then create tibbles that contain only the variables we need, which makes it easier for analyses. hh_vsum_slim &lt;- hh_vsum_der %&gt;% select(YEARQ:V2118, WGTVICCY:ADJINC_WT, Tenure, Urbanicity, Income, PlaceSize, Region) pers_vsum_slim &lt;- pers_vsum_der %&gt;% select(YEARQ:WGTPERCY, WGTVICCY:ADJINC_WT, Sex:Region) To calculate estimates about types of crime, such as what percentage of violent crimes are reported to the police, we must use the incident file. The incident file is not guaranteed to have every pseudostratum and half-sample code, so dummy records are created to append before estimation. Finally, we merge demographic variables onto the incident tibble. dummy_records &lt;- hh_vsum_slim %&gt;% distinct(V2117, V2118) %&gt;% mutate(Dummy = 1, WGTVICCY = 1, NEWWGT = 1) inc_analysis &lt;- inc_ind %&gt;% mutate(Dummy = 0) %&gt;% left_join(select(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% bind_rows(dummy_records) %&gt;% select(YEARQ:IDPER, WGTVICCY, NEWWGT, V4529, WeapCat, ReportPolice, Property:Region) The tibbles hh_vsum_slim, pers_vsum_slim, and inc_analysis can now be used to create design objects and calculate crime rate estimates. 13.5 Survey design objects All the data prep above is necessary to prepare the data for survey analysis. At this point, we can create the design objects and finally begin analysis. We create three design objects for different types of analysis as they depend on which type of estimate we are creating. For the incident data, the weight of analysis is NEWWGT, which we constructed previously. 
The household and person-level data use WGTHHCY and WGTPERCY, respectively. For all analyses, V2117 is the strata variable, and V2118 is the cluster/PSU variable for analysis. All this information can be found in the User’s Guide (Shook-Sa, Bonnie, Couzens, G. Lance, and Berzofsky, Marcus 2015). inc_des &lt;- inc_analysis %&gt;% as_survey( weight = NEWWGT, strata = V2117, ids = V2118, nest = TRUE ) hh_des &lt;- hh_vsum_slim %&gt;% as_survey( weight = WGTHHCY, strata = V2117, ids = V2118, nest = TRUE ) pers_des &lt;- pers_vsum_slim %&gt;% as_survey( weight = WGTPERCY, strata = V2117, ids = V2118, nest = TRUE ) 13.6 Calculating estimates Now that we have prepared our data and created the design objects, we can calculate our estimates. As a reminder, those are: Victimization totals estimate the number of criminal victimizations with a given characteristic. Victimization proportions estimate characteristics among victimizations or victims. Victimization rates are estimates of the number of victimizations per 1,000 persons or households in the population. Prevalence rates are estimates of the percentage of the population (persons or households) who are victims of a crime. 13.6.1 Estimation 1: Victimization totals There are two ways to calculate victimization totals. Using the incident design object (inc_des) is the most straightforward method, but the person (pers_des) and household (hh_des) design objects can be used as well if the adjustment factor (ADJINC_WT) is incorporated. In the example below, the total number of property and violent victimizations is first calculated using the incident file and then using the household and person design objects. The incident file is smaller and estimation is faster using that file, but the estimates are the same as illustrated in Table 13.5, Table 13.6, and Table 13.7. 
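To see why the two approaches agree, here is a small base R sketch of the arithmetic behind the point estimates only (it does not use design objects or compute standard errors). The data frames toy_inc and toy_pers and all weights in them are hypothetical. The sketch assumes, as in our file preparation, that the victimization weight is constant across a respondent's incidents.

```r
# Hypothetical incident records for two respondents
toy_inc <- data.frame(
  IDPER = c("A1", "A1", "B1"),
  WGTVICCY = c(2200, 2200, 1800),
  serieswgt = c(1, 6, 1),
  Violent = c(TRUE, TRUE, TRUE)
)

# Method 1: incident-level total, weighting each record by WGTVICCY * serieswgt
total_inc <- sum(toy_inc$WGTVICCY * toy_inc$serieswgt * toy_inc$Violent)

# Hypothetical person file; C1 reported no incidents
toy_pers <- data.frame(
  IDPER = c("A1", "B1", "C1"),
  WGTPERCY = c(2000, 1500, 1700)
)

# Summarize incidents to one row per person
viol_count <- sapply(split(toy_inc$Violent * toy_inc$serieswgt, toy_inc$IDPER), sum)
vic_wt <- sapply(split(toy_inc$WGTVICCY, toy_inc$IDPER), function(x) x[1])

# Merge onto the person file; non-victims get a count and adjustment of 0
idx <- match(toy_pers$IDPER, names(viol_count))
toy_pers$Violent <- ifelse(is.na(idx), 0, viol_count[idx])
toy_pers$ADJINC_WT <- ifelse(is.na(idx), 0, vic_wt[idx] / toy_pers$WGTPERCY)

# Method 2: person-level total of Violent * ADJINC_WT, weighted by WGTPERCY
total_pers <- sum(toy_pers$WGTPERCY * toy_pers$Violent * toy_pers$ADJINC_WT)

all.equal(total_inc, total_pers)
```

Multiplying by the adjustment factor cancels the person weight and substitutes the victimization weight, so both totals equal 17,200 in this toy example.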
# Method 1: totals from the incident design object
vt1 &lt;- inc_des %&gt;%
  summarize(
    Property_Vzn = survey_total(Property, na.rm = TRUE),
    Violent_Vzn = survey_total(Violent, na.rm = TRUE)
  ) %&gt;%
  gt() %&gt;%
  tab_spanner(label = &quot;Property crime&quot;, columns = starts_with(&quot;Property&quot;)) %&gt;%
  tab_spanner(label = &quot;Violent crime&quot;, columns = starts_with(&quot;Violent&quot;)) %&gt;%
  cols_label(ends_with(&quot;Vzn&quot;) ~ &quot;Total&quot;, ends_with(&quot;se&quot;) ~ &quot;S.E.&quot;) %&gt;%
  fmt_number(decimals = 0)

# Method 2a: property totals from the household design object, using ADJINC_WT
vt2a &lt;- hh_des %&gt;%
  summarize(Property_Vzn = survey_total(Property * ADJINC_WT, na.rm = TRUE)) %&gt;%
  gt() %&gt;%
  tab_spanner(label = &quot;Property crime&quot;, columns = starts_with(&quot;Property&quot;)) %&gt;%
  cols_label(ends_with(&quot;Vzn&quot;) ~ &quot;Total&quot;, ends_with(&quot;se&quot;) ~ &quot;S.E.&quot;) %&gt;%
  fmt_number(decimals = 0)

# Method 2b: violent totals from the person design object, using ADJINC_WT
vt2b &lt;- pers_des %&gt;%
  summarize(Violent_Vzn = survey_total(Violent * ADJINC_WT, na.rm = TRUE)) %&gt;%
  gt() %&gt;%
  tab_spanner(label = &quot;Violent crime&quot;, columns = starts_with(&quot;Violent&quot;)) %&gt;%
  cols_label(ends_with(&quot;Vzn&quot;) ~ &quot;Total&quot;, ends_with(&quot;se&quot;) ~ &quot;S.E.&quot;) %&gt;%
  fmt_number(decimals = 0)

TABLE 13.5: Estimates of total property and violent victimizations with standard errors calculated using the incident design object, 2021 (vt1)

Property crime            Violent crime
Total        S.E.         Total        S.E.
11,682,056   263,844      4,598,306    198,115

TABLE 13.6: Estimates of total property victimizations with standard errors calculated using the household design object, 2021 (vt2a)

Property crime
Total        S.E.
11,682,056   263,844

TABLE 13.7: Estimates of total violent victimizations with standard errors calculated using the person design object, 2021 (vt2b)

Violent crime
Total        S.E.
4,598,306    198,115

The victimization totals estimated using the incident file equal those estimated using the person and household files. There were an estimated 11,682,056 property victimizations and 4,598,306 violent victimizations in 2021.

13.6.2 Estimation 2: Victimization proportions

Victimization proportions describe features of a victimization. The key here is that these are estimates among victimizations, not among the population. These types of estimates can only be calculated using the incident design object (inc_des).
For example, we could be interested in the percentage of property victimizations reported to the police. The following code calculates the estimate, its standard error, and a 95% confidence interval:

prop1 &lt;- inc_des %&gt;%
  filter(Property) %&gt;%
  summarize(Pct = survey_mean(ReportPolice, na.rm = TRUE, proportion = TRUE,
                              vartype = c(&quot;se&quot;, &quot;ci&quot;)) * 100)

prop1

## # A tibble: 1 × 4
## Pct Pct_se Pct_low Pct_upp
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 30.8 0.798 29.2 32.4

Or, we could estimate the percentage of violent victimizations that occurred in urban areas:

prop2 &lt;- inc_des %&gt;%
  filter(Violent) %&gt;%
  summarize(Pct = survey_mean(Urbanicity == &quot;Urban&quot;, na.rm = TRUE) * 100)

prop2

## # A tibble: 1 × 2
## Pct Pct_se
## &lt;dbl&gt; &lt;dbl&gt;
## 1 18.1 1.49

In 2021, we estimate that 30.8% of property crimes were reported to the police, and 18.1% of violent crimes occurred in urban areas.

13.6.3 Estimation 3: Victimization rates

Victimization rates measure the number of victimizations per population. They are not an estimate of the proportion of households or persons who are victimized, which is a prevalence rate described in Section 13.6.4. Victimization rates are estimated using the household (hh_des) or person (pers_des) design objects depending on the type of crime, and the adjustment factor (ADJINC_WT) must be incorporated. We return to the example of property and violent victimizations used for victimization totals (Section 13.6.1). In the following example, the property victimization total is calculated as above, along with the property victimization rate (using survey_mean()) and the population size (using survey_total()).

Victimization rates use the incident weight in the numerator and the person or household weight in the denominator. This is accomplished by multiplying the estimate of interest by the weight adjustment (ADJINC_WT) when calculating the rates. Let’s look at an example of property victimization.
vr_prop &lt;- hh_des %&gt;%
  summarize(
    Property_Vzn = survey_total(Property * ADJINC_WT, na.rm = TRUE),
    Property_Rate = survey_mean(Property * ADJINC_WT * 1000, na.rm = TRUE),
    PopSize = survey_total(1, vartype = NULL)
  )

vr_prop

## # A tibble: 1 × 5
## Property_Vzn Property_Vzn_se Property_Rate Property_Rate_se PopSize
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 11682056. 263844. 90.3 1.95 129319232.

In the output above, the estimated property victimization rate in 2021 was 90.3 per 1,000 households. This matches the result of dividing the number of victimizations by the population size and multiplying by 1,000, as demonstrated in the following code and output.

vr_prop %&gt;%
  select(-ends_with(&quot;se&quot;)) %&gt;%
  mutate(Property_Rate_manual = Property_Vzn / PopSize * 1000)

## # A tibble: 1 × 4
## Property_Vzn Property_Rate PopSize Property_Rate_manual
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 11682056. 90.3 129319232. 90.3

Victimization rates can also be calculated based on particular characteristics of the victimization. In the following example, we calculate the rate of aggravated assault with no weapon, with a firearm, with a knife, and with another weapon.

pers_des %&gt;%
  summarize(across(
    starts_with(&quot;AAST_&quot;),
    ~ survey_mean(. * ADJINC_WT * 1000, na.rm = TRUE)
  ))

## # A tibble: 1 × 8
## AAST_NoWeap AAST_NoWeap_se AAST_Firearm AAST_Firearm_se AAST_Knife
## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 0.249 0.0595 0.860 0.101 0.455
## # ℹ 3 more variables: AAST_Knife_se &lt;dbl&gt;, AAST_Other &lt;dbl&gt;,
## # AAST_Other_se &lt;dbl&gt;

Analysts often want to calculate victimization rates by several characteristics. For example, we may want to calculate the violent victimization rate and aggravated assault rate by sex, race/Hispanic origin, age group, marital status, and household income. This requires a separate group_by() statement for each categorization.
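For a single characteristic, the grouped calculation might look like the following minimal sketch. It does not use the NCVS files: the data set toy, the design object toy_des, and the columns stratum, psu, wt, and adj are hypothetical stand-ins for the person-level data, pers_des, V2117, V2118, WGTPERCY, and ADJINC_WT, and the values are made up purely for illustration.

```r
library(dplyr)
library(srvyr)

# Hypothetical toy person-level data (NOT NCVS data):
# two strata, two PSUs per stratum, equal person weights
toy <- tibble(
  stratum = c(1, 1, 1, 1, 2, 2, 2, 2),
  psu     = c(1, 1, 2, 2, 1, 1, 2, 2),
  wt      = rep(100, 8),
  Sex     = factor(c("Female", "Male", "Female", "Male",
                     "Female", "Male", "Female", "Male")),
  Violent = c(1, 0, 0, 0, 1, 0, 0, 1),     # victimization counts
  adj     = c(1.2, 0, 0, 0, 0.8, 0, 0, 1)  # stand-in for ADJINC_WT
)

# Stratified cluster design, mirroring the structure of pers_des
toy_des <- toy %>%
  as_survey(weights = wt, strata = stratum, ids = psu, nest = TRUE)

# Victimization rate per 1,000 persons for each level of one variable
toy_rates <- toy_des %>%
  group_by(Sex) %>%
  summarize(Violent_Rate = survey_mean(Violent * adj * 1000, na.rm = TRUE))

toy_rates
```

Each additional demographic variable would need its own copy of the group_by() and summarize() steps, which quickly becomes repetitive.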
Thus, we make a function to do this and then use map() from the {purrr} package (part of the tidyverse) to loop through the variables (Wickham and Henry 2023). This function takes a demographic variable as its input (byvar) and calculates the violent and aggravated assault victimization rates for each level. It then creates columns with the variable name, the level of each variable, and a numeric version of the level (LevelNum) for sorting later. The function is run across multiple variables using map(), and the results are stacked into a single output using bind_rows().

pers_est_by &lt;- function(byvar) {
  pers_des %&gt;%
    # Rename the grouping variable so the output has a common column name
    rename(Level := {{byvar}}) %&gt;%
    filter(!is.na(Level)) %&gt;%
    group_by(Level) %&gt;%
    summarize(
      Violent = survey_mean(Violent * ADJINC_WT * 1000, na.rm = TRUE),
      AAST = survey_mean(AAST * ADJINC_WT * 1000, na.rm = TRUE)
    ) %&gt;%
    mutate(
      Variable = byvar,
      LevelNum = as.numeric(Level),
      Level = as.character(Level)
    ) %&gt;%
    select(Variable, Level, LevelNum, everything())
}

pers_est_df &lt;- c(&quot;Sex&quot;, &quot;RaceHispOrigin&quot;, &quot;AgeGroup&quot;, &quot;MaritalStatus&quot;, &quot;Income&quot;) %&gt;%
  map(pers_est_by) %&gt;%
  bind_rows()

The output from all the estimates is cleaned to create better labels, such as going from “RaceHispOrigin” to “Race/Hispanic origin”. Finally, the {gt} package is used to make a publishable table (Table 13.8). Using the functions from the {gt} package, we add column labels and footnotes and present estimates rounded to the first decimal place (Iannone et al. 2023).
vr_gt&lt;-pers_est_df %&gt;% mutate( Variable = case_when( Variable == &quot;RaceHispOrigin&quot; ~ &quot;Race/Hispanic origin&quot;, Variable == &quot;MaritalStatus&quot; ~ &quot;Marital status&quot;, Variable == &quot;AgeGroup&quot; ~ &quot;Age&quot;, TRUE ~ Variable ) ) %&gt;% select(-LevelNum) %&gt;% group_by(Variable) %&gt;% gt(rowname_col = &quot;Level&quot;) %&gt;% tab_spanner( label = &quot;Violent crime&quot;, id = &quot;viol_span&quot;, columns = c(&quot;Violent&quot;, &quot;Violent_se&quot;) ) %&gt;% tab_spanner(label = &quot;Aggravated assault&quot;, columns = c(&quot;AAST&quot;, &quot;AAST_se&quot;)) %&gt;% cols_label( Violent = &quot;Rate&quot;, Violent_se = &quot;SE&quot;, AAST = &quot;Rate&quot;, AAST_se = &quot;SE&quot;, ) %&gt;% fmt_number( columns = c(&quot;Violent&quot;, &quot;Violent_se&quot;, &quot;AAST&quot;, &quot;AAST_se&quot;), decimals = 1 ) %&gt;% tab_footnote( footnote = &quot;Includes rape or sexual assault, robbery, aggravated assault, and simple assault.&quot;, locations = cells_column_spanners(spanners = &quot;viol_span&quot;) ) %&gt;% tab_footnote( footnote = &quot;Excludes persons of Hispanic origin&quot;, locations = cells_stub(rows = Level %in% c(&quot;White&quot;, &quot;Black&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;))) %&gt;% tab_footnote( footnote = &quot;Includes persons who identified as Native Hawaiian or Other Pacific Islander only.&quot;, locations = cells_stub(rows = Level == NHOPI) ) %&gt;% tab_footnote( footnote = &quot;Includes persons who identified as American Indian or Alaska Native only or as two or more races.&quot;, locations = cells_stub(rows = Level == &quot;Other&quot;) ) %&gt;% tab_source_note( source_note = &quot;Note: Rates per 1,000 persons age 12 or older.&quot;) %&gt;% tab_source_note(source_note = &quot;Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2021.&quot;) %&gt;% tab_stubhead(label = &quot;Victim demographic&quot;) %&gt;% tab_caption(&quot;Rate and standard 
error of violent victimization, by type of crime and demographic characteristics, 2021&quot;) vr_gt

TABLE 13.8: Rate and standard error of violent victimization, by type of crime and demographic characteristics, 2021

Victim demographic                                  Violent crime [1]     Aggravated assault
                                                    Rate      SE          Rate      SE
Sex
  Female                                            15.5      0.9         2.3       0.2
  Male                                              17.5      1.1         3.2       0.3
Race/Hispanic origin
  White [2]                                         16.1      0.9         2.7       0.3
  Black [2]                                         18.5      2.2         3.7       0.7
  Hispanic                                          15.9      1.7         2.3       0.4
  Asian [2]                                         8.6       1.3         1.9       0.6
  Native Hawaiian or Other Pacific Islander [2,3]   36.1      34.4        0.0       0.0
  Other [2,4]                                       45.4      13.0        6.2       2.0
Age
  12-17                                             13.2      2.2         2.5       0.8
  18-24                                             23.1      2.1         3.9       0.9
  25-34                                             22.0      2.1         4.0       0.6
  35-49                                             19.4      1.6         3.6       0.5
  50-64                                             16.9      1.9         2.0       0.3
  65 or older                                       6.4       1.1         1.1       0.3
Marital status
  Never married                                     22.2      1.4         4.0       0.4
  Married                                           9.5       0.9         1.5       0.2
  Widowed                                           10.7      3.5         0.9       0.2
  Divorced                                          27.4      2.9         4.0       0.7
  Separated                                         36.8      6.7         8.8       3.1
Income
  Less than $25,000                                 29.6      2.5         5.1       0.7
  $25,000-49,999                                    16.9      1.5         3.0       0.4
  $50,000-99,999                                    14.6      1.1         1.9       0.3
  $100,000-199,999                                  12.2      1.3         2.5       0.4
  $200,000 or more                                  9.7       1.4         1.7       0.6

Note: Rates per 1,000 persons age 12 or older.
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2021.
[1] Includes rape or sexual assault, robbery, aggravated assault, and simple assault.
[2] Excludes persons of Hispanic origin
[3] Includes persons who identified as Native Hawaiian or Other Pacific Islander only.
[4] Includes persons who identified as American Indian or Alaska Native only or as two or more races.
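The victimization rates in this section, including those in Table 13.8, all take the same form: a weighted mean of the weight-adjusted victimization counts, scaled to 1,000. As a sketch in notation that is ours rather than the NCVS documentation's, let $w_i$ be the design weight of person or household $i$, $a_i$ the adjusted incident weight (ADJINC_WT), and $y_i$ the number of victimizations of a given type (for example, Property or Violent). Setting missing values aside, the quantities returned by survey_total() and survey_mean() in the examples above can then be written as:

```latex
\begin{aligned}
\widehat{T} &= \sum_{i} w_i \, a_i \, y_i
  && \text{(estimated number of victimizations)} \\
\widehat{N} &= \sum_{i} w_i
  && \text{(estimated population size)} \\
\widehat{R} &= 1000 \times \frac{\widehat{T}}{\widehat{N}}
  && \text{(victimization rate per 1,000)}
\end{aligned}
```

Using the property victimization estimates shown earlier, $\widehat{T} \approx 11{,}682{,}056$ and $\widehat{N} \approx 129{,}319{,}232$, so $\widehat{R} \approx 1000 \times 0.0903 = 90.3$, which agrees with Property_Rate. For grouped estimates such as those in Table 13.8, the same sums run only over the records in each group.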
13.6.4 Estimation 4: Prevalence rates Prevalence rates differ from victimization rates in that the numerator is the number of people or households victimized rather than the number of victimizations. To calculate the prevalence rates, we must run another summary of the data by calculating an indicator for whether a person or household is a victim of a particular crime at any point in the year. Below is an example of calculating the indicator and then the prevalence rate of violent crime and aggravated assault. pers_prev_des &lt;- pers_vsum_slim %&gt;% mutate(Year = floor(YEARQ)) %&gt;% mutate(Violent_Ind = sum(Violent) &gt; 0, AAST_Ind = sum(AAST) &gt; 0, .by = c(&quot;Year&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% as_survey( weight = WGTPERCY, strata = V2117, ids = V2118, nest = TRUE ) pers_prev_ests &lt;- pers_prev_des %&gt;% summarize(Violent_Prev = survey_mean(Violent_Ind * 100), AAST_Prev = survey_mean(AAST_Ind * 100)) pers_prev_ests ## # A tibble: 1 × 4 ## Violent_Prev Violent_Prev_se AAST_Prev AAST_Prev_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 0.980 0.0349 0.215 0.0143 In the example above, the indicator is multiplied by 100 to return a percentage rather than a proportion. In 2021, we estimate that 0.98% of people aged 12 and older were victims of violent crime in the United States, and 0.22% were victims of aggravated assault. 13.7 Statistical testing For any of the types of estimates discussed, we can also perform statistical testing. For example, we could test whether property victimization rates differ between properties that are owned and those that are rented. First, we calculate the point estimates. 
prop_tenure &lt;- hh_des %&gt;% group_by(Tenure) %&gt;% summarize( Property_Rate = survey_mean(Property * ADJINC_WT * 1000, na.rm = TRUE, vartype=&quot;ci&quot;), ) prop_tenure ## # A tibble: 3 × 4 ## Tenure Property_Rate Property_Rate_low Property_Rate_upp ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Owned 68.2 64.3 72.1 ## 2 Rented 130. 123. 137. ## 3 &lt;NA&gt; NaN NaN NaN The property victimization rate for rented households is 129.8 per 1,000 households, while the property victimization rate for owned households is 68.2, which seem very different, especially given the non-overlapping confidence intervals. However, survey data are inherently non-independent, so statistical testing cannot be done by comparing confidence intervals. To conduct the statistical test, we first need to create a variable that incorporates the adjusted incident weight (ADJINC_WT), and then the test can be conducted on this adjusted variable as discussed in Chapter 6. prop_tenure_test &lt;- hh_des %&gt;% mutate( Prop_Adj=Property * ADJINC_WT * 1000 ) %&gt;% svyttest( formula = Prop_Adj ~ Tenure, design = ., na.rm = TRUE ) %&gt;% broom::tidy() prop_tenure_test %&gt;% mutate(p.value = pretty_p_value(p.value)) %&gt;% gt() %&gt;% fmt_number()

TABLE 13.9: T-test output for estimates of property victimization rates between properties that are owned versus rented, NCVS 2021

estimate   statistic   p.value      parameter   conf.low   conf.high   method                alternative
61.62      16.04       &lt;0.0001   169.00      54.03      69.21       Design-based t-test   two.sided

The output of the statistical test shown in Table 13.9 indicates a difference of 
61.6 between the property victimization rates of renters and owners, and the test is highly significant with a p-value of &lt;0.0001. 13.8 Exercises What proportion of completed motor vehicle thefts are not reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529). How many violent crimes occur in each region? What is the property victimization rate among each income level? What is the difference in the violent victimization rate between males and females? Is the difference statistically significant? References Bureau of Justice Statistics. 2017. “National Crime Victimization Survey, 2016: Technical Documentation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvstd16.pdf. Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. Shook-Sa, Bonnie, G. Lance Couzens, and Marcus Berzofsky. 2015. “Users’ Guide to the National Crime Victimization Survey (NCVS) Direct Variance Estimation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf; Bureau of Justice Statistics. United States. Bureau of Justice Statistics. 2022. “National Crime Victimization Survey, [United States], 2021.” https://www.icpsr.umich.edu/web/NACJD/studies/38429; Inter-university Consortium for Political and Social Research [distributor]. https://doi.org/10.3886/ICPSR38429.v1. Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools. 
BJS publishes victimization rates per 1,000, which are also presented in these examples↩︎ "],["c14-ambarom-vignette.html", "Chapter 14 AmericasBarometer Vignette 14.1 Introduction 14.2 Data structure 14.3 Preparing files 14.4 Survey design objects 14.5 Calculating estimates 14.6 Mapping survey data 14.7 Exercises", " Chapter 14 AmericasBarometer Vignette Prerequisites For this chapter, load the following packages: library(tidyverse) library(survey) library(srvyr) library(sf) library(rnaturalearth) library(rnaturalearthdata) library(gt) library(ggpattern) In this vignette, we use a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the LAPOP website. We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To read all files into R while ignoring the Stata labels, we recommend running the following code, which uses the read_stata() function from the {haven} package to import the data (Wickham, Miller, and Smith 2023): stata_files &lt;- list.files(here(&quot;RawData&quot;, &quot;LAPOP_2021&quot;), &quot;*.dta&quot;) read_stata_unlabeled &lt;- function(file) { read_stata(file) %&gt;% zap_labels() %&gt;% zap_label() } ambarom_in &lt;- here(&quot;RawData&quot;, &quot;LAPOP_2021&quot;, stata_files) %&gt;% map_df(read_stata_unlabeled) %&gt;% select(pais, strata, upm, weight1500, strata, core_a_core_b, q2, q1tb, covid2at, a4, idio2, idio2cov, it1, jc13, m1, mil10a, mil10e, ccch1, ccch3, ccus1, ccus3, edr, ocup4a, q14, q11n, q12c, q12bn, starts_with(&quot;covidedu1&quot;), gi0n, r15, r18n, r18) The code above reads all .dta files and combines them into one tibble. 14.1 Introduction The AmericasBarometer surveys, conducted by the LAPOP Lab (LAPOP 2023b), are public opinion surveys of the Americas focused on democracy. The study was launched in 2004/2005 with 11 countries. Though the set of participating countries has grown and fluctuated over time, AmericasBarometer maintains a consistent methodology across many countries. 
In 2021, the study included 22 countries ranging from Canada in the north to Chile and Argentina in the south (LAPOP 2023a). Historically, surveys were administered through in-person household interviews, but the COVID-19 pandemic changed the study significantly. Now, random-digit dialing (RDD) of mobile phones is used in all countries except the United States and Canada (LAPOP 2021c). In Canada, LAPOP collaborated with the Environics Institute to collect data from a panel of Canadians using a web survey (LAPOP 2021a). In the United States, YouGov conducted a web survey among its panelists on behalf of LAPOP (LAPOP 2021b). The survey includes a core set of questions for all countries, but not every question is asked in each country. Additionally, some questions are only posed to half of the respondents in a country, with different sections randomized to respondents (LAPOP 2021d). 14.2 Data structure Each country and year has its own file available in Stata format (.dta). In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the prerequisites box. We recommend reviewing the core questionnaire to understand the common variables across the countries (LAPOP 2021d). 14.3 Preparing files Many of the variables are coded as numeric and do not have intuitive variable names, so the next step is to create derived variables and wrangle the data for analysis. 
Using the core questionnaire as a codebook, we reference the factor descriptions to create derived variables with informative names: ambarom &lt;- ambarom_in %&gt;% mutate( Country = factor( case_match(pais, 1 ~ &quot;Mexico&quot;, 2 ~ &quot;Guatemala&quot;, 3 ~ &quot;El Salvador&quot;, 4 ~ &quot;Honduras&quot;, 5 ~ &quot;Nicaragua&quot;, 6 ~ &quot;Costa Rica&quot;, 7 ~ &quot;Panama&quot;, 8 ~ &quot;Colombia&quot;, 9 ~ &quot;Ecuador&quot;, 10 ~ &quot;Bolivia&quot;, 11 ~ &quot;Peru&quot;, 12 ~ &quot;Paraguay&quot;, 13 ~ &quot;Chile&quot;, 14 ~ &quot;Uruguay&quot;, 15 ~ &quot;Brazil&quot;, 17 ~ &quot;Argentina&quot;, 21 ~ &quot;Dominican Republic&quot;, 22 ~ &quot;Haiti&quot;, 23 ~ &quot;Jamaica&quot;, 24 ~ &quot;Guyana&quot;, 40 ~ &quot;United States&quot;, 41 ~ &quot;Canada&quot;)), CovidWorry = fct_reorder( case_match(covid2at, 1 ~ &quot;Very worried&quot;, 2 ~ &quot;Somewhat worried&quot;, 3 ~ &quot;A little worried&quot;, 4 ~ &quot;Not worried at all&quot;), covid2at, .na_rm = FALSE) ) %&gt;% rename(Educ_NotInSchool = covidedu1_1, Educ_NormalSchool = covidedu1_2, Educ_VirtualSchool = covidedu1_3, Educ_Hybrid = covidedu1_4, Educ_NoSchool = covidedu1_5, BroadbandInternet = r18n, Internet = r18) At this point, it is a good time to check the cross-tabs between the original and newly derived variables. These tables help us confirm that we have correctly matched the numeric data from the original dataset to the renamed factor data in the new dataset. For instance, let’s check the original variable pais and the derived variable Country. We can consult the questionnaire or codebook to confirm that Argentina is coded as 17, Bolivia as 10, etc. Similarly, for CovidWorry and covid2at, we can verify that Very worried is coded as 1, and so on for the other variables. 
ambarom %&gt;% count(Country, pais) %&gt;% print(n = 22) ## # A tibble: 22 × 3 ## Country pais n ## &lt;fct&gt; &lt;dbl&gt; &lt;int&gt; ## 1 Argentina 17 3011 ## 2 Bolivia 10 3002 ## 3 Brazil 15 3016 ## 4 Canada 41 2201 ## 5 Chile 13 2954 ## 6 Colombia 8 2993 ## 7 Costa Rica 6 2977 ## 8 Dominican Republic 21 3000 ## 9 Ecuador 9 3005 ## 10 El Salvador 3 3245 ## 11 Guatemala 2 3000 ## 12 Guyana 24 3011 ## 13 Haiti 22 3088 ## 14 Honduras 4 2999 ## 15 Jamaica 23 3121 ## 16 Mexico 1 2998 ## 17 Nicaragua 5 2997 ## 18 Panama 7 3183 ## 19 Paraguay 12 3004 ## 20 Peru 11 3038 ## 21 United States 40 1500 ## 22 Uruguay 14 3009 ambarom %&gt;% count(CovidWorry, covid2at) ## # A tibble: 5 × 3 ## CovidWorry covid2at n ## &lt;fct&gt; &lt;dbl&gt; &lt;int&gt; ## 1 Very worried 1 24327 ## 2 Somewhat worried 2 13233 ## 3 A little worried 3 11478 ## 4 Not worried at all 4 8628 ## 5 &lt;NA&gt; NA 6686 14.4 Survey design objects The technical report is the best reference for understanding how to specify the sampling design in R (LAPOP 2021c). The data include two weights: wt and weight1500. The first weight variable is specific to each country and sums to the sample size, but it is calibrated to reflect each country’s demographics. The second weight variable sums to 1500 for each country and is recommended for multi-country analyses. Although not explicitly stated in the documentation, the Stata syntax example (svyset upm [pw=weight1500], strata(strata)) indicates the variable upm is a clustering variable, and strata is the stratification variable. Therefore, the design object for multi-country analysis is created in R as follows: ambarom_des &lt;- ambarom %&gt;% as_survey_design(ids = upm, strata = strata, weight = weight1500) Note that these weight variables are suited to producing estimates for comparing countries rather than pooled multi-country estimates. The reason is that the weights do not account for the different sizes of countries. 
For example, Canada has about 10% of the population of the United States, but an estimate that uses records from both countries would weigh them equally.

14.5 Calculating estimates

When calculating estimates from the data, we use the survey design object ambarom_des and then apply the survey_mean() function. The next sections walk through a few examples.

14.5.1 Example: Worried about COVID

This survey was administered between March and August of 2021, with the specific timing varying by country.29 Given the state of the pandemic at that time, several questions about COVID were included. The first question about COVID asked:

How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?
  - Very worried
  - Somewhat worried
  - A little worried
  - Not worried at all

If we are interested in those who are very worried or somewhat worried, we can create a new variable (CovidWorry_bin) that groups levels of the original question using the fct_collapse() function from the {forcats} package (Wickham 2023a). We then use the survey_count() function to understand how responses are distributed across each category of the original variable (CovidWorry) and the new variable (CovidWorry_bin).

    covid_worry_collapse <- ambarom_des %>%
      mutate(CovidWorry_bin = fct_collapse(
        CovidWorry,
        WorriedHi = c("Very worried", "Somewhat worried"),
        WorriedLo = c("A little worried", "Not worried at all")
      ))

    covid_worry_collapse %>%
      survey_count(CovidWorry_bin, CovidWorry)

    ## # A tibble: 5 × 4
    ##   CovidWorry_bin CovidWorry              n  n_se
    ##   <fct>          <fct>               <dbl> <dbl>
    ## 1 WorriedHi      Very worried       12369.  83.6
    ## 2 WorriedHi      Somewhat worried    6378.  63.4
    ## 3 WorriedLo      A little worried    5896.  62.6
    ## 4 WorriedLo      Not worried at all  4840.  59.7
    ## 5 <NA>           <NA>                3518.  42.2

With this new variable, we can now use survey_mean() to calculate the percentage of people in each country who are either very or somewhat worried about COVID. There are missing data, as indicated in the survey_count() output above, so we need to use na.rm = TRUE in the survey_mean() function to exclude the missing values.

    covid_worry_country_ests <- covid_worry_collapse %>%
      group_by(Country) %>%
      summarize(p = survey_mean(CovidWorry_bin == "WorriedHi", na.rm = TRUE) * 100)

    covid_worry_country_ests

    ## # A tibble: 22 × 3
    ##    Country                p  p_se
    ##    <fct>              <dbl> <dbl>
    ##  1 Argentina           65.8 1.08
    ##  2 Bolivia             71.6 0.960
    ##  3 Brazil              83.5 0.962
    ##  4 Canada              48.9 1.34
    ##  5 Chile               81.8 0.828
    ##  6 Colombia            67.9 1.12
    ##  7 Costa Rica          72.6 0.952
    ##  8 Dominican Republic  50.1 1.13
    ##  9 Ecuador             71.7 0.967
    ## 10 El Salvador         52.5 1.02
    ## # ℹ 12 more rows

To view the results for all countries, we can use the {gt} package to create Table 14.1 (Iannone et al. 2023).

    covid_worry_country_ests_gt <- covid_worry_country_ests %>%
      gt(rowname_col = "Country") %>%
      cols_label(p = "Percent", p_se = "SE") %>%
      fmt_number(decimals = 1) %>%
      tab_source_note("AmericasBarometer Surveys, 2021")

    covid_worry_country_ests_gt

TABLE 14.1: Percentage worried about the possibility that they or someone in their household will get sick from coronavirus in the next 3 months

    Country              Percent    SE
    Argentina               65.8   1.1
    Bolivia                 71.6   1.0
    Brazil                  83.5   1.0
    Canada                  48.9   1.3
    Chile                   81.8   0.8
    Colombia                67.9   1.1
    Costa Rica              72.6   1.0
    Dominican Republic      50.1   1.1
    Ecuador                 71.7   1.0
    El Salvador             52.5   1.0
    Guatemala               69.3   1.0
    Guyana                  60.0   1.6
    Haiti                   54.4   1.8
    Honduras                64.6   1.1
    Jamaica                 28.4   0.9
    Mexico                  63.6   1.0
    Nicaragua               80.0   1.0
    Panama                  70.2   1.0
    Paraguay                61.5   1.1
    Peru                    77.1   2.5
    United States           46.6   1.7
    Uruguay                 60.9   1.1

    AmericasBarometer Surveys, 2021

14.5.2 Example: Education affected by COVID

Respondents were also asked a question about how the pandemic affected education. This question was asked of households with children under the age of 13, and respondents could select more than one option, as follows:

Did any of these children have their school education affected due to the pandemic?
  - No, because they are not yet school age or because they do not attend school for another reason
  - No, their classes continued normally
  - Yes, they went to virtual or remote classes
  - Yes, they switched to a combination of virtual and in-person classes
  - Yes, they cut all ties with the school

Working with multiple-choice questions can be both challenging and interesting. Let’s walk through how to analyze this question. If we are interested in the impact on education, we should focus on the data of those whose children are attending school. This means we need to exclude those who selected the first response option: “No, because they are not yet school age or because they do not attend school for another reason.” To do this, we use the Educ_NotInSchool variable in the dataset, which has values of 0 and 1. A value of 1 indicates that the respondent chose the first response option (none of the children are in school), and a value of 0 means that at least one of their children is in school. Filtering the data to those with a value of 0 restricts the analysis to respondents with at least one child attending school.
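
The filtering step described above can be sketched on a small made-up dataset. This is an illustration only: the data, the column names (psu, stratum, wt, not_in_school, virtual), and the design are hypothetical and are not taken from the AmericasBarometer file. The sketch applies filter() to a design object created with as_survey_design() from {srvyr}, rather than to the raw data frame, and then estimates the share of a 0/1 indicator with survey_mean().

```r
library(dplyr)
library(srvyr)

# Hypothetical toy data: 2 strata, 2 PSUs per stratum, equal weights
toy <- tibble(
  psu           = c(1, 1, 2, 2, 3, 3, 4, 4),
  stratum       = c(1, 1, 1, 1, 2, 2, 2, 2),
  wt            = rep(1, 8),
  not_in_school = c(0, 1, 0, 0, 1, 0, 0, 1), # 1 = no child in school
  virtual       = c(1, 0, 0, 1, 0, 1, 1, 0)  # 1 = switched to virtual classes
)

# Build the design on the full data first
toy_des <- toy %>%
  as_survey_design(ids = psu, strata = stratum, weights = wt)

# Filter the design object (not the data frame) to households
# with at least one child in school, then estimate a percentage
toy_est <- toy_des %>%
  filter(not_in_school == 0) %>%
  summarize(p_virtual = survey_mean(virtual) * 100)

toy_est
```

With equal weights, four of the five retained rows have virtual equal to 1, so the point estimate should be about 80%. Filtering after the design is created keeps the full design information available for variance estimation.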

Now, let’s review the data for those who selected one of the next three response options:
  - No, their classes continued normally: Educ_NormalSchool
  - Yes, they went to virtual or remote classes: Educ_VirtualSchool
  - Yes, they switched to a combination of virtual and in-person classes: Educ_Hybrid

The unweighted cross-tab for these responses is included below. It reveals a wide range of impacts, where many combinations of effects on education are possible.

    ambarom %>%
      filter(Educ_NotInSchool == 0) %>%
      count(Educ_NormalSchool, Educ_VirtualSchool, Educ_Hybrid)

    ## # A tibble: 8 × 4
    ##   Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid     n
    ##               <dbl>              <dbl>       <dbl> <int>
    ## 1                 0                  0           0   861
    ## 2                 0                  0           1  1192
    ## 3                 0                  1           0  7554
    ## 4                 0                  1           1   280
    ## 5                 1                  0           0   833
    ## 6                 1                  0           1    18
    ## 7                 1                  1           0    72
    ## 8                 1                  1           1     7

In reviewing the survey question, we might be interested in knowing the answers to the following:
  - What percentage of households indicated that school continued as normal with no virtual or hybrid option?
  - What percentage of households indicated that the education medium was changed to either virtual or hybrid?
  - What percentage of households indicated that they cut ties with their school?

To find the answers, we create indicators for the first two questions, make national estimates for all three questions, and then construct a summary table for easy viewing. First, we create and inspect the indicators and their distributions using survey_count().

    ambarom_des_educ <- ambarom_des %>%
      filter(Educ_NotInSchool == 0) %>%
      mutate(
        Educ_OnlyNormal = (Educ_NormalSchool == 1 &
          Educ_VirtualSchool == 0 &
          Educ_Hybrid == 0),
        Educ_MediumChange = (Educ_VirtualSchool == 1 |
          Educ_Hybrid == 1)
      )

    ambarom_des_educ %>%
      survey_count(
        Educ_OnlyNormal,
        Educ_NormalSchool,
        Educ_VirtualSchool,
        Educ_Hybrid
      )

    ## # A tibble: 8 × 6
    ##   Educ_OnlyNormal Educ_NormalSchool Educ_VirtualSchool Educ_Hybrid
    ##   <lgl>                       <dbl>              <dbl>       <dbl>
    ## 1 FALSE                           0                  0           0
    ## 2 FALSE                           0                  0           1
    ## 3 FALSE                           0                  1           0
    ## 4 FALSE                           0                  1           1
    ## 5 FALSE                           1                  0           1
    ## 6 FALSE                           1                  1           0
    ## 7 FALSE                           1                  1           1
    ## 8 TRUE                            1                  0           0
    ## # ℹ 2 more variables: n <dbl>, n_se <dbl>

    ambarom_des_educ %>%
      survey_count(Educ_MediumChange, Educ_VirtualSchool, Educ_Hybrid)

    ## # A tibble: 4 × 5
    ##   Educ_MediumChange Educ_VirtualSchool Educ_Hybrid     n  n_se
    ##   <lgl>                          <dbl>       <dbl> <dbl> <dbl>
    ## 1 FALSE                              0           0  880. 26.1
    ## 2 TRUE                               0           1  561. 19.2
    ## 3 TRUE                               1           0 3812. 49.4
    ## 4 TRUE                               1           1  136.  9.86

Next, we group the data by country and calculate the population estimates for our three questions.

    covid_educ_ests <- ambarom_des_educ %>%
      group_by(Country) %>%
      summarize(
        p_onlynormal = survey_mean(Educ_OnlyNormal, na.rm = TRUE) * 100,
        p_mediumchange = survey_mean(Educ_MediumChange, na.rm = TRUE) * 100,
        p_noschool = survey_mean(Educ_NoSchool, na.rm = TRUE) * 100
      )

    covid_educ_ests

    ## # A tibble: 16 × 7
    ##    Country p_onlynormal p_onlynormal_se p_mediumchange p_mediumchange_se
    ##    <fct>          <dbl>           <dbl>          <dbl>             <dbl>
    ##  1 Argent…        5.39            1.14           87.1              1.72
    ##  2 Brazil         4.28            1.17           81.5              2.33
    ##  3 Chile          0.715           0.267          96.2              0.962
    ##  4 Colomb…        2.84            0.727          90.3              1.40
    ##  5 Domini…        3.75            0.793          87.4              1.45
    ##  6 Ecuador        5.18            0.963          87.5              1.39
    ##  7 El Sal…        2.92            0.680          85.8              1.53
    ##  8 Guatem…        3.00            0.727          82.2              1.73
    ##  9 Guyana         3.34            0.702          85.3              1.67
    ## 10 Haiti         81.1             2.25            7.25             1.48
    ## 11 Hondur…        3.68            0.882          80.7              1.72
    ## 12 Jamaica        5.42            0.950          88.1              1.43
    ## 13 Panama         7.20            1.18           89.4              1.42
    ## 14 Paragu…        4.66            0.939          90.7              1.37
    ## 15 Peru           2.04            0.604          91.8              1.20
    ## 16 Uruguay        8.60            1.40           84.3              2.02
    ## # ℹ 2 more variables: p_noschool <dbl>, p_noschool_se <dbl>

Finally, to view the results for all countries, we can use the {gt} package to construct Table 14.2.

    covid_educ_ests_gt <- covid_educ_ests %>%
      gt(rowname_col = "Country") %>%
      cols_label(
        p_onlynormal = "%",
        p_onlynormal_se = "SE",
        p_mediumchange = "%",
        p_mediumchange_se = "SE",
        p_noschool = "%",
        p_noschool_se = "SE"
      ) %>%
      tab_spanner(
        label = "Normal school only",
        columns = c("p_onlynormal", "p_onlynormal_se")
      ) %>%
      tab_spanner(
        label = "Medium change",
        columns = c("p_mediumchange", "p_mediumchange_se")
      ) %>%
      tab_spanner(
        label = "Cut ties with school",
        columns = c("p_noschool", "p_noschool_se")
      ) %>%
      fmt_number(decimals = 1) %>%
      tab_source_note("AmericasBarometer Surveys, 2021")

    covid_educ_ests_gt

TABLE 14.2: Impact on education in households with children under the age of 13 who had children that would generally attend school

                          Normal school only    Medium change    Cut ties with school
    Country                    %      SE           %      SE           %      SE
    Argentina                5.4     1.1        87.1     1.7         9.9     1.6
    Brazil                   4.3     1.2        81.5     2.3        22.1     2.5
    Chile                    0.7     0.3        96.2     1.0         4.0     1.0
    Colombia                 2.8     0.7        90.3     1.4         7.5     1.3
    Dominican Republic       3.8     0.8        87.4     1.5        10.5     1.4
    Ecuador                  5.2     1.0        87.5     1.4         7.9     1.1
    El Salvador              2.9     0.7        85.8     1.5        11.8     1.4
    Guatemala                3.0     0.7        82.2     1.7        17.7     1.8
    Guyana                   3.3     0.7        85.3     1.7        13.0     1.6
    Haiti                   81.1     2.3         7.2     1.5        11.7     1.8
    Honduras                 3.7     0.9        80.7     1.7        16.9     1.6
    Jamaica                  5.4     0.9        88.1     1.4         7.5     1.2
    Panama                   7.2     1.2        89.4     1.4         3.8     0.9
    Paraguay                 4.7     0.9        90.7     1.4         6.4     1.2
    Peru                     2.0     0.6        91.8     1.2         6.8     1.1
    Uruguay                  8.6     1.4        84.3     2.0         8.0     1.6

    AmericasBarometer Surveys, 2021

In the countries that were asked this question, many households experienced a change in their child’s education medium. However, in Haiti, only 7.2% of households with children in school switched to virtual or hybrid learning.

14.6 Mapping survey data

While the table effectively presents the data, a map could also be insightful. To generate maps of the countries, we can use the package {rnaturalearth} and subset North and South America with the ne_countries() function (Massicotte and South 2023). The function returns an sf (simple features) object with many columns (Pebesma and Bivand 2023), most importantly sovereignt (sovereignty), geounit (country or territory), and geometry (the shape). As an example of the difference between sovereignty and country/territory, the United States, Puerto Rico, and the U.S. Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure 14.1 using geom_sf() from the {ggplot2} package, which plots sf objects (Wickham 2016).

    country_shape <- ne_countries(
      scale = "medium",
      returnclass = "sf",
      continent = c("North America", "South America")
    )

    country_shape %>%
      ggplot() +
      geom_sf()

FIGURE 14.1: Map of North and South America

The map in Figure 14.1 appears very wide because the Aleutian Islands in Alaska extend into the Eastern Hemisphere. Using st_crop() from the {sf} package, we can crop the shapes to include only the Western Hemisphere, which removes some of the trailing islands of Alaska.

    country_shape_crop <- country_shape %>%
      st_crop(c(
        xmin = -180,
        xmax = 0,
        ymin = -90,
        ymax = 90
      ))

Now that we have the necessary shapes, our next step is to match our survey data to the map. Countries can be named differently (e.g., “U.S.”, “U.S.A.”, “United States”). To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data.
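
As a small illustration of the naming problem, the sketch below compares two made-up vectors of country names with base R’s setdiff(). The vectors survey_names and map_names are hypothetical stand-ins for the country columns of the survey data and the map data; they are not the real lists.

```r
# Hypothetical name lists, for illustration only
survey_names <- c("Canada", "Mexico", "United States")
map_names <- c("Canada", "Mexico", "United States of America")

# Names used in the survey that have no exact match in the map
in_survey_not_map <- setdiff(survey_names, map_names)

# Names used in the map that have no exact match in the survey
in_map_not_survey <- setdiff(map_names, survey_names)

in_survey_not_map
in_map_not_survey
```

Each call lists the names on one side that lack an exact match on the other, which is the information needed to decide how to recode them before joining.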

To do this, we can use the anti_join() function to identify the countries in the survey data that aren’t in the map data. For example, as shown below, the United States is referred to as “United States” in the survey data but “United States of America” in the map data. Table 14.3 shows the countries in the survey data but not the map data, and Table 14.4 shows the countries in the map data but not the survey data.

    survey_country_list <- ambarom %>% distinct(Country)

    survey_country_list_gt <- survey_country_list %>%
      anti_join(country_shape_crop, by = c("Country" = "geounit")) %>%
      gt()

    survey_country_list_gt
border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #uqkclffrjq .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #uqkclffrjq .gt_left { text-align: left; } #uqkclffrjq .gt_center { text-align: center; } #uqkclffrjq .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #uqkclffrjq .gt_font_normal { font-weight: normal; } #uqkclffrjq .gt_font_bold { font-weight: bold; } #uqkclffrjq .gt_font_italic { font-style: italic; } #uqkclffrjq .gt_super { font-size: 65%; } #uqkclffrjq .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #uqkclffrjq .gt_asterisk { font-size: 100%; vertical-align: 0; } #uqkclffrjq .gt_indent_1 { text-indent: 5px; } #uqkclffrjq .gt_indent_2 { text-indent: 10px; } #uqkclffrjq .gt_indent_3 { text-indent: 15px; } #uqkclffrjq .gt_indent_4 { text-indent: 20px; } #uqkclffrjq .gt_indent_5 { text-indent: 25px; } TABLE 14.3: Countries in the survey data but not the map data Country United States map_country_list_gt&lt;-country_shape_crop %&gt;% as_tibble() %&gt;% select(geounit, sovereignt) %&gt;% anti_join(survey_country_list, by = c(&quot;geounit&quot; = &quot;Country&quot;)) %&gt;% arrange(geounit) %&gt;% gt() map_country_list_gt #xqossclppl table { font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji'; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } #xqossclppl thead, #xqossclppl tbody, #xqossclppl tfoot, #xqossclppl tr, #xqossclppl td, #xqossclppl th { border-style: none; } #xqossclppl p { margin: 0; padding: 0; } #xqossclppl .gt_table { display: table; border-collapse: collapse; line-height: normal; margin-left: auto; margin-right: auto; color: #333333; font-size: 16px; font-weight: normal; font-style: normal; 
background-color: #FFFFFF; width: auto; border-top-style: solid; border-top-width: 2px; border-top-color: #A8A8A8; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #A8A8A8; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; } #xqossclppl .gt_caption { padding-top: 4px; padding-bottom: 4px; } #xqossclppl .gt_title { color: #333333; font-size: 125%; font-weight: initial; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; border-bottom-color: #FFFFFF; border-bottom-width: 0; } #xqossclppl .gt_subtitle { color: #333333; font-size: 85%; font-weight: initial; padding-top: 3px; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; border-top-color: #FFFFFF; border-top-width: 0; } #xqossclppl .gt_heading { background-color: #FFFFFF; text-align: center; border-bottom-color: #FFFFFF; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; } #xqossclppl .gt_bottom_border { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_col_headings { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; } #xqossclppl .gt_col_heading { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 6px; padding-left: 5px; padding-right: 5px; overflow-x: hidden; } 
#xqossclppl .gt_column_spanner_outer { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; padding-top: 0; padding-bottom: 0; padding-left: 4px; padding-right: 4px; } #xqossclppl .gt_column_spanner_outer:first-child { padding-left: 0; } #xqossclppl .gt_column_spanner_outer:last-child { padding-right: 0; } #xqossclppl .gt_column_spanner { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 5px; overflow-x: hidden; display: inline-block; width: 100%; } #xqossclppl .gt_spanner_row { border-bottom-style: hidden; } #xqossclppl .gt_group_heading { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: middle; text-align: left; } #xqossclppl .gt_empty_group_heading { padding: 0.5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: middle; } #xqossclppl .gt_from_md > :first-child { margin-top: 0; } #xqossclppl .gt_from_md > :last-child { margin-bottom: 0; } #xqossclppl .gt_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; margin: 10px; border-top-style: solid; border-top-width: 1px; border-top-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; 
border-right-color: #D3D3D3; vertical-align: middle; overflow-x: hidden; } #xqossclppl .gt_stub { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_stub_row_group { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; vertical-align: top; } #xqossclppl .gt_row_group_first td { border-top-width: 2px; } #xqossclppl .gt_row_group_first th { border-top-width: 2px; } #xqossclppl .gt_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_first_summary_row { border-top-style: solid; border-top-color: #D3D3D3; } #xqossclppl .gt_first_summary_row.thick { border-top-width: 2px; } #xqossclppl .gt_last_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_grand_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_first_grand_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-top-style: double; border-top-width: 6px; border-top-color: #D3D3D3; } #xqossclppl .gt_last_grand_summary_row_top { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; border-bottom-style: double; border-bottom-width: 6px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_striped { background-color: rgba(128, 128, 128, 0.05); } #xqossclppl .gt_table_body { border-top-style: solid; 
border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } #xqossclppl .gt_footnotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #xqossclppl .gt_footnote { margin: 0px; font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; } #xqossclppl .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; } #xqossclppl .gt_left { text-align: left; } #xqossclppl .gt_center { text-align: center; } #xqossclppl .gt_right { text-align: right; font-variant-numeric: tabular-nums; } #xqossclppl .gt_font_normal { font-weight: normal; } #xqossclppl .gt_font_bold { font-weight: bold; } #xqossclppl .gt_font_italic { font-style: italic; } #xqossclppl .gt_super { font-size: 65%; } #xqossclppl .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } #xqossclppl .gt_asterisk { font-size: 100%; vertical-align: 0; } #xqossclppl .gt_indent_1 { text-indent: 5px; } #xqossclppl .gt_indent_2 { text-indent: 10px; } #xqossclppl .gt_indent_3 { text-indent: 15px; } #xqossclppl .gt_indent_4 { text-indent: 20px; } #xqossclppl .gt_indent_5 { text-indent: 25px; } TABLE 14.4: Countries in the map data but not the survey data geounit sovereignt Anguilla United Kingdom Antigua and Barbuda Antigua and Barbuda Aruba Netherlands Barbados Barbados Belize Belize Bermuda United 
Kingdom British Virgin Islands United Kingdom Cayman Islands United Kingdom Cuba Cuba Curaçao Netherlands Dominica Dominica Falkland Islands United Kingdom Greenland Denmark Grenada Grenada Montserrat United Kingdom Puerto Rico United States of America Saint Barthelemy France Saint Kitts and Nevis Saint Kitts and Nevis Saint Lucia Saint Lucia Saint Martin France Saint Pierre and Miquelon France Saint Vincent and the Grenadines Saint Vincent and the Grenadines Sint Maarten Netherlands Suriname Suriname The Bahamas The Bahamas Trinidad and Tobago Trinidad and Tobago Turks and Caicos Islands United Kingdom United States Virgin Islands United States of America United States of America United States of America Venezuela Venezuela There are several ways to fix the mismatched names for a successful join. The simplest solution is to rename the data in the shape object before merging. Since only one country name in the survey data differs from the map data, we rename the map data accordingly. country_shape_upd &lt;- country_shape_crop %&gt;% mutate(geounit = if_else(geounit == &quot;United States of America&quot;, &quot;United States&quot;, geounit)) Now that the country names match, we can merge the survey and map data and then plot the data. We begin with the map file and merge it with the survey estimates generated in Section 14.5 (covid_worry_country_ests and covid_educ_ests.) We use the {sf} function of full_join(), which joins the rows in the map data and the survey estimates based on the columns geounit and Country. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an NA for the missing value (Pebesma and Bivand 2023). 
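The name check, the rename, and the full join can be tried in isolation on toy data. The following is a minimal sketch under stated assumptions: survey_toy and map_toy are made-up tibbles (not the chapter's objects), the values of p are invented, and plain tibbles stand in for the {sf} shape object.

```r
library(dplyr)

# Hypothetical stand-ins for the survey estimates and the map attribute table
survey_toy <- tibble(
  Country = c('United States', 'Mexico', 'Haiti'),
  p = c(0.50, 0.60, 0.70)
)
map_toy <- tibble(
  geounit = c('United States of America', 'Mexico', 'Haiti', 'Cuba')
)

# Survey countries with no matching name in the map data
survey_only <- anti_join(survey_toy, map_toy, by = c('Country' = 'geounit'))

# Harmonize the one mismatched name, then keep every row from both tables
map_toy_upd <- map_toy %>%
  mutate(geounit = if_else(geounit == 'United States of America',
    'United States', geounit
  ))
joined_toy <- full_join(map_toy_upd, survey_toy, by = c('geounit' = 'Country'))

survey_only
joined_toy
```

In this toy version, the map-only row (Cuba) is kept by the full join with a missing value for p, which is the situation the cross-hatching in the maps is meant to flag.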
    covid_sf &lt;- country_shape_upd %&gt;%
      full_join(covid_worry_country_ests,
        by = c(&quot;geounit&quot; = &quot;Country&quot;)
      ) %&gt;%
      full_join(covid_educ_ests,
        by = c(&quot;geounit&quot; = &quot;Country&quot;)
      )

After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID (Figure 14.2) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure 14.3). We also add a cross-hatching pattern to the countries without any data using the geom_sf_pattern() function from the {ggpattern} package (FC, Davis, and ggplot2 authors 2022).

    ggplot() +
      geom_sf(
        data = covid_sf,
        aes(fill = p, geometry = geometry),
        color = &quot;darkgray&quot;
      ) +
      scale_fill_gradientn(
        guide = &quot;colorbar&quot;,
        name = &quot;Percent&quot;,
        labels = scales::comma,
        colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;),
        na.value = NA
      ) +
      geom_sf_pattern(
        data = filter(covid_sf, is.na(p)),
        pattern = &quot;crosshatch&quot;,
        pattern_fill = &quot;lightgray&quot;,
        pattern_color = &quot;lightgray&quot;,
        fill = NA,
        color = &quot;darkgray&quot;
      ) +
      theme_minimal()

FIGURE 14.2: Percent of households worried someone in their household will get COVID-19 in the next 3 months by country

    ggplot() +
      geom_sf(
        data = covid_sf,
        aes(fill = p_mediumchange, geometry = geometry),
        color = &quot;darkgray&quot;
      ) +
      scale_fill_gradientn(
        guide = &quot;colorbar&quot;,
        name = &quot;Percent&quot;,
        labels = scales::comma,
        colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;),
        na.value = NA
      ) +
      geom_sf_pattern(
        data = filter(covid_sf, is.na(p_mediumchange)),
        pattern = &quot;crosshatch&quot;,
        pattern_fill = &quot;lightgray&quot;,
        pattern_color = &quot;lightgray&quot;,
        fill = NA,
        color = &quot;darkgray&quot;
      ) +
      theme_minimal()

FIGURE 14.3: Percent of households who had at least one child participate in virtual or hybrid learning

In Figure 14.3, we observe missing data (represented by
the crosshatch pattern) for Canada, Mexico, and the United States. The questionnaires indicate that these three countries did not include the education question in the survey. To focus on countries with available data, we can remove North America from the map and show only Central and South America. We do this below by restricting the shape files to Latin America and the Caribbean, as depicted in Figure 14.4.

    covid_c_s &lt;- covid_sf %&gt;%
      filter(region_wb == &quot;Latin America &amp; Caribbean&quot;)

    ggplot() +
      geom_sf(
        data = covid_c_s,
        aes(fill = p_mediumchange, geometry = geometry),
        color = &quot;darkgray&quot;
      ) +
      scale_fill_gradientn(
        guide = &quot;colorbar&quot;,
        name = &quot;Percent&quot;,
        labels = scales::comma,
        colors = c(&quot;#BFD7EA&quot;, &quot;#087e8b&quot;, &quot;#0B3954&quot;),
        na.value = NA
      ) +
      geom_sf_pattern(
        data = filter(covid_c_s, is.na(p_mediumchange)),
        pattern = &quot;crosshatch&quot;,
        pattern_fill = &quot;lightgray&quot;,
        pattern_color = &quot;lightgray&quot;,
        fill = NA,
        color = &quot;darkgray&quot;
      ) +
      theme_minimal()

FIGURE 14.4: Percent of households who had at least one child participate in virtual or hybrid learning, Central and South America

In Figure 14.4, we can see that most countries with available data have similar percentages (reflected in their similar shades). However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning.

14.7 Exercises

1. Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: if there are countries with 0% internet usage, try filtering by something first.
2. Create a faceted map showing both broadband internet and any internet usage.

References

FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. ggpattern: ’ggplot2’ Pattern Geoms.
Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables.
LAPOP. 2021a. “AmericasBarometer 2021 - Canada: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf.
———. 2021b. “AmericasBarometer 2021 - U.S.: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABUSA2021-Technical-Report-v1.0-FINAL-eng-110921.pdf.
———. 2021c. “AmericasBarometer 2021: Technical Information.” Vanderbilt University; https://www.vanderbilt.edu/lapop/ab2021/AB2021-Technical-Report-v1.0-FINAL-eng-030722.pdf.
———. 2021d. “Core Questionnaire.” https://www.vanderbilt.edu/lapop/ab2021/AB2021-Core-Questionnaire-v17.5-Eng-210514-W-v2.pdf.
———. 2023a. “About the AmericasBarometer.” https://www.vanderbilt.edu/lapop/about-americasbarometer.php.
———. 2023b. “The AmericasBarometer by the LAPOP Lab.” www.vanderbilt.edu/lapop.
Massicotte, Philippe, and Andy South. 2023. rnaturalearth: World Map Data from Natural Earth. https://docs.ropensci.org/rnaturalearth/.
Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2023a. forcats: Tools for Working with Categorical Variables (Factors).
Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files.
See Table 2 in LAPOP (2021c) for dates by country. "],["importing-survey-data-into-r.html", "A Importing survey data into R A.1 Importing delimiter-separated files into R A.2 Loading Excel files into R A.3 Importing Stata, SAS, and SPSS files into R A.4 Importing data from APIs into R A.5 Accessing databases in R A.6 Importing data from other formats", " A Importing survey data into R

To analyze a survey, we need to import the survey data into R. This process is often referred to as importing, loading, or reading in data. Survey files come in different formats depending on the software used to create them. One of the many advantages of R is the flexibility in handling various data formats, regardless of their file extensions. Here are examples of common public-use survey file formats we may encounter:

- Delimiter-separated text files
- Excel spreadsheets in .xls or .xlsx format
- R native .rda files
- Stata datasets in .dta format
- SAS datasets in .sas7bdat format
- SPSS datasets in .sav format
- Application Programming Interfaces (APIs), often in JSON format
- Data stored in databases

This appendix guides analysts through the process of importing these various types of survey data into R.

A.1 Importing delimiter-separated files into R

Delimiter-separated files use specific characters, known as delimiters, to separate values within the file. For example, CSV (Comma-Separated Values) files use commas as delimiters, while TSV (Tab-Separated Values) files use tabs. These file formats are widely used because of their simplicity and compatibility with various software applications.

The {readr} package, part of the tidyverse ecosystem, offers efficient ways to import delimiter-separated files into R (Wickham, Hester, and Bryan 2023). It provides several advantages, including automatic data type detection and flexible handling of missing values, depending on one’s survey research needs.
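To illustrate type detection and missing-value handling without needing a file on disk, here is a minimal sketch that reads literal text wrapped in I(). It assumes {readr} version 2.0.0 or later (for literal data via I()); the column names and the -9 missing-value code are hypothetical and not taken from any survey discussed here.

```r
library(readr)

# Hypothetical survey extract as literal text; I() marks it as data, not a path
csv_text <- I('id,age,vote\n1,34,yes\n2,-9,no\n3,51,\n')

toy_csv <- read_csv(
  csv_text,
  # State the column types explicitly instead of relying on guessing
  col_types = cols(
    id = col_character(),
    age = col_integer(),
    vote = col_character()
  ),
  # Treat blanks, 'NA', and the (assumed) code -9 as missing
  na = c('', 'NA', '-9')
)

# The same idea for tab-separated text, using read_delim() with an explicit delimiter
toy_tsv <- read_delim(I('id\tage\n1\t34\n2\t51\n'), delim = '\t', show_col_types = FALSE)

toy_csv
toy_tsv
```

Declaring missing-value codes at import time keeps them from being read as valid numbers, which matters for survey files that use negative codes for refusals or skips.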
The {readr} package includes functions for:

- read_csv(): This function is specifically designed to read CSV files.
- read_tsv(): Use this function for Tab-Separated Values (TSV) files.
- read_delim(): This function can handle a broader range of delimiter-separated files, including CSV and TSV. Specify the delimiter using the delim argument.
- read_fwf(): This function is useful for importing Fixed-Width Files, where columns have predetermined widths, and values are aligned in specific positions.
- read_table(): Use this function when dealing with whitespace-separated files, such as those with one or more spaces as delimiters.
- read_log(): This function can read and parse web log files.

The syntax for read_csv() is:

    read_csv(
      file,
      col_names = TRUE,
      col_types = NULL,
      col_select = NULL,
      id = NULL,
      locale = default_locale(),
      na = c(&quot;&quot;, &quot;NA&quot;),
      comment = &quot;&quot;,
      trim_ws = TRUE,
      skip = 0,
      n_max = Inf,
      guess_max = min(1000, n_max),
      name_repair = &quot;unique&quot;,
      num_threads = readr_threads(),
      progress = show_progress(),
      show_col_types = should_show_types(),
      skip_empty_rows = TRUE,
      lazy = should_read_lazy()
    )

The arguments are:

- file: the path to the CSV file to import
- col_names: a value of TRUE uses the first row of the file as column names and excludes it from the data frame. A value of FALSE creates automated column names. Alternatively, we can provide a vector of column names.
- col_types: by default, R infers the column variable types. We can also provide a column specification using list() or cols(); for example, use col_types = cols(.default = \"c\") to read all the columns as characters. Alternatively, we can use a string to specify the variable types for each column.
- col_select: the columns to include in the results
- id: a column for storing the file path. This is useful for keeping track of the input file when importing multiple CSVs at a time.
- locale: the location-specific defaults for the file
- na: a character vector of values to interpret as missing
- comment: a character vector of values to interpret as comments
- trim_ws: a value of TRUE trims leading and trailing white space
- skip: number of lines to skip before importing the data
- n_max: maximum number of lines to read
- guess_max: maximum number of lines used for guessing column types
- name_repair: how to check and repair column names. By default, the column names are made unique.
- num_threads: the number of processing threads to use for initial parsing and lazy reading of data
- progress: a value of TRUE displays a progress bar
- show_col_types: a value of TRUE displays the column types
- skip_empty_rows: a value of TRUE ignores blank rows
- lazy: a value of TRUE reads values lazily

The other functions share a similar syntax to read_csv(). To find more details, run ? followed by the function name. For example, with {readr} loaded, run ?read_delim in the Console for additional information.

In the example below, we use {readr} to load a CSV file named ‘anes_timeseries_2020_csv_20220210.csv’ into an R object called anes_csv. The read_csv() function imports the file and stores the data in the anes_csv object. We can then use this object for further analysis.

    library(readr)

    anes_csv &lt;-
      read_csv(&quot;data/anes_timeseries_2020_csv_20220210.csv&quot;)

A.2 Loading Excel files into R

Excel, a widely used spreadsheet software program created by Microsoft, is a common file format in survey research. We can load Excel spreadsheets into the R environment using the {readxl} package. The package supports both the legacy .xls files and the modern .xlsx format.

To load Excel data into R, we can use the read_excel() function from the {readxl} package. This function offers a range of options for the import process.
Let’s explore the syntax:

    read_excel(
      path,
      sheet = NULL,
      range = NULL,
      col_names = TRUE,
      col_types = NULL,
      na = &quot;&quot;,
      trim_ws = TRUE,
      skip = 0,
      n_max = Inf,
      guess_max = min(1000, n_max),
      progress = readxl_progress(),
      .name_repair = &quot;unique&quot;
    )

The arguments are:

- path: the path to the Excel file to import
- sheet: the name or index of the sheet (sometimes called a tab) within the Excel file
- range: the range of cells to import (for example, “P15:T87”)
- col_names: indicates whether the first row of the dataset contains column names
- col_types: specifies the data types of columns
- na: defines the strings to interpret as missing values (for example, “NA”)
- trim_ws: controls whether leading and trailing whitespace should be trimmed
- skip and n_max: enable skipping rows and limiting the number of rows imported
- guess_max: sets the maximum number of rows used for data type guessing
- progress: specifies a progress bar for large imports
- .name_repair: determines how column names are repaired if they are not valid

In the code example below, we load an Excel spreadsheet named ‘anes_timeseries_2020_csv_20220210.xlsx’ into R. The resulting data is saved as a tibble in the anes_excel object, ready for further analysis.

    library(readxl)

    anes_excel &lt;-
      read_excel(path = &quot;data/anes_timeseries_2020_csv_20220210.xlsx&quot;)

A.3 Importing Stata, SAS, and SPSS files into R

The {haven} package, also from the tidyverse ecosystem, imports various proprietary data formats: Stata .dta files, SPSS .sav files, and SAS .sas7bdat and .sas7bcat files (Wickham, Miller, and Smith 2023). One of the notable strengths of the {haven} package is its ability to handle multiple proprietary formats within a unified framework. It offers dedicated functions for each supported proprietary format, making it straightforward to import data regardless of the program. Here, we introduce read_dta() for Stata files, read_sav() for SPSS files, and read_sas() for SAS files.
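One way to try these functions without downloading a survey file is a round trip through a temporary file. The sketch below is illustrative only: it assumes {haven}'s write_dta() to create a small Stata file, and the toy data frame and its column names are invented.

```r
library(haven)

# Hypothetical three-row dataset written to a temporary Stata file
toy <- data.frame(
  id = 1:3,
  mode = c(3, 3, 2),
  weight = c(1.01, 1.16, 0.77)
)
dta_path <- tempfile(fileext = '.dta')
write_dta(toy, dta_path)

# Read back only two columns and the first two rows
toy_dta <- read_dta(dta_path, col_select = c(id, mode), n_max = 2)
toy_dta
```

Limiting columns and rows at import time with col_select and n_max can be a convenient way to inspect a large public-use file before reading all of it.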
A.3.1 Syntax

Let’s explore the syntax for importing Stata .dta files using haven::read_dta():

    read_dta(
      file,
      encoding = NULL,
      col_select = NULL,
      skip = 0,
      n_max = Inf,
      .name_repair = &quot;unique&quot;
    )

The arguments are:

- file: the path to the proprietary data file to import
- encoding: specifies the character encoding of the data file
- col_select: selects specific columns for import
- skip and n_max: control the number of rows skipped and the maximum number of rows imported
- .name_repair: determines how column names are repaired if they are not valid

The syntax for read_sav() is similar to read_dta():

    read_sav(
      file,
      encoding = NULL,
      user_na = FALSE,
      col_select = NULL,
      skip = 0,
      n_max = Inf,
      .name_repair = &quot;unique&quot;
    )

The arguments are:

- file: the path to the proprietary data file to import
- encoding: specifies the character encoding of the data file
- col_select: selects specific columns for import
- user_na: a value of TRUE reads variables with user-defined missing labels into labelled_spss() objects
- skip and n_max: control the number of rows skipped and the maximum number of rows imported
- .name_repair: determines how column names are repaired if they are not valid

The syntax for importing SAS files with read_sas() is as follows:

    read_sas(
      data_file,
      catalog_file = NULL,
      encoding = NULL,
      catalog_encoding = encoding,
      col_select = NULL,
      skip = 0L,
      n_max = Inf,
      .name_repair = &quot;unique&quot;
    )

The arguments are:

- data_file: the path to the proprietary data file to import
- catalog_file: the path to the catalog file to import
- encoding: specifies the character encoding of the data file
- catalog_encoding: specifies the character encoding of the catalog file
- col_select: selects specific columns for import
- skip and n_max: control the number of rows skipped and the maximum number of rows imported
- .name_repair: determines how column names are repaired if they are not valid

In the code examples below, we demonstrate how to load Stata, SPSS, and SAS files into R using the
respective {haven} functions. The resulting data are stored in anes_dta, anes_sav, and anes_sas objects as tibbles, ready for use in R. For the Stata example, we show how to load the data from the {srvyrexploR} package; we use this data in examples later in this appendix.

Stata:

    library(haven)

    anes_dta &lt;-
      read_dta(system.file(&quot;extdata&quot;,
        &quot;anes_2020_stata_example.dta&quot;,
        package = &quot;srvyrexploR&quot;
      ))

SPSS:

    library(haven)

    anes_sav &lt;-
      read_sav(file = &quot;data/anes_timeseries_2020_spss_20220210.sav&quot;)

SAS:

    library(haven)

    anes_sas &lt;-
      read_sas(
        file = &quot;data/anes_timeseries_2020_sas_20220210.sas7bdat&quot;
      )

A.3.2 Working with labeled data

Stata, SPSS, and SAS files often contain labeled variables and values. These labels provide descriptive information about categorical data, making it easier to understand and analyze. When importing data from Stata, SPSS, or SAS, preserving these labels is essential for maintaining data fidelity.

Consider a variable like ‘Education Level’ with coded values (e.g., 1, 2, 3). Without labels, these codes can be cryptic. However, with labels (‘High School Graduate,’ ‘Bachelor’s Degree,’ ‘Master’s Degree’), the data become more informative and easier to work with.

With the {haven} package, we have the capability to import and work with labeled data from Stata, SPSS, and SAS files. The package uses a special class of data called haven_labelled to store labeled variables. When a dataset label is defined in Stata, it is stored in the ‘label’ attribute of the tibble when imported, ensuring that the information is not lost.

We can use functions like select(), glimpse(), and is.labelled() to inspect the imported data and verify if the variables are labeled. Take a look at the ANES Stata file. Notice that categorical variables are marked with a type of &lt;dbl+lbl&gt;. This notation indicates that these variables are labeled.
library(dplyr) anes_dta %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;dbl+lbl&gt; 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;dbl+lbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1… We can confirm this label status using the haven::is.labelled() function. haven::is.labelled(anes_dta$V200002) ## [1] TRUE To explore the labels further, we can use the attributes() function. This function provides insights into both the variable labels ($label) and the associated value labels ($labels). attributes(anes_dta$V200002) ## $label ## [1] &quot;Mode of interview: pre-election interview&quot; ## ## $format.stata ## [1] &quot;%10.0g&quot; ## ## $class ## [1] &quot;haven_labelled&quot; &quot;vctrs_vctr&quot; &quot;double&quot; ## ## $labels ## 1. Video 2. Telephone 3. Web ## 1 2 3 When we import a labeled dataset using {haven}, it results in a tibble containing both the data and label information. However, this is meant to be an intermediary data structure and is not intended to be the final data format for analysis. Instead, we should convert it into a regular R data frame before continuing our data workflow. There are two primary methods to achieve this conversion: (1) convert to factors or (2) remove the labels. Option 1: Convert the vector into a factor Factors are native R data types for working with categorical data. They consist of integer values that correspond to character values, known as levels. Below is a dummy example of factors. The factors show the four different levels in the data: strongly agree, agree, disagree, and strongly disagree. 
response &lt;- c(&quot;strongly agree&quot;, &quot;agree&quot;, &quot;agree&quot;, &quot;disagree&quot;) response_levels &lt;- c(&quot;strongly agree&quot;, &quot;agree&quot;, &quot;disagree&quot;, &quot;strongly disagree&quot;) factors &lt;- factor(response, levels = response_levels) factors ## [1] strongly agree agree agree disagree ## Levels: strongly agree agree disagree strongly disagree Factors are integer vectors, though they may look like character strings. We can confirm by looking at the vector’s structure: glimpse(factors) ## Factor w/ 4 levels &quot;strongly agree&quot;,..: 1 2 2 3 R’s factors differ from Stata, SPSS, or SAS’ labeled vectors. However, we can convert labeled variables into factors using the as_factor() function. anes_dta %&gt;% transmute(V200002 = as_factor(V200002)) ## # A tibble: 7,453 × 1 ## V200002 ## &lt;fct&gt; ## 1 3. Web ## 2 3. Web ## 3 3. Web ## 4 3. Web ## 5 3. Web ## 6 3. Web ## 7 3. Web ## 8 3. Web ## 9 3. Web ## 10 3. Web ## # ℹ 7,443 more rows The as_factor() function can be applied to all columns in a data frame or individual ones. Below, we convert all &lt;dbl+lbl&gt; columns into factors. anes_dta_factor &lt;- anes_dta %&gt;% as_factor() anes_dta_factor %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;fct&gt; 3. Web, 3. Web, 3. Web, 3. Web, 3. Web, 3. Web, 3. We… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;fct&gt; 2. Somewhat interested, 3. Not much interested, 2. So… Option 2: Strip the labels The second option is to remove the labels altogether, converting the labeled data into a regular R data frame. 
To remove, or ‘zap’, the labels from our tibble, we can use the {haven} package’s zap_label() and zap_labels() functions. This approach removes the labels but retains the data values in their original form. The ANES Stata file columns contain variable labels. Using the function map() from {purrr}, we can review the labels using attr(). In the example below, we list the first two variables and their labels. For instance, the label for V200002 is “Mode of interview: pre-election interview”. purrr::map(anes_dta, ~attr(.x, &quot;label&quot;)) %&gt;% head(2) ## $V200001 ## [1] &quot;2020 Case ID&quot; ## ## $V200002 ## [1] &quot;Mode of interview: pre-election interview&quot; Use zap_label() to remove the variable labels but retain the value labels. Notice that the variable label for V200001 now returns as NULL. For V200002, the variable label is also gone, but attr() matches attribute names partially by default, so it returns the remaining value labels (the labels attribute) instead. zap_label(anes_dta) %&gt;% purrr::map(~attr(.x, &quot;label&quot;)) %&gt;% head(2) ## $V200001 ## NULL ## ## $V200002 ## 1. Video 2. Telephone 3. Web ## 1 2 3 To remove the value labels, use zap_labels(). Notice the previous &lt;dbl+lbl&gt; columns are now &lt;dbl&gt;. zap_labels(anes_dta) %&gt;% select(1:6) %&gt;% glimpse() ## Rows: 7,453 ## Columns: 6 ## $ V200001 &lt;dbl&gt; 200015, 200022, 200039, 200046, 200053, 200060, 20008… ## $ V200002 &lt;dbl&gt; 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,… ## $ V200010b &lt;dbl&gt; 1.0057, 1.1635, 0.7687, 0.5210, 0.9658, 0.2347, 0.440… ## $ V200010d &lt;dbl&gt; 9, 26, 41, 29, 23, 37, 7, 37, 32, 41, 22, 7, 38, 21, … ## $ V200010c &lt;dbl&gt; 2, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1,… ## $ V201006 &lt;dbl&gt; 2, 3, 2, 3, 2, 1, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1,… While it is important to convert labeled datasets into regular R data frames for working in R, the labels themselves often contain valuable information that provides context and meaning to the survey variables. To aid with interpretability and documentation, consider creating a data dictionary from the labeled dataset. 
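Before turning to data dictionaries, the two conversion options above can be compared side by side on a small example. The sketch below is only an illustration: mode_lbl is a made-up vector built with haven::labelled() (it is not part of the ANES file), and the object names are hypothetical.

```r
library(haven)

# Hypothetical toy vector (not from ANES): codes 1-3 with value labels
# and a variable label, mimicking an imported <dbl+lbl> column
mode_lbl <- labelled(
  c(3, 1, 2, 3),
  labels = c("1. Video" = 1, "2. Telephone" = 2, "3. Web" = 3),
  label = "Mode of interview"
)

# Option 1: convert to a factor; the levels come from the value labels
mode_fct <- as_factor(mode_lbl)

# Option 2: strip the value labels, leaving a plain double vector
mode_num <- zap_labels(mode_lbl)

# zap_label() removes only the variable label; the value labels remain
mode_nolabel <- zap_label(mode_lbl)

levels(mode_fct)
is.labelled(mode_num)
# exact = TRUE avoids attr()'s default partial matching on "labels"
attr(mode_nolabel, "label", exact = TRUE)
```

Here exact = TRUE is a deliberate choice: without it, attr(x, "label") can fall back to the labels attribute once the variable label has been removed.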
A data dictionary is a reference document that provides detailed information about the variables and values of a survey. The {labelled} package offers a convenient function, generate_dictionary(), that creates data dictionaries directly from a labeled dataset (Larmarange 2023). This function extracts variable labels, value labels, and other metadata and organizes them into a structured document that we can browse and reference throughout our analysis. Let’s create a data dictionary from the ANES Stata dataset as an example: library(labelled) dictionary &lt;- generate_dictionary(anes_dta) Once we’ve generated the data dictionary, we can take a look at the V200002 variable and see the label, column type, number of missing entries, and associated values. dictionary %&gt;% filter(variable == &quot;V200002&quot;) ## pos variable label col_type missing ## 2 V200002 Mode of interview: pre-electi~ dbl+lbl 0 ## ## ## values ## [1] 1. Video ## [2] 2. Telephone ## [3] 3. Web A.3.3 Labeled missing data values In survey data analysis, dealing with missing values is a crucial aspect of data preparation. Stata, SPSS, and SAS files each have their own methods for handling missing values. Stata has “extended” missing values, .A through .Z. SAS has “special” missing values, .A through .Z and ._. SPSS has per-column “user” missing values. Each column can declare up to three distinct values or a range of values (plus one distinct value) that should be treated as missing. SAS and Stata use a concept known as ‘tagged’ missing values, which extend R’s regular NA. A ‘tagged’ missing value is essentially an NA with an additional single-character label. These values behave identically to regular NA in standard R operations while preserving the informative tag associated with the missing value. Here is an example from the 2018 General Social Survey, conducted by NORC at the University of Chicago. 
head(gss_dta$HEALTH) #&gt; &lt;labelled&lt;double&gt;[6]&gt;: condition of health #&gt; [1] 2 1 NA(i) NA(i) 1 2 #&gt; #&gt; Labels: #&gt; value label #&gt; 1 excellent #&gt; 2 good #&gt; 3 fair #&gt; 4 poor #&gt; NA(d) DK #&gt; NA(i) IAP #&gt; NA(n) NA In contrast, SPSS uses a different approach called ‘user-defined values’ to denote missing values. Each column in an SPSS dataset can have up to three distinct values designated as missing or a specified range of missing values. To model these additional user-defined missing values, {haven} provides the labelled_spss() subclass of labelled(). When we import SPSS data using {haven}, user-defined missing values are handled correctly. We can work with these data in R while preserving the unique missing value conventions from SPSS. Here is what the GSS SPSS data look like when loaded with {haven}. head(gss_sps$HEALTH) #&gt; &lt;labelled_spss&lt;double&gt;[6]&gt;: Condition of health #&gt; [1] 2 1 0 0 1 2 #&gt; Missing values: 0, 8, 9 #&gt; #&gt; Labels: #&gt; value label #&gt; 0 IAP #&gt; 1 EXCELLENT #&gt; 2 GOOD #&gt; 3 FAIR #&gt; 4 POOR #&gt; 8 DK #&gt; 9 NA A.4 Importing data from APIs into R In addition to working with data saved as files, we may also need to retrieve data through Application Programming Interfaces (APIs). APIs provide a structured way to access data hosted on external servers and import it directly into R for analysis. To access these data, we need to understand how to construct API requests. Each API has unique endpoints, parameters, and authentication requirements. Pay attention to: Endpoints: These are URLs that point to specific data or services Parameters: Information passed to the API to customize the request (e.g., date ranges, filters) Authentication: APIs may require API keys or tokens for access Rate Limits: APIs may have usage limits, so be aware of any rate limits or quotas Typically, we begin by making a GET request to an API endpoint. 
The {httr2} package allows us to generate and process HTTP requests (Wickham 2023b). We can make the GET request by creating a request for the URL that contains the data we would like with request() and sending it with req_perform(). {httr2} uses the GET method by default when the request has no body. library(httr2) api_url &lt;- &quot;https://api.example.com/survey-data&quot; response &lt;- req_perform(request(api_url)) Once we make the request, we obtain the data as the response. The data often come in JSON format. We can extract and parse the data using the {jsonlite} package, allowing us to work with it in R (Ooms 2014). The fromJSON() function, shown below, converts JSON data to an R object. Here, resp_body_string() from {httr2} extracts the response body as text. library(jsonlite) survey_data &lt;- fromJSON(resp_body_string(response)) Note that these are dummy examples. Please review the documentation to understand how to make requests from a specific API. R offers several packages that simplify API access by providing ready-to-use functions for popular APIs. These packages are called “wrappers”, as they “wrap” the API to make it easier to use. For example, the {tidycensus} package used in this book simplifies access to U.S. Census data, allowing us to retrieve data with R commands instead of writing API requests from scratch (Walker and Herman 2024). Behind the scenes, get_pums() is making a GET request from the Census API, and the {tidycensus} functions are converting the response into an R-friendly format. For example, if we are interested in the age, sex, race, and Hispanicity of those in the American Community Survey sample in Durham County, North Carolina30, we can use the get_pums() function to extract these microdata as shown in the code below. We can then use the replicate weights to create a survey object and calculate estimates for Durham County. 
library(tidycensus) durh_pums &lt;- get_pums( variables = c(&quot;PUMA&quot;, &quot;SEX&quot;, &quot;AGEP&quot;, &quot;RAC1P&quot;, &quot;HISP&quot;), state = &quot;NC&quot;, puma = c(&quot;01301&quot;, &quot;01302&quot;), survey = &quot;acs1&quot;, year = 2022, rep_weights = &quot;person&quot; ) ## Getting data from the 2022 1-year ACS Public Use Microdata Sample ## Warning: • You have not set a Census API key. Users without a key are limited to 500 ## queries per day and may experience performance limitations. ## ℹ For best results, get a Census API key at ## http://api.census.gov/data/key_signup.html and then supply the key to the ## `census_api_key()` function to use it throughout your tidycensus session. ## This warning is displayed once per session. durh_pums ## # A tibble: 2,724 × 90 ## SERIALNO SPORDER AGEP PUMA ST SEX HISP RAC1P WGTP PWGTP ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 2022HU0937941 1 60 01302 37 2 01 1 132 132 ## 2 2022HU0937941 2 61 01302 37 1 01 1 132 107 ## 3 2022HU0938759 1 44 01301 37 1 01 1 60 61 ## 4 2022HU0938759 2 48 01301 37 2 01 1 60 63 ## 5 2022HU0938759 3 19 01301 37 1 01 1 60 107 ## 6 2022HU0938759 4 16 01301 37 2 01 1 60 50 ## 7 2022HU0938759 5 12 01301 37 2 01 1 60 84 ## 8 2022HU0939904 1 53 01302 37 1 01 1 104 104 ## 9 2022HU0939904 2 53 01302 37 1 01 1 104 101 ## 10 2022HU0941348 1 70 01301 37 1 01 1 77 77 ## # ℹ 2,714 more rows ## # ℹ 80 more variables: PWGTP1 &lt;dbl&gt;, PWGTP2 &lt;dbl&gt;, PWGTP3 &lt;dbl&gt;, ## # PWGTP4 &lt;dbl&gt;, PWGTP5 &lt;dbl&gt;, PWGTP6 &lt;dbl&gt;, PWGTP7 &lt;dbl&gt;, ## # PWGTP8 &lt;dbl&gt;, PWGTP9 &lt;dbl&gt;, PWGTP10 &lt;dbl&gt;, PWGTP11 &lt;dbl&gt;, ## # PWGTP12 &lt;dbl&gt;, PWGTP13 &lt;dbl&gt;, PWGTP14 &lt;dbl&gt;, PWGTP15 &lt;dbl&gt;, ## # PWGTP16 &lt;dbl&gt;, PWGTP17 &lt;dbl&gt;, PWGTP18 &lt;dbl&gt;, PWGTP19 &lt;dbl&gt;, ## # PWGTP20 &lt;dbl&gt;, PWGTP21 &lt;dbl&gt;, PWGTP22 &lt;dbl&gt;, PWGTP23 
&lt;dbl&gt;, … In Chapter 4, we used the {censusapi} package to get data from the Census data API for the Current Population Survey. To discover if there’s an R package that directly interfaces with a specific survey or data source, search for “[survey] R wrapper” or “[data source] R package” online. A.5 Accessing databases in R Databases provide a secure and organized solution as the volume and complexity of data grow. We can access, manage, and update data stored in databases in a systematic way. Because of how the data are organized, teams can draw from the same source and obtain any metadata that would be helpful for analysis. There are various ways of working with databases in RStudio. We can connect to different databases through the Connections Pane in the top right of the IDE. We can also use packages like {DBI} and {odbc} to access database tables in R files. Here is an example script connecting to a database: con &lt;- DBI::dbConnect( odbc::odbc(), Driver = &quot;[driver name]&quot;, Server = &quot;[server path]&quot;, UID = rstudioapi::askForPassword(&quot;Database user&quot;), PWD = rstudioapi::askForPassword(&quot;Database password&quot;), Database = &quot;[database name]&quot;, Warehouse = &quot;[warehouse name]&quot;, Schema = &quot;[schema name]&quot; ) The {dbplyr} and {dplyr} packages allow us to make queries and run data analysis entirely using {dplyr} syntax. All of the code can be written in R, so we do not have to switch between R and SQL to explore the data. Here is some sample code: q1 &lt;- tbl(con, &quot;bank&quot;) %&gt;% group_by(month_idx, year, month) %&gt;% summarise( subscribe = sum(ifelse(term_deposit == &quot;yes&quot;, 1, 0)), total = n()) show_query(q1) Be sure to check the documentation to configure a database connection. A.6 Importing data from other formats R also offers dedicated packages such as {googlesheets4} for Google Sheets or {qualtRics} for Qualtrics. 
With less common or proprietary file formats, the broader data science community can often provide guidance. Online resources like Stack Overflow and dedicated forums like Posit Community are valuable sources of information for importing data into R. References Larmarange, Joseph. 2023. labelled: Manipulating Labelled Data. https://larmarange.github.io/labelled/. Ooms, Jeroen. 2014. “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805. Walker, Kyle, and Matt Herman. 2024. tidycensus: Load US Census Boundary and Attribute Data as ’tidyverse’ and ’sf’-Ready Data Frames. https://walker-data.com/tidycensus/. Wickham, Hadley. 2023b. httr2: Perform HTTP Requests and Process the Responses. Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. readr: Read Rectangular Text Data. Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files. The public use microdata areas (PUMA) for Durham County were identified using the 2020 PUMA Names File: https://www2.census.gov/geo/pdfs/reference/puma2020/2020_PUMA_Names.pdf↩︎ "],["anes-cb.html", "B ANES derived variable codebook B.1 ADMIN B.2 WEIGHTS B.3 PRE-ELECTION SURVEY QUESTIONNAIRE B.4 POST-ELECTION SURVEY QUESTIONNAIRE", " B ANES derived variable codebook The full codebook with the original variables is available at American National Election Studies (2022). This is a codebook for the ANES data used in this book (anes_2020) from the {srvyrexploR} package. 
B.1 ADMIN V200001 Description: 2020 Case ID Variable class: numeric CaseID Description: 2020 Case ID Variable class: numeric V200002 Description: Mode of interview: pre-election interview Variable class: haven_labelled, vctrs_vctr, double V200002 Label n Unweighted Freq 1 Video 274 0.037 2 Telephone 115 0.015 3 Web 7064 0.948 Total 7453 1.000 InterviewMode Description: Mode of interview: pre-election interview Variable class: factor InterviewMode n Unweighted Freq Video 274 0.037 Telephone 115 0.015 Web 7064 0.948 Total 7453 1.000 B.2 WEIGHTS V200010b Description: Full sample post-election weight Variable class: numeric N Missing Minimum Median Maximum 0 0.0083 0.6863 6.651 Weight Description: Full sample post-election weight Variable class: numeric N Missing Minimum Median Maximum 0 0.0083 0.6863 6.651 V200010c Description: Full sample variance unit Variable class: numeric N Missing Minimum Median Maximum 0 1 2 3 VarUnit Description: Full sample variance unit Variable class: factor VarUnit n Unweighted Freq 1 3689 0.495 2 3750 0.503 3 14 0.002 Total 7453 1.000 V200010d Description: Full sample variance stratum Variable class: numeric N Missing Minimum Median Maximum 0 1 24 50 Stratum Description: Full sample variance stratum Variable class: factor Stratum n Unweighted Freq 1 167 0.022 2 148 0.020 3 158 0.021 4 151 0.020 5 147 0.020 6 172 0.023 7 163 0.022 8 159 0.021 9 160 0.021 10 159 0.021 11 137 0.018 12 179 0.024 13 148 0.020 14 160 0.021 15 159 0.021 16 148 0.020 17 158 0.021 18 156 0.021 19 154 0.021 20 144 0.019 21 170 0.023 22 146 0.020 23 165 0.022 24 147 0.020 25 169 0.023 26 165 0.022 27 172 0.023 28 133 0.018 29 157 0.021 30 167 0.022 31 154 0.021 32 143 0.019 33 143 0.019 34 124 0.017 35 138 0.019 36 130 0.017 37 136 0.018 38 145 0.019 39 140 0.019 40 125 0.017 41 158 0.021 42 146 0.020 43 130 0.017 44 126 0.017 45 126 0.017 46 135 0.018 47 133 0.018 48 140 0.019 49 133 0.018 50 130 0.017 Total 7453 1.000 B.3 PRE-ELECTION SURVEY QUESTIONNAIRE V201006 
Description: PRE: How interested in following campaigns Question: Some people don’t pay much attention to political campaigns. How about you? Would you say that you have been very much interested, somewhat interested or not much interested in the political campaigns so far this year? Variable class: haven_labelled, vctrs_vctr, double V201006 Label n Unweighted Freq -9 -9. Refused 1 0.000 1 Very much interested 3940 0.529 2 Somewhat interested 2569 0.345 3 Not much interested 943 0.127 Total 7453 1.000 CampaignInterest Description: PRE: How interested in following campaigns Question: Some people don’t pay much attention to political campaigns. How about you? Would you say that you have been very much interested, somewhat interested or not much interested in the political campaigns so far this year? Variable class: factor CampaignInterest n Unweighted Freq Very much interested 3940 0.529 Somewhat interested 2569 0.345 Not much interested 943 0.127 NA 1 0.000 Total 7453 1.000 V201023 Description: PRE: Confirmation voted (early) in November 3 Election (2020) Question: Just to be clear, I’m recording that you already voted in the election that is scheduled to take place on November 3. Is that right? Variable class: haven_labelled, vctrs_vctr, double V201023 Label n Unweighted Freq -9 -9. Refused 2 0.000 -1 -1. Inapplicable 6961 0.934 1 Yes, voted 375 0.050 2 No, have not voted 115 0.015 Total 7453 1.000 EarlyVote2020 Description: PRE: Confirmation voted (early) in November 3 Election (2020) Question: Just to be clear, I’m recording that you already voted in the election that is scheduled to take place on November 3. Is that right? Variable class: factor EarlyVote2020 n Unweighted Freq Yes 375 0.050 No 115 0.015 NA 6963 0.934 Total 7453 1.000 V201024 Description: PRE: In what manner did R vote Question: Which one of the following best describes how you voted? Variable class: haven_labelled, vctrs_vctr, double V201024 Label n Unweighted Freq -9 -9. Refused 1 0.000 -1 -1. 
Inapplicable 7078 0.950 1 Definitely voted in person at a polling place before election day 101 0.014 2 Definitely voted by mailing a ballot to elections officials before election day 242 0.032 3 Definitely voted in some other way 28 0.004 4 Not completely sure whether you voted or not 3 0.000 Total 7453 1.000 V201025x Description: PRE: SUMMARY: Registration and early vote status Variable class: haven_labelled, vctrs_vctr, double V201025x Label n Unweighted Freq -4 -4. Technical error 1 0.000 1 Not registered (or DK/RF), does not intend to register (or DK/RF intent) 339 0.045 2 Not registered (or DK/RF), intends to register 290 0.039 3 Registered but did not vote early (or DK/RF) 6452 0.866 4 Registered and voted early 371 0.050 Total 7453 1.000 V201028 Description: PRE: DID R VOTE FOR PRESIDENT Question: How about the election for President? Did you vote for a candidate for President? Variable class: haven_labelled, vctrs_vctr, double V201028 Label n Unweighted Freq -9 -9. Refused 1 0.000 -1 -1. Inapplicable 7081 0.950 1 Yes, voted for President 361 0.048 2 No, didn’t vote for President 10 0.001 Total 7453 1.000 V201029 Description: PRE: For whom did R vote for President Question: Who did you vote for? [Joe Biden, Donald Trump/Donald Trump, Joe Biden], Jo Jorgensen, Howie Hawkins, or someone else? Variable class: haven_labelled, vctrs_vctr, double V201029 Label n Unweighted Freq -9 -9. Refused 10 0.001 -1 -1. Inapplicable 7092 0.952 1 Joe Biden 239 0.032 2 Donald Trump 103 0.014 3 Jo Jorgensen 2 0.000 4 Howie Hawkins 1 0.000 5 Other candidate {SPECIFY} 4 0.001 12 Specified as refused 2 0.000 Total 7453 1.000 V201101 Description: PRE: Did R vote for President in 2016 [revised] Question: Four years ago, in 2016, Hillary Clinton ran on the Democratic ticket against Donald Trump for the Republicans. We talk to many people who tell us they did not vote. And we talk to a few people who tell us they did vote, who really did not. 
We can tell they did not vote by checking with official government records. What about you? If we check the official government voter records, will they show that you voted in the 2016 presidential election, or that you did not vote in that election? Variable class: haven_labelled, vctrs_vctr, double V201101 Label n Unweighted Freq -9 -9. Refused 13 0.002 -8 -8. Don’t know 1 0.000 -1 -1. Inapplicable 3780 0.507 1 Yes, voted 2780 0.373 2 No, didn’t vote 879 0.118 Total 7453 1.000 V201102 Description: PRE: Did R vote for President in 2016 Question: Four years ago, in 2016, Hillary Clinton ran on the Democratic ticket against Donald Trump for the Republicans. Do you remember for sure whether or not you voted in that election? Variable class: haven_labelled, vctrs_vctr, double V201102 Label n Unweighted Freq -9 -9. Refused 6 0.001 -8 -8. Don’t know 1 0.000 -1 -1. Inapplicable 3673 0.493 1 Yes, voted 3030 0.407 2 No, didn’t vote 743 0.100 Total 7453 1.000 VotedPres2016 Description: PRE: Did R vote for President in 2016 Question: Derived from V201102, V201101 Variable class: factor VotedPres2016 n Unweighted Freq Yes 5810 0.780 No 1622 0.218 NA 21 0.003 Total 7453 1.000 V201103 Description: PRE: Recall of last (2016) Presidential vote choice Question: Which one did you vote for? Variable class: haven_labelled, vctrs_vctr, double V201103 Label n Unweighted Freq -9 -9. Refused 41 0.006 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 1643 0.220 1 Hillary Clinton 2911 0.391 2 Donald Trump 2466 0.331 5 Other {SPECIFY} 390 0.052 Total 7453 1.000 VotedPres2016_selection Description: PRE: Recall of last (2016) Presidential vote choice Question: Which one did you vote for? 
Variable class: factor VotedPres2016_selection n Unweighted Freq Clinton 2911 0.391 Trump 2466 0.331 Other 390 0.052 NA 1686 0.226 Total 7453 1.000 V201228 Description: PRE: Party ID: Does R think of self as Democrat, Republican, or Independent Question: Generally speaking, do you usually think of yourself as [a Democrat, a Republican / a Republican, a Democrat], an independent, or what? Variable class: haven_labelled, vctrs_vctr, double V201228 Label n Unweighted Freq -9 -9. Refused 37 0.005 -8 -8. Don’t know 4 0.001 -4 -4. Technical error 1 0.000 0 No preference {VOL - video/phone only} 6 0.001 1 Democrat 2589 0.347 2 Republican 2304 0.309 3 Independent 2277 0.306 5 Other party {SPECIFY} 235 0.032 Total 7453 1.000 V201229 Description: PRE: Party Identification strong - Democrat Republican Question: Would you call yourself a strong [Democrat / Republican] or a not very strong [Democrat / Republican]? Variable class: haven_labelled, vctrs_vctr, double V201229 Label n Unweighted Freq -9 -9. Refused 4 0.001 -1 -1. Inapplicable 2560 0.343 1 Strong 3341 0.448 2 Not very strong 1548 0.208 Total 7453 1.000 V201230 Description: PRE: No Party Identification - closer to Democratic Party or Republican Party Question: Do you think of yourself as closer to the Republican Party or to the Democratic Party? Variable class: haven_labelled, vctrs_vctr, double V201230 Label n Unweighted Freq -9 -9. Refused 19 0.003 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 4893 0.657 1 Closer to Republican 782 0.105 2 Neither {VOL in video and phone} 876 0.118 3 Closer to Democratic 881 0.118 Total 7453 1.000 V201231x Description: PRE: SUMMARY: Party ID Question: Derived from V201228, V201229, and PTYID_LEANPTY Variable class: haven_labelled, vctrs_vctr, double V201231x Label n Unweighted Freq -9 -9. Refused 23 0.003 -8 -8. 
Don’t know 2 0.000 1 Strong Democrat 1796 0.241 2 Not very strong Democrat 790 0.106 3 Independent-Democrat 881 0.118 4 Independent 876 0.118 5 Independent-Republican 782 0.105 6 Not very strong Republican 758 0.102 7 Strong Republican 1545 0.207 Total 7453 1.000 PartyID Description: PRE: SUMMARY: Party ID Question: Derived from V201228, V201229, and PTYID_LEANPTY Variable class: factor PartyID n Unweighted Freq Strong democrat 1796 0.241 Not very strong democrat 790 0.106 Independent-democrat 881 0.118 Independent 876 0.118 Independent-republican 782 0.105 Not very strong republican 758 0.102 Strong republican 1545 0.207 NA 25 0.003 Total 7453 1.000 V201233 Description: PRE: How often trust government in Washington to do what is right [revised] Question: How often can you trust the federal government in Washington to do what is right? Variable class: haven_labelled, vctrs_vctr, double V201233 Label n Unweighted Freq -9 -9. Refused 26 0.003 -8 -8. Don’t know 3 0.000 1 Always 80 0.011 2 Most of the time 1016 0.136 3 About half the time 2313 0.310 4 Some of the time 3313 0.445 5 Never 702 0.094 Total 7453 1.000 TrustGovernment Description: PRE: How often trust government in Washington to do what is right [revised] Question: How often can you trust the federal government in Washington to do what is right? Variable class: factor TrustGovernment n Unweighted Freq Always 80 0.011 Most of the time 1016 0.136 About half the time 2313 0.310 Some of the time 3313 0.445 Never 702 0.094 NA 29 0.004 Total 7453 1.000 V201237 Description: PRE: How often can people be trusted Question: Generally speaking, how often can you trust other people? Variable class: haven_labelled, vctrs_vctr, double V201237 Label n Unweighted Freq -9 -9. Refused 12 0.002 -8 -8. 
Don’t know 1 0.000 1 Always 48 0.006 2 Most of the time 3511 0.471 3 About half the time 2020 0.271 4 Some of the time 1597 0.214 5 Never 264 0.035 Total 7453 1.000 TrustPeople Description: PRE: How often can people be trusted Question: Generally speaking, how often can you trust other people? Variable class: factor TrustPeople n Unweighted Freq Always 48 0.006 Most of the time 3511 0.471 About half the time 2020 0.271 Some of the time 1597 0.214 Never 264 0.035 NA 13 0.002 Total 7453 1.000 V201507x Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: haven_labelled, vctrs_vctr, double N Missing N Refused (-9) Minimum Median Maximum 0 294 18 53 80 Age Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: numeric N Missing Minimum Median Maximum 294 18 53 80 AgeGroup Description: PRE: SUMMARY: Respondent age Question: Derived from birth month, day and year Variable class: factor AgeGroup n Unweighted Freq 18-29 871 0.117 30-39 1241 0.167 40-49 1081 0.145 50-59 1200 0.161 60-69 1436 0.193 70 or older 1330 0.178 NA 294 0.039 Total 7453 1.000 V201510 Description: PRE: Highest level of Education Question: What is the highest level of school you have completed or the highest degree you have received? Variable class: haven_labelled, vctrs_vctr, double V201510 Label n Unweighted Freq -9 -9. Refused 25 0.003 -8 -8. Don’t know 1 0.000 1 Less than high school credential 312 0.042 2 High school graduate - High school diploma or equivalent (e.g. GED) 1160 0.156 3 Some college but no degree 1519 0.204 4 Associate degree in college - occupational/vocational 550 0.074 5 Associate degree in college - academic 445 0.060 6 Bachelor’s degree (e.g. BA, AB, BS) 1877 0.252 7 Master’s degree (e.g. MA, MS, MEng, MEd, MSW, MBA) 1092 0.147 8 Professional school degree (e.g. MD, DDS, DVM, LLB, JD)/Doctoral degree (e.g. 
PHD, EDD) 382 0.051 95 Other {SPECIFY} 90 0.012 Total 7453 1.000 Education Description: PRE: Highest level of Education Question: What is the highest level of school you have completed or the highest degree you have received? Variable class: factor Education n Unweighted Freq Less than HS 312 0.042 High school 1160 0.156 Post HS 2514 0.337 Bachelor’s 1877 0.252 Graduate 1474 0.198 NA 116 0.016 Total 7453 1.000 V201546 Description: PRE: R: Are you Spanish, Hispanic, or Latino Question: Are you of Hispanic, Latino, or Spanish origin? Variable class: haven_labelled, vctrs_vctr, double V201546 Label n Unweighted Freq -9 -9. Refused 45 0.006 -8 -8. Don’t know 3 0.000 1 Yes 662 0.089 2 No 6743 0.905 Total 7453 1.000 V201547a Description: RESTRICTED: PRE: Race of R: White [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you White? Variable class: haven_labelled, vctrs_vctr, double V201547a Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547b Description: RESTRICTED: PRE: Race of R: Black or African-American [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you Black or African American? Variable class: haven_labelled, vctrs_vctr, double V201547b Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547c Description: RESTRICTED: PRE: Race of R: Asian [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you Asian? Variable class: haven_labelled, vctrs_vctr, double V201547c Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547d Description: RESTRICTED: PRE: Race of R: Native Hawaiian or Pacific Islander [mention] Question: I am going to read you a list of five race categories. 
You may choose one or more races. For this survey, Hispanic origin is not a race. Are you White; Black or African American; American Indian or Alaska Native; Asian; or Native Hawaiian or Other Pacific Islander? Variable class: haven_labelled, vctrs_vctr, double V201547d Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547e Description: RESTRICTED: PRE: Race of R: Native American or Alaska Native [mention] Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Are you American Indian or Alaska Native? Variable class: haven_labelled, vctrs_vctr, double V201547e Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201547z Description: RESTRICTED: PRE: Race of R: other specify Question: I am going to read you a list of five race categories. You may choose one or more races. For this survey, Hispanic origin is not a race. Reported other Variable class: haven_labelled, vctrs_vctr, double V201547z Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201549x Description: PRE: SUMMARY: R self-identified race/ethnicity Question: Derived from V201546, V201547a-V201547e, and V201547z Variable class: haven_labelled, vctrs_vctr, double V201549x Label n Unweighted Freq -9 -9. Refused 75 0.010 -8 -8. 
Don’t know 6 0.001 1 White, non-Hispanic 5420 0.727 2 Black, non-Hispanic 650 0.087 3 Hispanic 662 0.089 4 Asian or Native Hawaiian/other Pacific Islander, non-Hispanic alone 248 0.033 5 Native American/Alaska Native or other race, non-Hispanic alone 155 0.021 6 Multiple races, non-Hispanic 237 0.032 Total 7453 1.000 RaceEth Description: PRE: SUMMARY: R self-identified race/ethnicity Question: Derived from V201546, V201547a-V201547e, and V201547z Variable class: factor RaceEth n Unweighted Freq White 5420 0.727 Black 650 0.087 Hispanic 662 0.089 Asian, NH/PI 248 0.033 AI/AN 155 0.021 Other/multiple race 237 0.032 NA 81 0.011 Total 7453 1.000 V201600 Description: PRE: What is your (R) sex? [revised] Question: What is your sex? Variable class: haven_labelled, vctrs_vctr, double V201600 Label n Unweighted Freq -9 -9. Refused 51 0.007 1 Male 3375 0.453 2 Female 4027 0.540 Total 7453 1.000 Gender Description: PRE: What is your (R) sex? [revised] Question: What is your sex? Variable class: factor Gender n Unweighted Freq Male 3375 0.453 Female 4027 0.540 NA 51 0.007 Total 7453 1.000 V201607 Description: RESTRICTED: PRE: Total income amount - revised Question: The next question is about [the total combined income of all members of your family / your total income] during the past 12 months. This includes money from jobs, net income from business, farm or rent, pensions, dividends, interest, Social Security payments, and any other money income received by members of your family who are 15 years of age or older. What was the total income of your family during the past 12 months? TYPE THE NUMBER. YOUR BEST GUESS IS FINE. Variable class: haven_labelled, vctrs_vctr, double V201607 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201610 Description: RESTRICTED: PRE: Income amt missing - categories lt 20K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. 
Variable class: haven_labelled, vctrs_vctr, double V201610 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201611 Description: RESTRICTED: PRE: Income amt missing - categories 20-40K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201611 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201613 Description: RESTRICTED: PRE: Income amt missing - categories 40-70K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201613 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201615 Description: RESTRICTED: PRE: Income amt missing - categories 70-100K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201615 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201616 Description: RESTRICTED: PRE: Income amt missing - categories 100+K Question: Please choose the answer that includes the income of all members of your family during the past 12 months before taxes. Variable class: haven_labelled, vctrs_vctr, double V201616 Label n Unweighted Freq -3 -3. Restricted 7453 1 Total 7453 1 V201617x Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: haven_labelled, vctrs_vctr, double V201617x Label n Unweighted Freq -9 -9. Refused 502 0.067 -5 -5. 
Interview breakoff (sufficient partial IW) 15 0.002 1 Under $9,999 647 0.087 2 $10,000-14,999 244 0.033 3 $15,000-19,999 185 0.025 4 $20,000-24,999 301 0.040 5 $25,000-29,999 228 0.031 6 $30,000-34,999 296 0.040 7 $35,000-39,999 226 0.030 8 $40,000-44,999 286 0.038 9 $45,000-49,999 213 0.029 10 $50,000-59,999 485 0.065 11 $60,000-64,999 294 0.039 12 $65,000-69,999 168 0.023 13 $70,000-74,999 243 0.033 14 $75,000-79,999 215 0.029 15 $80,000-89,999 383 0.051 16 $90,000-99,999 291 0.039 17 $100,000-109,999 451 0.061 18 $110,000-124,999 312 0.042 19 $125,000-149,999 323 0.043 20 $150,000-174,999 366 0.049 21 $175,000-249,999 374 0.050 22 $250,000 or more 405 0.054 Total 7453 1.000 Income Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: factor Income n Unweighted Freq Under $9,999 647 0.087 $10,000-14,999 244 0.033 $15,000-19,999 185 0.025 $20,000-24,999 301 0.040 $25,000-29,999 228 0.031 $30,000-34,999 296 0.040 $35,000-39,999 226 0.030 $40,000-44,999 286 0.038 $45,000-49,999 213 0.029 $50,000-59,999 485 0.065 $60,000-64,999 294 0.039 $65,000-69,999 168 0.023 $70,000-74,999 243 0.033 $75,000-79,999 215 0.029 $80,000-89,999 383 0.051 $90,000-99,999 291 0.039 $100,000-109,999 451 0.061 $110,000-124,999 312 0.042 $125,000-149,999 323 0.043 $150,000-174,999 366 0.049 $175,000-249,999 374 0.050 $250,000 or more 405 0.054 NA 517 0.069 Total 7453 1.000 Income7 Description: PRE: SUMMARY: Total (family) income Question: Derived from V201607, V201610, V201611, V201613, V201615, V201616 Variable class: factor Income7 n Unweighted Freq Under $20k 1076 0.144 $20k to &lt; 40k 1051 0.141 $40k to &lt; 60k 984 0.132 $60k to &lt; 80k 920 0.123 $80k to &lt; 100k 674 0.090 $100k to &lt; 125k 763 0.102 $125k or more 1468 0.197 NA 517 0.069 Total 7453 1.000 B.4 POST-ELECTION SURVEY QUESTIONNAIRE V202051 Description: POST: R registered to vote (post-election) Question: Now on a different topic. 
Are you registered to vote at [Respondent’s preloaded address], registered at a different address, or not currently registered? Variable class: haven_labelled, vctrs_vctr, double V202051 Label n Unweighted Freq -9 -9. Refused 4 0.001 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 6820 0.915 1 Registered at this address 173 0.023 2 Registered at a different address 59 0.008 3 Not currently registered 393 0.053 Total 7453 1.000 V202066 Description: POST: Did R vote in November 2020 election Question: In talking to people about elections, we often find that a lot of people were not able to vote because they weren’t registered, they were sick, or they just didn’t have time. Which of the following statements best describes you: Variable class: haven_labelled, vctrs_vctr, double V202066 Label n Unweighted Freq -9 -9. Refused 7 0.001 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 372 0.050 1 I did not vote (in the election this November) 582 0.078 2 I thought about voting this time, but didn’t 265 0.036 3 I usually vote, but didn’t this time 192 0.026 4 I am sure I voted 6031 0.809 Total 7453 1.000 V202072 Description: POST: Did R vote for President Question: How about the election for President? Did you vote for a candidate for President? Variable class: haven_labelled, vctrs_vctr, double V202072 Label n Unweighted Freq -9 -9. Refused 2 0.000 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 1418 0.190 1 Yes, voted for President 5952 0.799 2 No, didn’t vote for President 77 0.010 Total 7453 1.000 VotedPres2020 Description: POST: Did R vote for President Question: How about the election for President? Did you vote for a candidate for President? Variable class: factor VotedPres2020 n Unweighted Freq Yes 6313 0.847 No 87 0.012 NA 1053 0.141 Total 7453 1.000 V202073 Description: POST: For whom did R vote for President Question: Who did you vote for? 
[Joe Biden, Donald Trump/Donald Trump, Joe Biden], Jo Jorgensen, Howie Hawkins, or someone else? Variable class: haven_labelled, vctrs_vctr, double V202073 Label n Unweighted Freq -9 -9. Refused 53 0.007 -6 -6. No post-election interview 4 0.001 -1 -1. Inapplicable 1497 0.201 1 Joe Biden 3267 0.438 2 Donald Trump 2462 0.330 3 Jo Jorgensen 69 0.009 4 Howie Hawkins 23 0.003 5 Other candidate {SPECIFY} 56 0.008 7 Specified as Republican candidate 1 0.000 8 Specified as Libertarian candidate 3 0.000 11 Specified as don’t know 2 0.000 12 Specified as refused 16 0.002 Total 7453 1.000 V202109x Description: PRE-POST: SUMMARY: Voter turnout in 2020 Question: Derived from V201024, V202066, V202051 Variable class: haven_labelled, vctrs_vctr, double V202109x Label n Unweighted Freq -2 -2. Not reported 7 0.001 0 Did not vote 1039 0.139 1 Voted 6407 0.860 Total 7453 1.000 V202110x Description: PRE-POST: SUMMARY: 2020 Presidential vote Question: Derived from V201029, V202073 Variable class: haven_labelled, vctrs_vctr, double V202110x Label n Unweighted Freq -9 -9. Refused 81 0.011 -8 -8. Don’t know 2 0.000 -1 -1. Inapplicable 1136 0.152 1 Joe Biden 3509 0.471 2 Donald Trump 2567 0.344 3 Jo Jorgensen 74 0.010 4 Howie Hawkins 24 0.003 5 Other candidate {SPECIFY} 60 0.008 Total 7453 1.000 VotedPres2020_selection Description: PRE-POST: SUMMARY: 2020 Presidential vote Question: Derived from V201029, V202073 Variable class: factor VotedPres2020_selection n Unweighted Freq Biden 3509 0.471 Trump 2567 0.344 Other 158 0.021 NA 1219 0.164 Total 7453 1.000 References ———. 2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. 
"],["recs-cb.html", "C RECS derived variable codebook C.1 ADMIN C.2 GEOGRAPHY C.3 WEATHER C.4 YOUR HOME C.5 SPACE HEATING C.6 AIR CONDITIONING C.7 THERMOSTAT C.8 WEIGHTS C.9 CONSUMPTION AND EXPENDITURE", " C RECS derived variable codebook The full codebook with the original variables is available at https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata - “Variable and response codebook”. This is a codebook for the RECS data used in this book (recs_2020) from the {srvyrexploR} package. C.1 ADMIN DOEID Description: Unique identifier for each respondent ClimateRegion_BA Description: Building America Climate Zone ClimateRegion_BA n Unweighted Freq Mixed-Dry 142 0.008 Mixed-Humid 5579 0.302 Hot-Humid 2545 0.138 Hot-Dry 1577 0.085 Very-Cold 572 0.031 Cold 7116 0.385 Marine 911 0.049 Subarctic 54 0.003 Total 18496 1.000 Urbanicity Description: 2010 Census Urban Type Code Urbanicity n Unweighted Freq Urban Area 12395 0.670 Urban Cluster 2020 0.109 Rural 4081 0.221 Total 18496 1.000 C.2 GEOGRAPHY Region Description: Census Region Region n Unweighted Freq Northeast 3657 0.198 Midwest 3832 0.207 South 6426 0.347 West 4581 0.248 Total 18496 1.000 REGIONC Description: Census Region REGIONC n Unweighted Freq MIDWEST 3832 0.207 NORTHEAST 3657 0.198 SOUTH 6426 0.347 WEST 4581 0.248 Total 18496 1.000 Division Description: Census Division, Mountain Division is divided into North and South for RECS purposes Division n Unweighted Freq New England 1680 0.091 Middle Atlantic 1977 0.107 East North Central 2014 0.109 West North Central 1818 0.098 South Atlantic 3256 0.176 East South Central 1343 0.073 West South Central 1827 0.099 Mountain North 1180 0.064 Mountain South 904 0.049 Pacific 2497 0.135 Total 18496 1.000 STATE_FIPS Description: State Federal Information Processing System Code STATE_FIPS n Unweighted Freq 01 242 0.013 02 311 0.017 04 495 0.027 05 268 0.014 06 1152 0.062 08 360 0.019 09 294 0.016 10 143 0.008 11 221 0.012 12 655 0.035 13 417 0.023 15 
282 0.015 16 270 0.015 17 530 0.029 18 400 0.022 19 286 0.015 20 208 0.011 21 428 0.023 22 311 0.017 23 223 0.012 24 359 0.019 25 552 0.030 26 388 0.021 27 325 0.018 28 168 0.009 29 296 0.016 30 172 0.009 31 189 0.010 32 231 0.012 33 175 0.009 34 456 0.025 35 178 0.010 36 904 0.049 37 479 0.026 38 331 0.018 39 339 0.018 40 232 0.013 41 313 0.017 42 617 0.033 44 191 0.010 45 334 0.018 46 183 0.010 47 505 0.027 48 1016 0.055 49 188 0.010 50 245 0.013 51 451 0.024 53 439 0.024 54 197 0.011 55 357 0.019 56 190 0.010 Total 18496 1.000 state_postal Description: State Postal Code state_postal n Unweighted Freq AL 242 0.013 AK 311 0.017 AZ 495 0.027 AR 268 0.014 CA 1152 0.062 CO 360 0.019 CT 294 0.016 DE 143 0.008 DC 221 0.012 FL 655 0.035 GA 417 0.023 HI 282 0.015 ID 270 0.015 IL 530 0.029 IN 400 0.022 IA 286 0.015 KS 208 0.011 KY 428 0.023 LA 311 0.017 ME 223 0.012 MD 359 0.019 MA 552 0.030 MI 388 0.021 MN 325 0.018 MS 168 0.009 MO 296 0.016 MT 172 0.009 NE 189 0.010 NV 231 0.012 NH 175 0.009 NJ 456 0.025 NM 178 0.010 NY 904 0.049 NC 479 0.026 ND 331 0.018 OH 339 0.018 OK 232 0.013 OR 313 0.017 PA 617 0.033 RI 191 0.010 SC 334 0.018 SD 183 0.010 TN 505 0.027 TX 1016 0.055 UT 188 0.010 VT 245 0.013 VA 451 0.024 WA 439 0.024 WV 197 0.011 WI 357 0.019 WY 190 0.010 Total 18496 1.000 state_name Description: State Name state_name n Unweighted Freq Alabama 242 0.013 Alaska 311 0.017 Arizona 495 0.027 Arkansas 268 0.014 California 1152 0.062 Colorado 360 0.019 Connecticut 294 0.016 Delaware 143 0.008 District of Columbia 221 0.012 Florida 655 0.035 Georgia 417 0.023 Hawaii 282 0.015 Idaho 270 0.015 Illinois 530 0.029 Indiana 400 0.022 Iowa 286 0.015 Kansas 208 0.011 Kentucky 428 0.023 Louisiana 311 0.017 Maine 223 0.012 Maryland 359 0.019 Massachusetts 552 0.030 Michigan 388 0.021 Minnesota 325 0.018 Mississippi 168 0.009 Missouri 296 0.016 Montana 172 0.009 Nebraska 189 0.010 Nevada 231 0.012 New Hampshire 175 0.009 New Jersey 456 0.025 New Mexico 178 0.010 New York 904 0.049 
North Carolina 479 0.026 North Dakota 331 0.018 Ohio 339 0.018 Oklahoma 232 0.013 Oregon 313 0.017 Pennsylvania 617 0.033 Rhode Island 191 0.010 South Carolina 334 0.018 South Dakota 183 0.010 Tennessee 505 0.027 Texas 1016 0.055 Utah 188 0.010 Vermont 245 0.013 Virginia 451 0.024 Washington 439 0.024 West Virginia 197 0.011 Wisconsin 357 0.019 Wyoming 190 0.010 Total 18496 1.000 C.3 WEATHER HDD65 Description: Heating degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations N Missing Minimum Median Maximum 0 0 4396 17383 CDD65 Description: Cooling degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations N Missing Minimum Median Maximum 0 0 1179 5534 HDD30YR Description: Heating degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inoculated with random errors N Missing Minimum Median Maximum 0 0 4825 16071 CDD30YR Description: Cooling degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inoculated with random errors N Missing Minimum Median Maximum 0 0 1020 4905 C.4 YOUR HOME HousingUnitType Description: Type of housing unit Question: Which best describes your home? HousingUnitType n Unweighted Freq Mobile home 974 0.053 Single-family detached 12319 0.666 Single-family attached 1751 0.095 Apartment: 2-4 Units 1013 0.055 Apartment: 5 or more units 2439 0.132 Total 18496 1.000 YearMade Description: Range when housing unit was built Question: Derived from: In what year was your home built? AND Although you do not know the exact year your home was built, it is helpful to have an estimate. About when was your home built? 
YearMade n Unweighted Freq Before 1950 2721 0.147 1950-1959 1685 0.091 1960-1969 1867 0.101 1970-1979 2817 0.152 1980-1989 2435 0.132 1990-1999 2451 0.133 2000-2009 2748 0.149 2010-2015 989 0.053 2016-2020 783 0.042 Total 18496 1.000 TOTSQFT_EN Description: Total energy-consuming area (square footage) of the housing unit. Includes all main living areas; all basements; heated, cooled, or finished attics; and heated or cooled garages. For single-family housing units this is derived using the respondent-reported square footage (SQFTEST) and adjusted using the “include” variables (e.g., SQFTINCB), where applicable. For apartments and mobile homes this is the respondent-reported square footage. A derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 200 1700 15000 TOTHSQFT Description: Square footage of the housing unit that is heated by space heating equipment. A derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 0 1520 15000 TOTCSQFT Description: Square footage of the housing unit that is cooled by air-conditioning equipment or evaporative cooler. A derived variable rounded to the nearest 10 N Missing Minimum Median Maximum 0 0 1200 14600 ZTOTSQFT_EN Description: Imputation indicator for SQFTEST ZTOTSQFT_EN n Unweighted Freq Not imputed 11930 0.645 Imputed 6566 0.355 Total 18496 1.000 ZYearMade Description: Imputation indicator for YEARMADERANGE ZYearMade n Unweighted Freq Not imputed 18176 0.983 Imputed 320 0.017 Total 18496 1.000 ZHousingUnitType Description: Imputation indicator for TYPEHUQ ZHousingUnitType n Unweighted Freq Not imputed 18496 1 Total 18496 1 C.5 SPACE HEATING SpaceHeatingUsed Description: Space heating equipment used Question: Is your home heated during the winter? 
SpaceHeatingUsed n Unweighted Freq FALSE 751 0.041 TRUE 17745 0.959 Total 18496 1.000 ZSpaceHeatingUsed Description: Imputation indicator for HEATHOME ZSpaceHeatingUsed n Unweighted Freq Not imputed 18474 0.999 Imputed 22 0.001 Total 18496 1.000 C.6 AIR CONDITIONING ACUsed Description: Air conditioning equipment used Question: Is any air conditioning equipment used in your home? ACUsed n Unweighted Freq FALSE 2325 0.126 TRUE 16171 0.874 Total 18496 1.000 ZACUsed Description: Imputation indicator for AIRCOND ZACUsed n Unweighted Freq Not imputed 18448 0.997 Imputed 48 0.003 Total 18496 1.000 ZACBehavior Description: Imputation indicator for COOLCNTL ZACBehavior n Unweighted Freq Not imputed 15819 0.855 Imputed 352 0.019 Not applicable 2325 0.126 Total 18496 1.000 C.7 THERMOSTAT HeatingBehavior Description: Winter temperature control method Question: Which of the following best describes how your household controls the indoor temperature during the winter? HeatingBehavior n Unweighted Freq Set one temp and leave it 7806 0.422 Manually adjust at night/no one home 4654 0.252 Programmable or smart thermostat automatically adjusts the temperature 3310 0.179 Turn on or off as needed 1491 0.081 No control 438 0.024 Other 46 0.002 NA 751 0.041 Total 18496 1.000 WinterTempDay Description: Winter thermostat setting or temperature in home when someone is home during the day Question: During the winter, what is your home’s typical indoor temperature when someone is home during the day? N Missing Minimum Median Maximum 751 50 70 90 WinterTempAway Description: Winter thermostat setting or temperature in home when no one is home during the day Question: During the winter, what is your home’s typical indoor temperature when no one is inside your home during the day? 
N Missing Minimum Median Maximum 751 50 68 90 WinterTempNight Description: Winter thermostat setting or temperature in home at night Question: During the winter, what is your home’s typical indoor temperature inside your home at night? N Missing Minimum Median Maximum 751 50 68 90 ACBehavior Description: Summer temperature control method Question: Which of the following best describes how your household controls the indoor temperature during the summer? ACBehavior n Unweighted Freq Set one temp and leave it 6738 0.364 Manually adjust at night/no one home 3637 0.197 Programmable or smart thermostat automatically adjusts the temperature 2638 0.143 Turn on or off as needed 2746 0.148 No control 409 0.022 Other 3 0.000 NA 2325 0.126 Total 18496 1.000 SummerTempDay Description: Summer thermostat setting or temperature in home when someone is home during the day Question: During the summer, what is your home’s typical indoor temperature when someone is home during the day? N Missing Minimum Median Maximum 2325 50 72 90 SummerTempAway Description: Summer thermostat setting or temperature in home when no one is home during the day Question: During the summer, what is your home’s typical indoor temperature when no one is inside your home during the day? N Missing Minimum Median Maximum 2325 50 74 90 SummerTempNight Description: Summer thermostat setting or temperature in home at night Question: During the summer, what is your home’s typical indoor temperature inside your home at night? 
N Missing Minimum Median Maximum 2325 50 72 90 ZHeatingBehavior Description: Imputation indicator for HEATCNTL ZHeatingBehavior n Unweighted Freq Not imputed 17395 0.940 Imputed 350 0.019 Not applicable 751 0.041 Total 18496 1.000 ZWinterTempAway Description: Imputation indicator for TEMPGONE ZWinterTempAway n Unweighted Freq Not imputed 16840 0.910 Imputed 905 0.049 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempAway Description: Imputation indicator for TEMPGONEAC ZSummerTempAway n Unweighted Freq Not imputed 15240 0.824 Imputed 931 0.050 Not applicable 2325 0.126 Total 18496 1.000 ZWinterTempDay Description: Imputation indicator for TEMPHOME ZWinterTempDay n Unweighted Freq Not imputed 17382 0.940 Imputed 363 0.020 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempDay Description: Imputation indicator for TEMPHOMEAC ZSummerTempDay n Unweighted Freq Not imputed 15658 0.847 Imputed 513 0.028 Not applicable 2325 0.126 Total 18496 1.000 ZWinterTempNight Description: Imputation indicator for TEMPNITE ZWinterTempNight n Unweighted Freq Not imputed 17207 0.930 Imputed 538 0.029 Not applicable 751 0.041 Total 18496 1.000 ZSummerTempNight Description: Imputation indicator for TEMPNITEAC ZSummerTempNight n Unweighted Freq Not imputed 15497 0.838 Imputed 674 0.036 Not applicable 2325 0.126 Total 18496 1.000 C.8 WEIGHTS NWEIGHT Description: Final Analysis Weight N Missing Minimum Median Maximum 0 437.9 6119 29279 NWEIGHT1 Description: Final Analysis Weight for replicate 1 N Missing Minimum Median Maximum 0 0 6136 30015 NWEIGHT2 Description: Final Analysis Weight for replicate 2 N Missing Minimum Median Maximum 0 0 6151 29422 NWEIGHT3 Description: Final Analysis Weight for replicate 3 N Missing Minimum Median Maximum 0 0 6151 29431 NWEIGHT4 Description: Final Analysis Weight for replicate 4 N Missing Minimum Median Maximum 0 0 6153 29494 NWEIGHT5 Description: Final Analysis Weight for replicate 5 N Missing Minimum Median Maximum 0 0 6134 30039 NWEIGHT6 
Description: Final Analysis Weight for replicate 6 N Missing Minimum Median Maximum 0 0 6147 29419 NWEIGHT7 Description: Final Analysis Weight for replicate 7 N Missing Minimum Median Maximum 0 0 6135 29586 NWEIGHT8 Description: Final Analysis Weight for replicate 8 N Missing Minimum Median Maximum 0 0 6151 29499 NWEIGHT9 Description: Final Analysis Weight for replicate 9 N Missing Minimum Median Maximum 0 0 6139 29845 NWEIGHT10 Description: Final Analysis Weight for replicate 10 N Missing Minimum Median Maximum 0 0 6163 29635 NWEIGHT11 Description: Final Analysis Weight for replicate 11 N Missing Minimum Median Maximum 0 0 6140 29681 NWEIGHT12 Description: Final Analysis Weight for replicate 12 N Missing Minimum Median Maximum 0 0 6160 29849 NWEIGHT13 Description: Final Analysis Weight for replicate 13 N Missing Minimum Median Maximum 0 0 6142 29843 NWEIGHT14 Description: Final Analysis Weight for replicate 14 N Missing Minimum Median Maximum 0 0 6154 30184 NWEIGHT15 Description: Final Analysis Weight for replicate 15 N Missing Minimum Median Maximum 0 0 6145 29970 NWEIGHT16 Description: Final Analysis Weight for replicate 16 N Missing Minimum Median Maximum 0 0 6133 29825 NWEIGHT17 Description: Final Analysis Weight for replicate 17 N Missing Minimum Median Maximum 0 0 6126 30606 NWEIGHT18 Description: Final Analysis Weight for replicate 18 N Missing Minimum Median Maximum 0 0 6155 29689 NWEIGHT19 Description: Final Analysis Weight for replicate 19 N Missing Minimum Median Maximum 0 0 6153 29336 NWEIGHT20 Description: Final Analysis Weight for replicate 20 N Missing Minimum Median Maximum 0 0 6139 30274 NWEIGHT21 Description: Final Analysis Weight for replicate 21 N Missing Minimum Median Maximum 0 0 6135 29766 NWEIGHT22 Description: Final Analysis Weight for replicate 22 N Missing Minimum Median Maximum 0 0 6149 29791 NWEIGHT23 Description: Final Analysis Weight for replicate 23 N Missing Minimum Median Maximum 0 0 6148 30126 NWEIGHT24 Description: Final 
Analysis Weight for replicate 24 N Missing Minimum Median Maximum 0 0 6136 29946 NWEIGHT25 Description: Final Analysis Weight for replicate 25 N Missing Minimum Median Maximum 0 0 6150 30445 NWEIGHT26 Description: Final Analysis Weight for replicate 26 N Missing Minimum Median Maximum 0 0 6136 29893 NWEIGHT27 Description: Final Analysis Weight for replicate 27 N Missing Minimum Median Maximum 0 0 6125 30030 NWEIGHT28 Description: Final Analysis Weight for replicate 28 N Missing Minimum Median Maximum 0 0 6149 29599 NWEIGHT29 Description: Final Analysis Weight for replicate 29 N Missing Minimum Median Maximum 0 0 6146 30136 NWEIGHT30 Description: Final Analysis Weight for replicate 30 N Missing Minimum Median Maximum 0 0 6149 29895 NWEIGHT31 Description: Final Analysis Weight for replicate 31 N Missing Minimum Median Maximum 0 0 6144 29604 NWEIGHT32 Description: Final Analysis Weight for replicate 32 N Missing Minimum Median Maximum 0 0 6159 29310 NWEIGHT33 Description: Final Analysis Weight for replicate 33 N Missing Minimum Median Maximum 0 0 6148 29408 NWEIGHT34 Description: Final Analysis Weight for replicate 34 N Missing Minimum Median Maximum 0 0 6139 29564 NWEIGHT35 Description: Final Analysis Weight for replicate 35 N Missing Minimum Median Maximum 0 0 6141 30437 NWEIGHT36 Description: Final Analysis Weight for replicate 36 N Missing Minimum Median Maximum 0 0 6149 27896 NWEIGHT37 Description: Final Analysis Weight for replicate 37 N Missing Minimum Median Maximum 0 0 6133 30596 NWEIGHT38 Description: Final Analysis Weight for replicate 38 N Missing Minimum Median Maximum 0 0 6139 30130 NWEIGHT39 Description: Final Analysis Weight for replicate 39 N Missing Minimum Median Maximum 0 0 6147 29262 NWEIGHT40 Description: Final Analysis Weight for replicate 40 N Missing Minimum Median Maximum 0 0 6144 30344 NWEIGHT41 Description: Final Analysis Weight for replicate 41 N Missing Minimum Median Maximum 0 0 6153 29594 NWEIGHT42 Description: Final Analysis Weight for 
replicate 42 N Missing Minimum Median Maximum 0 0 6137 29938 NWEIGHT43 Description: Final Analysis Weight for replicate 43 N Missing Minimum Median Maximum 0 0 6157 29878 NWEIGHT44 Description: Final Analysis Weight for replicate 44 N Missing Minimum Median Maximum 0 0 6148 29896 NWEIGHT45 Description: Final Analysis Weight for replicate 45 N Missing Minimum Median Maximum 0 0 6149 29729 NWEIGHT46 Description: Final Analysis Weight for replicate 46 N Missing Minimum Median Maximum 0 0 6152 29103 NWEIGHT47 Description: Final Analysis Weight for replicate 47 N Missing Minimum Median Maximum 0 0 6150 30070 NWEIGHT48 Description: Final Analysis Weight for replicate 48 N Missing Minimum Median Maximum 0 0 6139 29343 NWEIGHT49 Description: Final Analysis Weight for replicate 49 N Missing Minimum Median Maximum 0 0 6146 29590 NWEIGHT50 Description: Final Analysis Weight for replicate 50 N Missing Minimum Median Maximum 0 0 6159 30027 NWEIGHT51 Description: Final Analysis Weight for replicate 51 N Missing Minimum Median Maximum 0 0 6150 29247 NWEIGHT52 Description: Final Analysis Weight for replicate 52 N Missing Minimum Median Maximum 0 0 6154 29445 NWEIGHT53 Description: Final Analysis Weight for replicate 53 N Missing Minimum Median Maximum 0 0 6156 30131 NWEIGHT54 Description: Final Analysis Weight for replicate 54 N Missing Minimum Median Maximum 0 0 6151 29439 NWEIGHT55 Description: Final Analysis Weight for replicate 55 N Missing Minimum Median Maximum 0 0 6143 29216 NWEIGHT56 Description: Final Analysis Weight for replicate 56 N Missing Minimum Median Maximum 0 0 6153 29203 NWEIGHT57 Description: Final Analysis Weight for replicate 57 N Missing Minimum Median Maximum 0 0 6138 29819 NWEIGHT58 Description: Final Analysis Weight for replicate 58 N Missing Minimum Median Maximum 0 0 6137 29818 NWEIGHT59 Description: Final Analysis Weight for replicate 59 N Missing Minimum Median Maximum 0 0 6144 29606 NWEIGHT60 Description: Final Analysis Weight for replicate 60 N 
Missing Minimum Median Maximum 0 0 6140 29818 C.9 CONSUMPTION AND EXPENDITURE BTUEL Description: Total electricity use, in thousand Btu, 2020, including self-generation of solar power N Missing Minimum Median Maximum 0 143.3 31890 628155 DOLLAREL Description: Total electricity cost, in dollars, 2020 N Missing Minimum Median Maximum 0 -889.5 1258 15680 ZBTUEL Description: Imputation flag for total electricity use ZBTUEL n Unweighted Freq Not imputed 15965 0.863 Imputed amount and cost 2138 0.116 Imputed only amount for SOLAR=1 cases 393 0.021 Total 18496 1.000 BTUNG Description: Total natural gas use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 22012 1134709 DOLLARNG Description: Total natural gas cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 313.9 8155 ZBTUNG Description: Imputation flag for total natural gas use ZBTUNG n Unweighted Freq Not imputed 8823 0.477 Imputed 2331 0.126 Not applicable 7342 0.397 Total 18496 1.000 BTULP Description: Total propane use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 364215 DOLLARLP Description: Total propane cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 0 6621 ZBTULP Description: Imputation flag for total propane use ZBTULP n Unweighted Freq Not imputed 896 0.048 Imputed 1103 0.060 Not applicable 16497 0.892 Total 18496 1.000 BTUFO Description: Total fuel oil/kerosene use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 426268 DOLLARFO Description: Total fuel oil/kerosene cost, in dollars, 2020 N Missing Minimum Median Maximum 0 0 0 7004 ZBTUFO Description: Imputation flag for total fuel oil/kerosene use ZBTUFO n Unweighted Freq Not imputed 626 0.034 Imputed 607 0.033 Not applicable 17263 0.933 Total 18496 1.000 BTUWOOD Description: Total wood use, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 0 0 5e+05 ZBTUWOOD Description: Imputation flag for total wood use ZBTUWOOD n Unweighted Freq Not imputed 1730 0.094 Imputed 244 0.013 Not applicable 16522 
0.893 Total 18496 1.000 TOTALBTU Description: Total usage including electricity, natural gas, propane, and fuel oil, in thousand Btu, 2020 N Missing Minimum Median Maximum 0 1182 74180 1367548 TOTALDOL Description: Total cost including electricity, natural gas, propane, and fuel oil, in dollars, 2020 N Missing Minimum Median Maximum 0 -150.5 1793 20043 "],["exercise-solutions.html", "D Exercise solutions 5 - Descriptive analysis 6 - Statistical testing 7 - Modeling 10 - Specifying sample designs and replicate weights in {srvyr} 13 - National Crime Victimization Survey Vignette 14 - AmericasBarometer Vignette", " D Exercise solutions The chapter exercises use the survey design objects and packages provided in the Prerequisites box at the beginning of each chapter. Please ensure they are loaded in the environment before running the exercise solutions. Code chunks to load these are also included below. library(tidyverse) library(survey) library(srvyr) library(srvyrexploR) library(broom) library(prettyunits) library(gt) targetpop &lt;- 231592693 anes_adjwgt &lt;- anes_2020 %&gt;% mutate(Weight = Weight / sum(Weight) * targetpop) anes_des &lt;- anes_adjwgt %&gt;% as_survey_design( weights = Weight, strata = Stratum, ids = VarUnit, nest = TRUE ) recs_des &lt;- recs_2020 %&gt;% as_survey_rep( weights = NWEIGHT, repweights = NWEIGHT1:NWEIGHT60, type = &quot;JK1&quot;, scale = 59/60, mse = TRUE ) inc_series &lt;- ncvs_2021_incident %&gt;% mutate( series = case_when(V4017 %in% c(1, 8) ~ 1, V4018 %in% c(2, 8) ~ 1, V4019 %in% c(1, 8) ~ 1, TRUE ~ 2 ), n10v4016 = case_when(V4016 %in% c(997, 998) ~ NA_real_, V4016 &gt; 10 ~ 10, TRUE ~ V4016), serieswgt = case_when(series == 2 &amp; is.na(n10v4016) ~ 6, series == 2 ~ n10v4016, TRUE ~ 1), NEWWGT = WGTVICCY * serieswgt ) inc_ind &lt;- inc_series %&gt;% filter(V4022 != 1) %&gt;% mutate( WeapCat = case_when( is.na(V4049) ~ NA_character_, V4049 == 2 ~ &quot;NoWeap&quot;, V4049 == 3 ~ &quot;UnkWeapUse&quot;, V4050 == 3 ~ 
&quot;Other&quot;, V4051 == 1 | V4052 == 1 | V4050 == 7 ~ &quot;Firearm&quot;, V4053 == 1 | V4054 == 1 ~ &quot;Knife&quot;, TRUE ~ &quot;Other&quot; ), V4529_num = parse_number(as.character(V4529)), ReportPolice = V4399 == 1, Property = V4529_num &gt;= 31, Violent = V4529_num &lt;= 20, Property_ReportPolice = Property &amp; ReportPolice, Violent_ReportPolice = Violent &amp; ReportPolice, AAST = V4529_num %in% 11:13, AAST_NoWeap = AAST &amp; WeapCat == &quot;NoWeap&quot;, AAST_Firearm = AAST &amp; WeapCat == &quot;Firearm&quot;, AAST_Knife = AAST &amp; WeapCat == &quot;Knife&quot;, AAST_Other = AAST &amp; WeapCat == &quot;Other&quot; ) inc_hh_sums &lt;- inc_ind %&gt;% filter(V4529_num &gt; 23) %&gt;% # restrict to household crimes group_by(YEARQ, IDHH) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(starts_with(&quot;Property&quot;), ~ sum(. * serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) inc_pers_sums &lt;- inc_ind %&gt;% filter(V4529_num &lt;= 23) %&gt;% # restrict to person crimes group_by(YEARQ, IDHH, IDPER) %&gt;% summarize(WGTVICCY = WGTVICCY[1], across(c(starts_with(&quot;Violent&quot;), starts_with(&quot;AAST&quot;)), ~ sum(. 
* serieswgt), .names = &quot;{.col}&quot;), .groups = &quot;drop&quot;) hh_z_list &lt;- rep(0, ncol(inc_hh_sums) - 3) %&gt;% as.list() %&gt;% setNames(names(inc_hh_sums)[-(1:3)]) pers_z_list &lt;- rep(0, ncol(inc_pers_sums) - 4) %&gt;% as.list() %&gt;% setNames(names(inc_pers_sums)[-(1:4)]) hh_vsum &lt;- ncvs_2021_household %&gt;% full_join(inc_hh_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) %&gt;% replace_na(hh_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTHHCY)) pers_vsum &lt;- ncvs_2021_person %&gt;% full_join(inc_pers_sums, by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% replace_na(pers_z_list) %&gt;% mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY)) hh_vsum_der &lt;- hh_vsum %&gt;% mutate( Tenure = factor(case_when(V2015 == 1 ~ &quot;Owned&quot;, !is.na(V2015) ~ &quot;Rented&quot;), levels = c(&quot;Owned&quot;, &quot;Rented&quot;)), Urbanicity = factor(case_when(V2143 == 1 ~ &quot;Urban&quot;, V2143 == 2 ~ &quot;Suburban&quot;, V2143 == 3 ~ &quot;Rural&quot;), levels = c(&quot;Urban&quot;, &quot;Suburban&quot;, &quot;Rural&quot;)), SC214A_num = as.numeric(as.character(SC214A)), Income = case_when(SC214A_num &lt;= 8 ~ &quot;Less than $25,000&quot;, SC214A_num &lt;= 12 ~ &quot;$25,000-49,999&quot;, SC214A_num &lt;= 15 ~ &quot;$50,000-99,999&quot;, SC214A_num &lt;= 17 ~ &quot;$100,000-199,999&quot;, SC214A_num &lt;= 18 ~ &quot;$200,000 or more&quot;), Income = fct_reorder(Income, SC214A_num, .na_rm = FALSE), PlaceSize = case_match(as.numeric(as.character(V2126B)), 0 ~ &quot;Not in a place&quot;, 13 ~ &quot;Under 10,000&quot;, 16 ~ &quot;10,000-49,999&quot;, 17 ~ &quot;50,000-99,999&quot;, 18 ~ &quot;100,000-249,999&quot;, 19 ~ &quot;250,000-499,999&quot;, 20 ~ &quot;500,000-999,999&quot;, c(21, 22, 23) ~ &quot;1,000,000 or more&quot;), PlaceSize = fct_reorder(PlaceSize, as.numeric(V2126B)), Region = case_match(as.numeric(V2127B), 1 ~ &quot;Northeast&quot;, 2 ~ &quot;Midwest&quot;, 3 ~ 
&quot;South&quot;, 4 ~ &quot;West&quot;), Region = fct_reorder(Region, as.numeric(V2127B)) ) NHOPI &lt;- &quot;Native Hawaiian or Other Pacific Islander&quot; pers_vsum_der &lt;- pers_vsum %&gt;% mutate( Sex = factor(case_when(V3018 == 1 ~ &quot;Male&quot;, V3018 == 2 ~ &quot;Female&quot;)), RaceHispOrigin = factor(case_when(V3024 == 1 ~ &quot;Hispanic&quot;, V3023A == 1 ~ &quot;White&quot;, V3023A == 2 ~ &quot;Black&quot;, V3023A == 4 ~ &quot;Asian&quot;, V3023A == 5 ~ NHOPI, TRUE ~ &quot;Other&quot;), levels = c(&quot;White&quot;, &quot;Black&quot;, &quot;Hispanic&quot;, &quot;Asian&quot;, NHOPI, &quot;Other&quot;)), V3014_num = as.numeric(as.character(V3014)), AgeGroup = case_when(V3014_num &lt;= 17 ~ &quot;12-17&quot;, V3014_num &lt;= 24 ~ &quot;18-24&quot;, V3014_num &lt;= 34 ~ &quot;25-34&quot;, V3014_num &lt;= 49 ~ &quot;35-49&quot;, V3014_num &lt;= 64 ~ &quot;50-64&quot;, V3014_num &lt;= 90 ~ &quot;65 or older&quot;), AgeGroup = fct_reorder(AgeGroup, V3014_num), MaritalStatus = factor(case_when(V3015 == 1 ~ &quot;Married&quot;, V3015 == 2 ~ &quot;Widowed&quot;, V3015 == 3 ~ &quot;Divorced&quot;, V3015 == 4 ~ &quot;Separated&quot;, V3015 == 5 ~ &quot;Never married&quot;), levels = c(&quot;Never married&quot;, &quot;Married&quot;, &quot;Widowed&quot;,&quot;Divorced&quot;, &quot;Separated&quot;)) ) %&gt;% left_join(hh_vsum_der %&gt;% select(YEARQ, IDHH, V2117, V2118, Tenure:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;)) hh_vsum_slim &lt;- hh_vsum_der %&gt;% select(YEARQ:V2118, WGTVICCY:ADJINC_WT, Tenure, Urbanicity, Income, PlaceSize, Region) pers_vsum_slim &lt;- pers_vsum_der %&gt;% select(YEARQ:WGTPERCY, WGTVICCY:ADJINC_WT, Sex:Region) dummy_records &lt;- hh_vsum_slim %&gt;% distinct(V2117, V2118) %&gt;% mutate(Dummy = 1, WGTVICCY = 1, NEWWGT = 1) inc_analysis &lt;- inc_ind %&gt;% mutate(Dummy = 0) %&gt;% left_join(select(pers_vsum_slim, YEARQ, IDHH, IDPER, Sex:Region), by = c(&quot;YEARQ&quot;, &quot;IDHH&quot;, &quot;IDPER&quot;)) %&gt;% 
bind_rows(dummy_records) %&gt;% select(YEARQ:IDPER, WGTVICCY, NEWWGT, V4529, WeapCat, ReportPolice, Property:Region) inc_des &lt;- inc_analysis %&gt;% as_survey( weight = NEWWGT, strata = V2117, ids = V2118, nest = TRUE ) hh_des &lt;- hh_vsum_slim %&gt;% as_survey( weight = WGTHHCY, strata = V2117, ids = V2118, nest = TRUE ) pers_des &lt;- pers_vsum_slim %&gt;% as_survey( weight = WGTPERCY, strata = V2117, ids = V2118, nest = TRUE ) 5 - Descriptive analysis How many females have a graduate degree? Hint: the variables Gender and Education will be useful. # Option 1: femgd_option1 &lt;- anes_des %&gt;% filter(Gender == &quot;Female&quot;, Education == &quot;Graduate&quot;) %&gt;% survey_count(name = &quot;n&quot;) femgd_option1 ## # A tibble: 1 × 2 ## n n_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 15072196. 837872. # Option 2: femgd_option2 &lt;- anes_des %&gt;% filter(Gender == &quot;Female&quot;, Education == &quot;Graduate&quot;) %&gt;% summarize(N = survey_total(), .groups = &quot;drop&quot;) femgd_option2 ## # A tibble: 1 × 2 ## N N_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 15072196. 837872. Answer: 15,072,196 What percentage of people identify as “Strong Democrat”? Hint: The variable PartyID indicates someone’s party affiliation. psd &lt;- anes_des %&gt;% group_by(PartyID) %&gt;% summarize(p = survey_mean()) %&gt;% filter(PartyID == &quot;Strong democrat&quot;) psd ## # A tibble: 1 × 3 ## PartyID p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Strong democrat 0.219 0.00646 Answer: 21.9% What percentage of people who voted in the 2020 election identify as “Strong Republican”? Hint: The variable VotedPres2020 indicates whether someone voted in 2020. 
psr &lt;- anes_des %&gt;% filter(VotedPres2020 == &quot;Yes&quot;) %&gt;% group_by(PartyID) %&gt;% summarize(p = survey_mean()) %&gt;% filter(PartyID == &quot;Strong republican&quot;) psr ## # A tibble: 1 × 3 ## PartyID p p_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Strong republican 0.228 0.00824 Answer: 22.8% What percentage of people voted in both the 2016 election and the 2020 election? Include the logit confidence interval. Hint: The variable VotedPres2016 indicates whether someone voted in 2016. pvb &lt;- anes_des %&gt;% filter(!is.na(VotedPres2016), !is.na(VotedPres2020)) %&gt;% group_by(interact(VotedPres2016, VotedPres2020)) %&gt;% summarize(p = survey_prop(vartype = &quot;ci&quot;, prop_method = &quot;logit&quot;)) %&gt;% filter(VotedPres2016 == &quot;Yes&quot;, VotedPres2020 == &quot;Yes&quot;) pvb ## # A tibble: 1 × 5 ## VotedPres2016 VotedPres2020 p p_low p_upp ## &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Yes Yes 0.794 0.777 0.810 Answer: 79.4% with confidence interval (77.7%, 81.0%) What is the design effect for the proportion of people who voted early? Hint: The variable EarlyVote2020 indicates whether someone voted early in 2020. pdeff &lt;- anes_des %&gt;% filter(!is.na(EarlyVote2020)) %&gt;% group_by(EarlyVote2020) %&gt;% summarize(p = survey_mean(deff = TRUE)) %&gt;% filter(EarlyVote2020 == &quot;Yes&quot;) pdeff ## # A tibble: 1 × 4 ## EarlyVote2020 p p_se p_deff ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Yes 0.726 0.0247 1.50 Answer: 1.5 What is the median temperature people set their thermostats to at night during the winter? Hint: The variable WinterTempNight indicates the temperature that people set their thermostats to at night in the winter. 
med_wintertempnight &lt;- recs_des %&gt;% summarize(wtn_med = survey_median(x = WinterTempNight, na.rm = TRUE)) med_wintertempnight ## # A tibble: 1 × 2 ## wtn_med wtn_med_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 68 0.250 Answer: 68 People sometimes set their thermostats differently across seasons and times of day. What median temperatures do people set their thermostats to in the summer and winter, both during the day and at night? Include confidence intervals. Hint: Use the variables WinterTempDay, WinterTempNight, SummerTempDay, and SummerTempNight. # Option 1 med_temps &lt;- recs_des %&gt;% summarize( across(c(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), ~survey_median(.x, na.rm=TRUE)) ) med_temps ## # A tibble: 1 × 8 ## WinterTempDay WinterTempDay_se WinterTempNight WinterTempNight_se ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 70 0.250 68 0.250 ## # ℹ 4 more variables: SummerTempDay &lt;dbl&gt;, SummerTempDay_se &lt;dbl&gt;, ## # SummerTempNight &lt;dbl&gt;, SummerTempNight_se &lt;dbl&gt; # Option 2: alternatively, use `survey_quantile()` with quantiles = 0.5, as shown below: quant_temps &lt;- recs_des %&gt;% summarize( across(c(WinterTempDay, WinterTempNight, SummerTempDay, SummerTempNight), ~survey_quantile(.x, quantiles=0.5, na.rm=TRUE)) ) quant_temps ## # A tibble: 1 × 8 ## WinterTempDay_q50 WinterTempDay_q50_se WinterTempNight_q50 ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 70 0.250 68 ## # ℹ 5 more variables: WinterTempNight_q50_se &lt;dbl&gt;, ## # SummerTempDay_q50 &lt;dbl&gt;, SummerTempDay_q50_se &lt;dbl&gt;, ## # SummerTempNight_q50 &lt;dbl&gt;, SummerTempNight_q50_se &lt;dbl&gt; Answer: - Winter during the day: 70 - Winter during the night: 68 - Summer during the day: 72 - Summer during the night: 72 What is the correlation between the temperatures that people set their thermostats to at night and during the day in the summer? 
corr_summer_temp &lt;- recs_des %&gt;% summarize(summer_corr = survey_corr(SummerTempNight, SummerTempDay, na.rm = TRUE)) corr_summer_temp ## # A tibble: 1 × 2 ## summer_corr summer_corr_se ## &lt;dbl&gt; &lt;dbl&gt; ## 1 0.806 0.00806 Answer: 0.806 What are the 1st, 2nd, and 3rd quartiles of the amount of money spent on energy by Building America (BA) climate zone? Hint: TOTALDOL indicates the total amount spent on all fuel, and ClimateRegion_BA indicates the BA climate zones. quant_baenergyexp &lt;- recs_des %&gt;% group_by(ClimateRegion_BA) %&gt;% summarize(dol_quant = survey_quantile( TOTALDOL, quantiles = c(0.25, 0.5, 0.75), vartype = &quot;se&quot;, na.rm = TRUE )) quant_baenergyexp ## # A tibble: 8 × 7 ## ClimateRegion_BA dol_quant_q25 dol_quant_q50 dol_quant_q75 ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Mixed-Dry 1091. 1541. 2139. ## 2 Mixed-Humid 1317. 1840. 2462. ## 3 Hot-Humid 1094. 1622. 2233. ## 4 Hot-Dry 926. 1513. 2223. ## 5 Very-Cold 1195. 1986. 2955. ## 6 Cold 1213. 1756. 2422. ## 7 Marine 938. 1380. 1987. ## 8 Subarctic 2404. 3535. 5219. 
## # ℹ 3 more variables: dol_quant_q25_se &lt;dbl&gt;, dol_quant_q50_se &lt;dbl&gt;, ## # dol_quant_q75_se &lt;dbl&gt; Answer: Quartile summary of energy expenditure by BA Climate Zone Q1 Q2 Q3 Mixed-Dry $1,091 $1,541 $2,139 Mixed-Humid $1,317 $1,840 $2,462 Hot-Humid $1,094 $1,622 $2,233 Hot-Dry $926 $1,513 $2,223 Very-Cold $1,195 $1,986 $2,955 Cold $1,213 $1,756 $2,422 Marine $938 $1,380 $1,987 Subarctic $2,404 $3,535 $5,219 6 - Statistical testing Using the RECS data, do more than 50% of U.S. households use A/C (ACUsed)? ttest_solution1 &lt;- recs_des %&gt;% svyttest(design = ., formula = ((ACUsed == TRUE) - 0.5) ~ 0, na.rm = TRUE, alternative=&quot;greater&quot;) %&gt;% tidy() ttest_solution1 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 0.387 126. 1.73e-72 58 0.380 0.393 Design-based… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: 88.7% of households use air-conditioning which is significantly different from 50% (p&lt;0.0001) so there is strong evidence that more than 50% of households use air-conditioning. Using the RECS data, does the average temperature that U.S. households set their thermostats to differ between the day and night in the winter (WinterTempDay and WinterTempNight)? 
ttest_solution2 &lt;- recs_des %&gt;% svyttest( design = ., formula = WinterTempDay - WinterTempNight ~ 0, na.rm = TRUE ) %&gt;% tidy() ttest_solution2 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 1.67 45.9 2.82e-47 58 1.59 1.74 Design-based… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: The average difference between daytime and nighttime thermostat settings during the winter is 1.67 degrees, which is significantly different from 0 (p&lt;0.0001), so there is strong evidence that the temperature setting differs between night and day during the winter. Using the ANES data, does the average age (Age) of those who voted for Joseph Biden in 2020 (VotedPres2020_selection) differ from those who voted for another candidate? ttest_solution3 &lt;- anes_des %&gt;% filter(!is.na(VotedPres2020_selection)) %&gt;% svyttest( design = ., formula = Age ~ VotedPres2020_selection == &quot;Biden&quot;, na.rm = TRUE ) %&gt;% tidy() ttest_solution3 ## # A tibble: 1 × 8 ## estimate statistic p.value parameter conf.low conf.high method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 -3.60 -5.97 0.000000244 50 -4.81 -2.39 Design-ba… ## # ℹ 1 more variable: alternative &lt;chr&gt; Answer: On average, those who voted for Joseph Biden in 2020 were 3.6 years younger than voters for other candidates, and this difference is significant (p&lt;0.0001). If we wanted to determine if the political party affiliation differed for males and females, what test would we use? a. Goodness of fit test (svygofchisq()) b. Test of independence (svychisq()) c. Test of homogeneity (svychisq()) Answer: c. Test of homogeneity (svychisq()) In the RECS data, is there a relationship between the type of housing unit (HousingUnitType) and the year the house was built (YearMade)? 
chisq_solution2 &lt;- recs_des %&gt;% svychisq( formula = ~ HousingUnitType + YearMade, design = ., statistic = &quot;Wald&quot;, na.rm = TRUE ) chisq_solution2 %&gt;% tidy() ## Multiple parameters; naming those columns ndf, ddf ## # A tibble: 1 × 5 ## ndf ddf statistic p.value method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 32 59 67.9 5.54e-36 Design-based Wald test of association Answer: There is strong evidence (p&lt;0.0001) that there is a relationship between type of housing unit and the year the house was built. In the ANES data, is there a difference in the distribution of gender (Gender) across early voting status in 2020 (EarlyVote2020)? chisq_solution3 &lt;- anes_des %&gt;% svychisq( formula = ~ Gender + EarlyVote2020, design = ., statistic = &quot;F&quot;, na.rm = TRUE ) %&gt;% tidy() ## Multiple parameters; naming those columns ndf, ddf chisq_solution3 ## # A tibble: 1 × 5 ## ndf ddf statistic p.value method ## &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 1 51 4.53 0.0381 Pearson&#39;s X^2: Rao &amp; Scott adjustment Answer: There is strong evidence that there is a difference in the distribution of gender by early voting status (p=0.0381). 7 - Modeling The type of housing unit may have an impact on energy expenses. Is there any relationship between housing unit type (HousingUnitType) and total energy expenditure (TOTALDOL)? First, find the average energy expenditure by housing unit type as a descriptive analysis and then do the test. The reference level in the comparison should be the housing unit type that is most common. expense_by_hut &lt;- recs_des %&gt;% group_by(HousingUnitType) %&gt;% summarize(Expense = survey_mean(TOTALDOL, na.rm = TRUE), HUs = survey_total()) %&gt;% arrange(desc(HUs)) expense_by_hut ## # A tibble: 5 × 5 ## HousingUnitType Expense Expense_se HUs HUs_se ## &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 Single-family detached 2205. 9.36 77067692. 
0.00000277 ## 2 Apartment: 5 or more units 1108. 13.7 22835862. 0.000000226 ## 3 Apartment: 2-4 Units 1407. 24.2 9341795. 0.119 ## 4 Single-family attached 1653. 22.3 7451177. 0.114 ## 5 Mobile home 1773. 26.2 6832499. 0.0000000927 exp_unit_out &lt;- recs_des %&gt;% mutate(HousingUnitType = fct_infreq(HousingUnitType, NWEIGHT)) %&gt;% svyglm( design = ., formula = TOTALDOL ~ HousingUnitType, na.action = na.omit ) tidy(exp_unit_out) ## # A tibble: 5 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 (Intercept) 2205. 9.36 236. 2.53e-84 ## 2 HousingUnitTypeApartment: 5 or … -1097. 16.5 -66.3 3.52e-54 ## 3 HousingUnitTypeApartment: 2-4 U… -798. 28.0 -28.5 1.37e-34 ## 4 HousingUnitTypeSingle-family at… -551. 25.0 -22.1 5.28e-29 ## 5 HousingUnitTypeMobile home -431. 27.4 -15.7 5.36e-22 Answer: The reference level should be Single-family detached. All p-values are very small, indicating that there is a significant relationship between housing unit type and total energy expenditure. Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F or below, CDD65=0, while a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer (U.S. Energy Information Administration 2023d). These daily values are summed over the year to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can electricity expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions. 
temps_sqft_exp &lt;- recs_des %&gt;% svyglm( design = ., formula = DOLLAREL ~ (TOTSQFT_EN + CDD65 + HDD65) ^ 2, na.action = na.omit ) tidy(temps_sqft_exp) %&gt;% mutate(p.value=pretty_p_value(p.value) %&gt;% str_pad(7)) ## # A tibble: 7 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt; ## 1 (Intercept) 741. 70.5 10.5 &quot;&lt;0.0001&quot; ## 2 TOTSQFT_EN 0.272 0.0471 5.77 &quot;&lt;0.0001&quot; ## 3 CDD65 0.0293 0.0227 1.29 &quot; 0.2024&quot; ## 4 HDD65 -0.00111 0.0104 -0.107 &quot; 0.9149&quot; ## 5 TOTSQFT_EN:CDD65 0.0000459 0.0000154 2.97 &quot; 0.0044&quot; ## 6 TOTSQFT_EN:HDD65 -0.00000840 0.00000633 -1.33 &quot; 0.1902&quot; ## 7 CDD65:HDD65 0.00000533 0.00000355 1.50 &quot; 0.1390&quot; Answer: There is a significant interaction between square footage and cooling degree days in the model, and square footage is a significant predictor of electricity expenditure. Continuing with our results from question 2, create a plot between the actual and predicted expenditures and a residual plot for the predicted expenditures. Answer: temps_sqft_exp_fit &lt;- temps_sqft_exp %&gt;% augment() %&gt;% mutate(.se.fit = sqrt(attr(.fitted, &quot;var&quot;)), # extract the variance of the fitted value .fitted = as.numeric(.fitted)) temps_sqft_exp_fit %&gt;% ggplot(aes(x = DOLLAREL, y = .fitted)) + geom_point() + geom_abline(intercept = 0, slope = 1, color = &quot;red&quot;) + xlab(&quot;Actual expenditures&quot;) + ylab(&quot;Predicted expenditures&quot;) + theme_minimal() FIGURE D.1: Actual and predicted electricity expenditures temps_sqft_exp_fit %&gt;% ggplot(aes(x = .fitted, y = .resid)) + geom_point() + geom_hline(yintercept = 0, color = &quot;red&quot;) + xlab(&quot;Predicted expenditure&quot;) + ylab(&quot;Residual value of expenditure&quot;) + theme_minimal() FIGURE D.2: Residual plot of electric cost model with covariates TOTSQFT_EN, CDD65, and HDD65 Early voting expanded in 2020 (Sprunt 2020). 
Build a logistic model predicting early voting in 2020 (EarlyVote2020) using age (Age), education (Education), and party identification (PartyID). Include two-way interactions. Answer: earlyvote_mod &lt;- anes_des %&gt;% filter(!is.na(EarlyVote2020)) %&gt;% svyglm( design = ., formula = EarlyVote2020 ~ (Age + Education + PartyID) ^ 2, family = quasibinomial ) tidy(earlyvote_mod) %&gt;% print(n=50) ## # A tibble: 46 × 5 ## term estimate std.error statistic p.value ## &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; ## 1 (Intercept) 3.28e-1 3.86 0.0848 0.940 ## 2 Age -2.20e-2 0.0579 -0.379 0.741 ## 3 EducationHigh school -2.56e+0 3.89 -0.658 0.578 ## 4 EducationPost HS -3.27e+0 3.97 -0.823 0.497 ## 5 EducationBachelor&#39;s -3.29e+0 3.91 -0.842 0.489 ## 6 EducationGraduate -1.36e+0 3.91 -0.349 0.761 ## 7 PartyIDNot very strong democrat 2.00e+0 3.30 0.605 0.607 ## 8 PartyIDIndependent-democrat 3.38e+0 2.60 1.30 0.323 ## 9 PartyIDIndependent 5.22e+0 2.25 2.32 0.146 ## 10 PartyIDIndependent-republican -1.95e+1 2.42 -8.09 0.0149 ## 11 PartyIDNot very strong republic… -1.33e+1 3.24 -4.10 0.0546 ## 12 PartyIDStrong republican 3.13e+0 2.18 1.44 0.287 ## 13 Age:EducationHigh school 4.72e-2 0.0592 0.796 0.509 ## 14 Age:EducationPost HS 5.25e-2 0.0588 0.892 0.467 ## 15 Age:EducationBachelor&#39;s 4.76e-2 0.0600 0.793 0.511 ## 16 Age:EducationGraduate 8.65e-3 0.0578 0.150 0.895 ## 17 Age:PartyIDNot very strong demo… -2.28e-2 0.0497 -0.459 0.691 ## 18 Age:PartyIDIndependent-democrat -7.03e-2 0.0285 -2.46 0.133 ## 19 Age:PartyIDIndependent -8.00e-2 0.0302 -2.65 0.118 ## 20 Age:PartyIDIndependent-republic… 6.72e-2 0.0378 1.78 0.217 ## 21 Age:PartyIDNot very strong repu… -3.07e-2 0.0420 -0.732 0.540 ## 22 Age:PartyIDStrong republican -3.84e-2 0.0180 -2.14 0.166 ## 23 EducationHigh school:PartyIDNot… -1.24e+0 2.22 -0.557 0.633 ## 24 EducationPost HS:PartyIDNot ver… -8.95e-1 2.16 -0.413 0.719 ## 25 EducationBachelor&#39;s:PartyIDNot … -1.21e+0 2.29 -0.528 0.650 ## 26 
EducationGraduate:PartyIDNot ve… -1.90e+0 2.25 -0.844 0.487 ## 27 EducationHigh school:PartyIDInd… 7.84e-1 2.50 0.314 0.783 ## 28 EducationPost HS:PartyIDIndepen… 4.04e-1 2.31 0.175 0.877 ## 29 EducationBachelor&#39;s:PartyIDInde… 5.00e-1 2.60 0.193 0.865 ## 30 EducationGraduate:PartyIDIndepe… -1.48e+1 2.47 -5.99 0.0268 ## 31 EducationHigh school:PartyIDInd… -6.32e-1 1.72 -0.368 0.748 ## 32 EducationPost HS:PartyIDIndepen… -9.27e-2 1.63 -0.0568 0.960 ## 33 EducationBachelor&#39;s:PartyIDInde… -2.62e-1 2.13 -0.123 0.913 ## 34 EducationGraduate:PartyIDIndepe… -1.42e+1 1.75 -8.12 0.0148 ## 35 EducationHigh school:PartyIDInd… 1.55e+1 2.56 6.05 0.0262 ## 36 EducationPost HS:PartyIDIndepen… 1.48e+1 2.77 5.34 0.0333 ## 37 EducationBachelor&#39;s:PartyIDInde… 1.77e+1 2.32 7.64 0.0167 ## 38 EducationGraduate:PartyIDIndepe… 1.65e+1 2.33 7.10 0.0193 ## 39 EducationHigh school:PartyIDNot… 1.59e+1 2.02 7.88 0.0157 ## 40 EducationPost HS:PartyIDNot ver… 1.62e+1 1.69 9.54 0.0108 ## 41 EducationBachelor&#39;s:PartyIDNot … 1.58e+1 1.93 8.18 0.0146 ## 42 EducationGraduate:PartyIDNot ve… 1.54e+1 1.72 8.95 0.0123 ## 43 EducationHigh school:PartyIDStr… -2.06e+0 1.88 -1.10 0.387 ## 44 EducationPost HS:PartyIDStrong … 9.17e-2 2.01 0.0456 0.968 ## 45 EducationBachelor&#39;s:PartyIDStro… 6.87e-2 2.06 0.0333 0.976 ## 46 EducationGraduate:PartyIDStrong… -8.53e-1 1.81 -0.471 0.684 Continuing from Exercise 4, predict the probability of early voting for two people. Both are 28 years old and have a graduate degree, but one person is a strong Democrat, and the other is a strong Republican. 
add_vote_dat &lt;- anes_2020 %&gt;%
  select(EarlyVote2020, Age, Education, PartyID) %&gt;%
  rbind(tibble(
    EarlyVote2020 = NA,
    Age = 28,
    Education = &quot;Graduate&quot;,
    PartyID = c(&quot;Strong democrat&quot;, &quot;Strong republican&quot;)
  )) %&gt;%
  tail(2)

log_ex_2_out &lt;- earlyvote_mod %&gt;%
  augment(newdata = add_vote_dat, type.predict = &quot;response&quot;) %&gt;%
  mutate(
    # extract the variance of the fitted value; its square root is the standard error
    .se.fit = sqrt(attr(.fitted, &quot;var&quot;)),
    .fitted = as.numeric(.fitted)
  )

log_ex_2_out

## # A tibble: 2 × 6
## EarlyVote2020 Age Education PartyID .fitted .se.fit
## &lt;fct&gt; &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt; &lt;dbl&gt;
## 1 &lt;NA&gt; 28 Graduate Strong democrat 0.197 0.150
## 2 &lt;NA&gt; 28 Graduate Strong republican 0.450 0.244

Answer: We predict that a 28-year-old with a graduate degree who identifies as a strong Democrat has a 19.7% probability of voting early, while an otherwise similar person who identifies as a strong Republican has a 45.0% probability of voting early.

10 - Specifying sample designs and replicate weights in {srvyr}

The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS covers a wide variety of health topics for adults, including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the NHIS uses a stratified clustered sampling design; details are included in the Survey Description (National Center for Health Statistics 2023). The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). We have imported the data into an object named nhis_adult_data. How would we specify the design using either as_survey_design() or as_survey_rep()?
Answer:

nhis_adult_des &lt;- nhis_adult_data %&gt;%
  as_survey_design(
    ids = PPSU,
    strata = PSTRAT,
    nest = TRUE,
    weights = WTFA_A
  )

The General Social Survey (GSS) is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R (Davern et al. 2021). We have imported the data into an object named gss_data. How would we specify the design in R using either as_survey_design() or as_survey_rep()?

Answer:

gss_des &lt;- gss_data %&gt;%
  as_survey_design(
    ids = VPSU_2,
    strata = VSTRAT_2,
    weights = WTSSNR_2
  )

13 - National Crime Victimization Survey Vignette

What proportion of completed motor vehicle thefts are not reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).

ans1 &lt;- inc_des %&gt;%
  filter(str_detect(V4529, &quot;40|41&quot;)) %&gt;%
  summarize(Pct = survey_mean(!ReportPolice, na.rm = TRUE) * 100)

Answer: It is estimated that 23.1% of motor vehicle thefts are not reported to the police.

How many violent crimes occur in each region?
Answer:

inc_des %&gt;%
  filter(Violent) %&gt;%
  survey_count(Region) %&gt;%
  select(-n_se) %&gt;%
  gt(rowname_col = &quot;Region&quot;) %&gt;%
  fmt_integer() %&gt;%
  cols_label(
    n = &quot;Violent victimizations&quot;,
  ) %&gt;%
  tab_header(&quot;Estimated number of violent crimes by region&quot;)

Estimated number of violent crimes by region

             Violent victimizations
Northeast    698,406
Midwest      1,144,407
South        1,394,214
West         1,361,278

What is the property victimization rate for each income level?

Answer:

hh_des %&gt;%
  filter(!is.na(Income)) %&gt;%
  group_by(Income) %&gt;%
  summarize(
    Property_Rate = survey_mean(Property * ADJINC_WT * 1000, na.rm = TRUE)
  ) %&gt;%
  gt(rowname_col = &quot;Income&quot;) %&gt;%
  cols_label(
    Property_Rate = &quot;Rate&quot;,
    Property_Rate_se = &quot;Standard Error&quot;
  ) %&gt;%
  fmt_number(decimals = 1) %&gt;%
  tab_header(&quot;Estimated property victimization rate by income level&quot;)

Estimated property victimization rate by income level

                     Rate     Standard Error
Less than $25,000    110.6    5.0
$25,000-49,999       89.5     3.4
$50,000-99,999       87.8     3.3
$100,000-199,999     76.5     3.5
$200,000 or more     91.8     5.7

What is the difference in the violent victimization
rate between males and females? Is the difference statistically significant?

vr_gender &lt;- pers_des %&gt;%
  group_by(Sex) %&gt;%
  summarize(
    Violent_rate = survey_mean(Violent * ADJINC_WT * 1000, na.rm = TRUE)
  )

vr_gender_test &lt;- pers_des %&gt;%
  mutate(
    Violent_Adj = Violent * ADJINC_WT * 1000
  ) %&gt;%
  svyttest(
    formula = Violent_Adj ~ Sex,
    design = .,
    na.rm = TRUE
  ) %&gt;%
  broom::tidy()

## Warning in summary.glm(g): observations with zero weight not used for
## calculating dispersion
## Warning in summary.glm(glm.object): observations with zero weight not
## used for calculating dispersion

Answer: The difference between the male and female violent victimization rates is estimated as 1.9 victimizations per 1,000 people and is not statistically significant (p-value = 0.1560).

14 - AmericasBarometer Vignette

Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet, in Latin America and the Caribbean. Hint: If there are countries with 0% internet usage, try filtering by something first.
Answer:

int_ests &lt;- ambarom_des %&gt;%
  filter(!is.na(Internet) | !is.na(BroadbandInternet)) %&gt;%
  group_by(Country) %&gt;%
  summarize(
    p_broadband = survey_mean(BroadbandInternet, na.rm = TRUE) * 100,
    p_internet = survey_mean(Internet, na.rm = TRUE) * 100
  )

int_ests %&gt;%
  gt(rowname_col = &quot;Country&quot;) %&gt;%
  fmt_number(decimals = 1) %&gt;%
  tab_spanner(
    label = &quot;Broadband at home&quot;,
    columns = c(p_broadband, p_broadband_se)
  ) %&gt;%
  tab_spanner(
    label = &quot;Internet at home&quot;,
    columns = c(p_internet, p_internet_se)
  ) %&gt;%
  cols_label(
    p_broadband = &quot;Percent&quot;,
    p_internet = &quot;Percent&quot;,
    p_broadband_se = &quot;S.E.&quot;,
    p_internet_se = &quot;S.E.&quot;,
  )

                      Broadband at home    Internet at home
                      Percent    S.E.      Percent    S.E.
Argentina             62.3       1.1       86.2       0.9
Bolivia               41.4       1.0       77.2       1.0
Brazil                68.3       1.2       88.9       0.9
Chile                 63.1       1.1       93.5       0.5
Colombia              45.7       1.2       68.7       1.1
Costa Rica            49.6       1.1       84.4       0.8
Dominican Republic    37.1       1.0       73.7       1.0
Ecuador               59.7       1.1       79.9       0.9
El Salvador           30.2       0.9       63.9       1.0
Guatemala             33.4       1.0       61.5       1.1
Guyana                63.7       1.1       86.8       0.8
Haiti                 11.8       0.8       58.5       1.2
Honduras              28.2       1.0       60.7       1.1
Jamaica               64.2       1.0       91.5       0.6
Mexico                44.9       1.1       70.9       1.0
Nicaragua             39.1       1.1       76.3       1.1
Panama                43.4       1.0       73.1       1.0
Paraguay              33.3       1.0       72.9       1.0
Peru                  42.4       1.1       71.1       1.1
Uruguay               62.7       1.1       90.6       0.7

Create a faceted map showing both broadband internet and any internet usage.
Answer:

library(sf)
library(rnaturalearth)
library(ggpattern)

internet_sf &lt;- country_shape_upd %&gt;%
  full_join(select(int_ests, p = p_internet, geounit = Country), by = &quot;geounit&quot;) %&gt;%
  mutate(Type = &quot;Internet&quot;)

broadband_sf &lt;- country_shape_upd %&gt;%
  full_join(select(int_ests, p = p_broadband, geounit = Country), by = &quot;geounit&quot;) %&gt;%
  mutate(Type = &quot;Broadband&quot;)

b_int_sf &lt;- internet_sf %&gt;%
  bind_rows(broadband_sf) %&gt;%
  filter(region_wb == &quot;Latin America &amp; Caribbean&quot;)

b_int_sf %&gt;%
  ggplot(aes(fill = p), color = &quot;darkgray&quot;) +
  geom_sf() +
  facet_wrap(~Type) +
  scale_fill_gradientn(
    guide = &quot;colorbar&quot;,
    name = &quot;Percent&quot;,
    labels = scales::comma,
    colors = c(&quot;#BFD7EA&quot;, &quot;#087E8B&quot;, &quot;#0B3954&quot;),
    na.value = NA
  ) +
  geom_sf_pattern(
    data = filter(b_int_sf, is.na(p)),
    pattern = &quot;crosshatch&quot;,
    pattern_fill = &quot;lightgray&quot;,
    pattern_color = &quot;lightgray&quot;,
    fill = NA,
    color = &quot;darkgray&quot;
  ) +
  theme_minimal()

FIGURE D.3: Percent of broadband internet and any internet usage, Central and South America

References

Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf.
National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf.
Sprunt, Barbara. 2020. “93 Million and Counting: Americans Are Shattering Early Voting Records.” National Public Radio.
———. 2023d. “Units and Calculators Explained: Degree Days.” https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php.
"],["references.html", "References", " References American National Election Studies. 2021. 
“ANES 2020 Time Series Study: Pre-Election and Post-Election Survey Questionnaires.” https://electionstudies.org/wp-content/uploads/2021/07/anes_timeseries_2020_questionnaire_20210719.pdf. ———. 2022. “ANES 2020 Time Series Study Full Release: User Guide and Codebook.” https://electionstudies.org/wp-content/uploads/2022/02/anes_timeseries_2020_userguidecodebook_20220210.pdf. Bache, Stefan Milton, and Hadley Wickham. 2022. magrittr: A Forward-Pipe Operator for R. Biemer, Paul P. 2010. “Total Survey Error: Design, Implementation, and Evaluation.” Public Opinion Quarterly 74 (5): 817–48. https://doi.org/10.1093/poq/nfq058. Biemer, Paul P., and Lars E. Lyberg. 2003. Introduction to Survey Quality. John Wiley &amp; Sons. Biemer, Paul P., Joe Murphy, Stephanie Zimmer, Chip Berry, Grace Deng, and Katie Lewis. 2017. “Using Bonus Monetary Incentives to Encourage Web Response in Mixed-Mode Household Surveys.” Journal of Survey Statistics and Methodology 6 (2): 240–61. https://doi.org/10.1093/jssam/smx015. Bollen, Kenneth A., Paul P. Biemer, Alan F. Karr, Stephen Tueller, and Marcus E. Berzofsky. 2016. “Are Survey Weights Needed? A Review of Diagnostic Tests in Regression Analysis.” Annual Review of Statistics and Its Application 3 (1): 375–92. https://doi.org/10.1146/annurev-statistics-011516-012958. Bradburn, Norman M., Seymour Sudman, and Brian Wansink. 2004. Asking Questions: The Definitive Guide to Questionnaire Design. 2nd Edition. Jossey-Bass. Bryan, Jenny. 2023. Happy Git and GitHub for the useR. https://happygitwithr.com/. Bureau of Justice Statistics. 2017. “National Crime Victimization Survey, 2016: Technical Documentation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvstd16.pdf. Centers for Disease Control and Prevention (CDC). 2021. “Behavioral Risk Factor Surveillance System Survey Questionnaire.” U.S. 
Department of Health and Human Services, Centers for Disease Control and Prevention; https://www.cdc.gov/brfss/questionnaires/pdf-ques/2021-BRFSS-Questionnaire-1-19-2022-508.pdf. Cochran, William G. 1977. Sampling Techniques. John Wiley &amp; Sons. Cox, Brenda G, David A Binder, B Nanjamma Chinnappa, Anders Christianson, Michael J Colledge, and Phillip S Kott. 2011. Business Survey Methods. John Wiley &amp; Sons. Csardi, Gabor. 2023. prettyunits: Pretty, Human Readable Formatting of Quantities. https://github.com/r-lib/prettyunits. Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. “General Social Survey 2016-2020 Panel Codebook.” Edited by Chicago NORC. https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf. DeBell, Matthew. 2010. “How to Analyze ANES Survey Data.” ANES Technical Report Series nes012492. Palo Alto, CA: Stanford University; Ann Arbor, MI: the University of Michigan; https://electionstudies.org/wp-content/uploads/2018/05/HowToAnalyzeANESData.pdf. DeBell, Matthew, Michelle Amsbary, Ted Brader, Shelley Brock, Cindy Good, Justin Kamens, Natalya Maisel, and Sarah Pinto. 2022. “Methodology Report for the ANES 2020 Time Series Study.” https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf. DeLeeuw, Edith D. 2005. “To Mix or Not to Mix Data Collection Modes in Surveys.” Journal of Official Statistics 21: 233–55. ———. 2018. “Mixed-Mode: Past, Present, and Future.” Survey Research Methods 12 (2): 75–89. https://doi.org/10.18148/srm/2018.v12i2.7402. Deming, W Edwards. 1991. Sample Design in Business Research. Vol. 23. John Wiley &amp; Sons. Dillman, Don A, Jolene D Smyth, and Leah Melani Christian. 2014. Internet, Phone, Mail, and Mixed-Mode Surveys: The Tailored Design Method. John Wiley &amp; Sons. FC, Mike, Trevor L Davis, and ggplot2 authors. 2022. ggpattern: ’ggplot2’ Pattern Geoms. Fowler, Floyd J, and Thomas W. Mangione. 1989. 
Standardized Survey Interviewing. SAGE. Freedman Ellis, Greg, and Ben Schneider. 2023. srvyr: ’dplyr’-Like Syntax for Summary Statistics of Survey Data. Fuller, Wayne A. 2011. Sampling Statistics. John Wiley &amp; Sons. Gard, Arianna M., Luke W. Hyde, Steven G. Heeringa, Brady T. West, and Colter Mitchell. 2023. “Why Weight? Analytic Approaches for Large-Scale Population Neuroscience Data.” Dev Cogn Neurosci. https://doi.org/10.1016/j.dcn.2023.101196. Gelman, Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (2): 153–64. https://doi.org/10.1214/088342306000000691. Groves, Robert M, Floyd J Fowler Jr, Mick P Couper, James M Lepkowski, Eleanor Singer, and Roger Tourangeau. 2009. Survey Methodology. John Wiley &amp; Sons. Harter, Rachel, Michael P Battaglia, Trent D Buskirk, Don A Dillman, Ned English, Mansour Fahimi, Martin R Frankel, et al. 2016. “Address-Based Sampling.” Task force report. American Association for Public Opinion Research; https://aapor.org/wp-content/uploads/2022/11/AAPOR_Report_1_7_16_CLEAN-COPY-FINAL-2.pdf. Henry, Lionel, and Hadley Wickham. 2022. tidyselect: Select from a Set of Strings. Iannone, Richard, Joe Cheng, Barret Schloerke, Ellis Hughes, Alexandra Lauer, and JooYoung Seo. 2023. gt: Easily Create Presentation-Ready Display Tables. Kim, Jae Kwang, and Jun Shao. 2021. Statistical Methods for Handling Incomplete Data. Chapman &amp; Hall/CRC Press. Landau, William Michael. 2021. “The targets R Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959. LAPOP. 2021a. “AmericasBarometer 2021 - Canada: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABCAN2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021b. 
“AmericasBarometer 2021 - U.S.: Technical Information.” Vanderbilt University; http://datasets.americasbarometer.org/database/files/ABUSA2021-Technical-Report-v1.0-FINAL-eng-110921.pdf. ———. 2021c. “AmericasBarometer 2021: Technical Information.” Vanderbilt University; https://www.vanderbilt.edu/lapop/ab2021/AB2021-Technical-Report-v1.0-FINAL-eng-030722.pdf. ———. 2021d. “Core Questionnaire.” https://www.vanderbilt.edu/lapop/ab2021/AB2021-Core-Questionnaire-v17.5-Eng-210514-W-v2.pdf. ———. 2023a. “About the AmericasBarometer.” https://www.vanderbilt.edu/lapop/about-americasbarometer.php. ———. 2023b. “The AmericasBarometer by the LAPOP Lab.” www.vanderbilt.edu/lapop. Larmarange, Joseph. 2023. labelled: Manipulating Labelled Data. https://larmarange.github.io/labelled/. Levy, Paul S, and Stanley Lemeshow. 2013. Sampling of Populations: Methods and Applications. John Wiley &amp; Sons. Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley &amp; Sons. Mack, Christina, Zhaohui Su, and Daniel Westreich. 2018. “Types of Missing Data.” In Managing Missing Data in Patient Registries: Addendum to Registries for Evaluating Patient Outcomes: A User’s Guide, Third Edition [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); https://www.ncbi.nlm.nih.gov/books/NBK493614/. Massicotte, Philippe, and Andy South. 2023. rnaturalearth: World Map Data from Natural Earth. https://docs.ropensci.org/rnaturalearth/ https://github.com/ropensci/rnaturalearth. McCullagh, Peter, and John Ashworth Nelder. 1989. “Binary Data.” In Generalized Linear Models, 98–148. Springer. Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. National Center for Health Statistics. 2023. “National Health Interview Survey, 2022 survey description.” https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHIS/2022/srvydesc-508.pdf. Ooms, Jeroen. 2014. 
“The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [Stat.CO]. https://arxiv.org/abs/1403.2805. Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With Applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016. Penn State. 2019. “STAT 506: Sampling Theory and Methods [Online Course].” https://online.stat.psu.edu/stat506/. R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/. Recht, Hannah. 2024. censusapi: Retrieve Data from the Census APIs. Robinson, David, Alex Hayes, and Simon Couch. 2023. broom: Convert Statistical Objects into Tidy Tibbles. Särndal, Carl-Erik, Bengt Swensson, and Jan Wretman. 2003. Model Assisted Survey Sampling. Springer Science &amp; Business Media. Schafer, Joseph L, and John W Graham. 2002. “Missing Data: Our View of the State of the Art.” Psychological Methods 7: 147–77. https://doi.org/10.1037//1082-989X.7.2.147. Schouten, Barry, Andy Peytchev, and James Wagner. 2018. Adaptive Survey Design. Chapman &amp; Hall/CRC Press. Scott, Alastair. 2007. “Rao-Scott Corrections and Their Impact.” In Section on Survey Research Methods, 3514–18. http://www.asasrms.org/Proceedings/y2007/Files/JSM2007-000874.pdf. Shah, Babubhai V, and Akhil K Vaish. 2006. “Confidence Intervals for Quantile Estimation from Complex Survey Data.” In Proceedings of the Section on Survey Research Methods. http://www.asasrms.org/Proceedings/y2006/Files/JSM2006-000749.pdf. Shook-Sa, Bonnie, G. Lance Couzens, and Marcus Berzofsky. 2015. “Users’ Guide to the National Crime Victimization Survey (NCVS) Direct Variance Estimation.” https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf; Bureau of Justice Statistics. Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. 
“Reproducible Summary Tables with the gtsummary Package.” The R Journal 13: 570–80. https://doi.org/10.32614/RJ-2021-053. Skinner, Chris. 2009. “Chapter 15: Statistical Disclosure Control for Survey Data.” In Handbook of Statistics: Sample Surveys: Design, Methods and Applications, edited by C. R. Rao, 381–96. Elsevier B.V. Sprunt, Barbara. 2020. “93 Million and Counting: Americans Are Shattering Early Voting Records.” National Public Radio. Tierney, Nicholas. 2017. “visdat: Visualising Whole Data Frames.” Journal of Open Source Software 2 (16): 355. https://doi.org/10.21105/joss.00355. Tierney, Nicholas, and Dianne Cook. 2023. “Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations.” Journal of Statistical Software 105 (7): 1–31. https://doi.org/10.18637/jss.v105.i07. Tourangeau, Roger, Mick P. Couper, and Frederick Conrad. 2004. “Spacing, Position, and Order: Interpretive Heuristics for Visual Features of Survey Questions.” Public Opinion Quarterly 68: 368–93. Tourangeau, Roger, Lance J. Rips, and Kenneth Rasinski. 2000. Psychology of Survey Response. Cambridge University Press. United States. Bureau of Justice Statistics. 2022. “National Crime Victimization Survey, [United States], 2021.” https://www.icpsr.umich.edu/web/NACJD/studies/38429; Inter-university Consortium for Political and Social Research [distributor]. https://doi.org/10.3886/ICPSR38429.v1. U.S. Census Bureau. 2021. “Understanding and Using the American Community Survey Public Use Microdata Sample Files What Data Users Need to Know.” U.S. Government Printing Office; https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf. U.S. Energy Information Administration. 2017. “Residential Energy Consumption Survey (RECS): Using the 2015 microdata file to compute estimates and standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2015/pdf/microdata_v3.pdf. ———. 2023a. 
“2020 Residential Energy Consumption Survey: Consumption and Expenditures Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS%20CE%20Methodology_Final.pdf. ———. 2023b. “2020 Residential Energy Consumption Survey: Household Characteristics Technical Documentation Summary.” https://www.eia.gov/consumption/residential/data/2020/pdf/2020%20RECS_Methodology%20Report.pdf. ———. 2023c. “2020 Residential Energy Consumption Survey: Using the microdata file to compute estimates and relative standard errors (RSEs).” https://www.eia.gov/consumption/residential/data/2020/pdf/microdata-guide.pdf. ———. 2023d. “Units and Calculators Explained: Degree Days.” https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php. Ushey, Kevin, and Hadley Wickham. 2023. renv: Project Environments. Valliant, Richard, and Jill A. Dever. 2018. Survey Weights: A Step-by-Step Guide to Calculation. Stata Press. Valliant, Richard, Jill A Dever, and Frauke Kreuter. 2013. Practical Tools for Designing and Weighting Survey Samples. Vol. 1. Springer. Walker, Kyle, and Matt Herman. 2024. tidycensus: Load US Census Boundary and Attribute Data as ’tidyverse’ and ’sf’-Ready Data Frames. https://walker-data.com/tidycensus/. Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org. ———. 2019. Advanced R. https://adv-r.hadley.nz/; CRC press. ———. 2023a. forcats: Tools for Working with Categorical Variables (Factors). ———. 2023b. httr2: Perform HTTP Requests and Process the Responses. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686. Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd Edition. 
https://r4ds.hadley.nz/; O’Reilly Media. Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. dplyr: A Grammar of Data Manipulation. Wickham, Hadley, and Lionel Henry. 2023. purrr: Functional Programming Tools. Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. readr: Read Rectangular Text Data. Wickham, Hadley, Evan Miller, and Danny Smith. 2023. haven: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files. Wolter, Kirk M. 2007. Introduction to Variance Estimation. Vol. 53. Springer. Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman &amp; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook. Zimmer, Stephanie, Rebecca Powell, and Isabella Velásquez. 2024. srvyrexploR: Data Supplement for Exploring Complex Survey Data Analysis in R. "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]

Term	Estimate	Std. Error	Statistic	p-value
(Intercept)	−0.31	0.27	−1.15	0.2553
	1.28	0.43	2.99	0.0047
EarlyVote2020Yes	0.53	0.35	1.53	0.1338
GenderFemale	0.96	0.26	3.73	0.0005
	0.44	0.34	1.29	0.2039
Income7$20k to < 40k	−1.06	0.49	−2.18	0.0352
Income7$40k to < 60k	−0.78	0.42	−1.86	0.0705
Income7$60k to < 80k	−1.24	0.70	−1.77	0.0842
Income7$80k to < 100k	−0.66	0.64	−1.02	0.3137
Income7$100k to < 125k	−1.02	0.54	−1.89	0.0662
Income7$125k or more	−1.25	0.44	−2.87	0.0065

Term	Estimate	Std. Error	Statistic	p-value
(Intercept)	−0.20	0.36	−0.55	0.5844
	2.32	0.67	3.45	0.0015
EarlyVote2020Yes	0.38	0.47	0.80	0.4277
GenderFemale	0.76	0.54	1.42	0.1625
EarlyVote2020Yes:GenderFemale	0.27	0.60	0.45	0.6583
	−0.81	0.78	−1.03	0.3081
Income7$20k to < 40k	−2.33	0.87	−2.68	0.0113
Income7$40k to < 60k	−1.67	0.89	−1.87	0.0700
Income7$60k to < 80k	−2.05	1.05	−1.96	0.0580
Income7$80k to < 100k	−3.42	1.12	−3.06	0.0043
Income7$100k to < 125k	−2.33	1.07	−2.17	0.0368
Income7$125k or more	−2.09	0.92	−2.28	0.0289
EarlyVote2020Yes:Income7$20k to < 40k	1.60	0.95	1.69	0.1006
EarlyVote2020Yes:Income7$40k to < 60k	0.99	1.00	0.99	0.3289
EarlyVote2020Yes:Income7$60k to < 80k	0.90	1.14	0.79	0.4373
EarlyVote2020Yes:Income7$80k to < 100k	3.22	1.16	2.78	0.0087
EarlyVote2020Yes:Income7$100k to < 125k	1.64	1.11	1.48	0.1492
EarlyVote2020Yes:Income7$125k or more	1.00	1.14	0.88	0.3867