diff --git a/01-introduction.Rmd b/01-introduction.Rmd
index 7f73429d..c522ecd0 100644
--- a/01-introduction.Rmd
+++ b/01-introduction.Rmd
@@ -38,44 +38,18 @@ In most chapters, you'll find code that you can follow. Each of these chapters s
 
 ## Datasets used in this book {#book-datasets}
 
-We work with two key datasets throughout the book: the Residential Energy Consumption Survey [RECS -- @recs-2020-tech] and the American National Election Studies [ANES -- @debell].  To ensure that all readers can follow the examples, we have provided analytic datasets available on OSF^[https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957]. 
-
-If a chapter contains data that is not part of existing packages, we have created a helper function, `read_osf()`,  for you to load it easily. We recommend saving the script below in a folder called "helper-fun" and calling the file `helper-function.R` if you would like to follow along with the prerequisites listed in the chapters that contain code. 
+We work with two key datasets throughout the book: the Residential Energy Consumption Survey [RECS -- @recs-2020-tech] and the American National Election Studies [ANES -- @debell].  To ensure that all readers can follow the examples, we have provided analytic datasets in an R package, {srvyr.data}. Install the package from GitHub using the {remotes} package.
 
 ```r
-read_osf <- function(filename){
-  #' Downloads file from OSF project
-  #' Reads in file
-  #' Deletes file from computer
-  
-  osf_dl_del_later <- !dir.exists("osf_dl")
-  
-  if (osf_dl_del_later) {
-    osf_dl_del_later <- TRUE
-    dir.create("osf_dl")
-  }
-  
-  dat_det <-
-    osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") %>%
-    osf_ls_files() %>%
-    dplyr::filter(name == filename) %>%
-    osf_download(conflicts = "overwrite", path = "osf_dl")
-  
-  out <- dat_det %>%
-    dplyr::pull(local_path) %>%
-    readr::read_rds()
-  
-  if (osf_dl_del_later) {
-    unlink("osf_dl", recursive = TRUE)
-  } else{
-    unlink(dplyr::pull(dat_det, local_path))
-  }
-  
-  return(out)
-}
+remotes::install_github("https://github.com/tidy-survey-r/srvyr.data")
 ```
 
-Here's how to use the function to read in the RECS and ANES datasets:
+To explore the provided datasets in the package, access the documentation using the `help()` command.
+
+```r
+help(package="srvyr.data")
+```
+To load the RECS and ANES datasets, start by running `library(srvyr.data)` to load the package. Then, use the `data()` command to load the datasets into the environment.
 
 ```{r}
 #| label: intro-setup
@@ -85,8 +59,7 @@ Here's how to use the function to read in the RECS and ANES datasets:
 library(tidyverse)
 library(survey)
 library(srvyr)
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 ```
 
 ```{r}
@@ -95,26 +68,26 @@ source("helper-fun/helper-function.R")
 #| warning: FALSE
 #| message: FALSE
 #| cache: TRUE
-recs_in <- read_osf("recs_2020.rds")
-anes_in <- read_osf("anes_2020.rds")
+data(recs_2020)
+data(anes_2020)
 ```
 
-RECS is a study that provides energy consumption and expenditures data in American households. The Energy Information Administration funds RECS and has been fielded 15 times between 1950 and 2020. The survey has two components - the household survey and the energy supplier survey. In 2020, the household survey was collected by web and paper questionnaires and included questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, respondent demographics, and energy assistance. The energy supplier survey consists of components relating to energy consumption and energy expenditure. Below is an overview of the `recs_in` data:
+RECS is a study that provides energy consumption and expenditures data in American households. The Energy Information Administration funds RECS, which has been fielded 15 times between 1950 and 2020. The survey has two components - the household survey and the energy supplier survey. In 2020, the household survey was collected by web and paper questionnaires and included questions about appliances, electronics, heating, air conditioning (A/C), temperatures, water heating, lighting, respondent demographics, and energy assistance. The energy supplier survey consists of components relating to energy consumption and energy expenditure. Below is an overview of the `recs_2020` data:
 
 ```{r}
 #| label: intro-recs
-recs_in %>% select(-starts_with("NWEIGHT"))
-recs_in %>% select(starts_with("NWEIGHT"))
+recs_2020 %>% select(-starts_with("NWEIGHT"))
+recs_2020 %>% select(starts_with("NWEIGHT"))
 ```
 
-From this output, we can see that there are `r nrow(recs_in) %>% formatC(big.mark = ",")` rows and `r ncol(recs_in) %>% formatC(big.mark = ",")` variables.  We can see that there are variables containing an ID (`DOEID`), geographic information (e.g., `Region`, `state_postal`, `Urbanicity`), along with information about the house, including the type of house (`HousingUnitType`) and when the house was built (`YearMade`). Additionally, there is a long list of weighting variables that we will use in the analysis (e.g., `NWEIGHT`, `NWEIGHT1`, ..., `NWEIGHT60`). We will discuss using these weighting variables in Chapter \@ref(c03-specifying-sample-designs). For a more detailed codebook, see Appendix \@ref(recs-cb).
+From this output, we can see that there are `r nrow(recs_2020) %>% formatC(big.mark = ",")` rows and `r ncol(recs_2020) %>% formatC(big.mark = ",")` variables.  We can see that there are variables containing an ID (`DOEID`), geographic information (e.g., `Region`, `state_postal`, `Urbanicity`), along with information about the house, including the type of house (`HousingUnitType`) and when the house was built (`YearMade`). Additionally, there is a long list of weighting variables that we will use in the analysis (e.g., `NWEIGHT`, `NWEIGHT1`, ..., `NWEIGHT60`). We will discuss using these weighting variables in Chapter \@ref(c03-specifying-sample-designs). For a more detailed codebook, see Appendix \@ref(recs-cb).
 
-The ANES is a series study that has collected data from election surveys since 1948. These surveys contain data on public opinion and voting behavior in U.S. presidential elections. The 2020 survey (the data we will be using) was fielded to individuals over the web, through live video interviewing, or over with computer-assisted telephone interviewing (CATI). The survey includes questions on party affiliation, voting choice, and level of trust with the government. Here is an overview of the `anes_in` data. First, we show the variables starting with "V" followed by a number; these are the original variables. Then, we show you the remaining variables that we created based on the original data:
+The ANES is a series study that has collected data from election surveys since 1948. These surveys contain data on public opinion and voting behavior in U.S. presidential elections. The 2020 survey (the data we will be using) was fielded to individuals over the web, through live video interviewing, or with computer-assisted telephone interviewing (CATI). The survey includes questions on party affiliation, voting choice, and level of trust in the government. Here is an overview of the `anes_2020` data. First, we show the variables starting with "V" followed by a number; these are the original variables. Then, we show you the remaining variables that we created based on the original data:
 
 ```{r}
 #| label: intro-anes
-anes_in %>% select(matches("^V\\d"))
-anes_in %>% select(-matches("^V\\d"))
+anes_2020 %>% select(matches("^V\\d"))
+anes_2020 %>% select(-matches("^V\\d"))
 ```
 
-From this output we can see that there are `r nrow(anes_in) %>% formatC(big.mark = ",")` rows and `r ncol(anes_in) %>% formatC(big.mark = ",")` variables.  Most of the variables start with V20, so referencing the documentation for survey will be crucial to not get lost (see Chapter \@ref(c04-understanding-survey-data-documentation)).  We have created some more descriptive variables for you to use throughout this book, such as the age (`Age`) and gender (`Gender`) of the respondent, along with variables that represent their party affiliation (`PartyID`). Additionally, we need the variables  `Weight` and `Stratum` to analyze this data accurately.  We will discuss how to use these weighting variables in Chapters \@ref(c03-specifying-sample-designs) and \@ref(c04-understanding-survey-data-documentation). For a more detailed codebook, see Appendix \@ref(anes-cb).
+From this output, we can see that there are `r nrow(anes_2020) %>% formatC(big.mark = ",")` rows and `r ncol(anes_2020) %>% formatC(big.mark = ",")` variables.  Most of the variables start with V20, so referencing the documentation for the survey will be crucial to not get lost (see Chapter \@ref(c04-understanding-survey-data-documentation)).  We have created some more descriptive variables for you to use throughout this book, such as the age (`Age`) and gender (`Gender`) of the respondent, along with variables that represent their party affiliation (`PartyID`). Additionally, we need the variables  `Weight` and `Stratum` to analyze this data accurately.  We will discuss how to use these weighting variables in Chapters \@ref(c03-specifying-sample-designs) and \@ref(c04-understanding-survey-data-documentation). For a more detailed codebook, see Appendix \@ref(anes-cb).
diff --git a/03-specifying-sample-designs.Rmd b/03-specifying-sample-designs.Rmd
index b30a9761..7a1738a3 100644
--- a/03-specifying-sample-designs.Rmd
+++ b/03-specifying-sample-designs.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey)
 library(srvyr)
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 ```
 
 To help explain the different types of sample designs, this chapter will use the `api` and `scd` data that comes in the {survey} package:
@@ -25,12 +24,13 @@ data(api)
 data(scd)
 ```
 
-Additionally, we have created multiple analytic datasets for use in this book on a directory on OSF^[https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957]. To load any data used in the book that is not included in existing packages, we have created a helper function `read_osf()`. This chapter uses data from the Residential Energy Consumption Survey (RECS) - both 2015 and 2020, so we will use the following code to load the RECS data to use later in this chapter:
+Additionally, we have provided multiple analytic datasets in the {srvyr.data} package, as described in Section \@ref(book-datasets). This chapter uses data from the Residential Energy Consumption Survey (RECS) - both 2015 and 2020, so we will use the following code to load the RECS data to use later in this chapter:
+
 ```{r}
 #| label: samp-setup-recs 
 #| eval: FALSE
-recs_2015_in <- read_osf("recs_2015.rds")
-recs_in <- read_osf("recs_2020.rds")
+data(recs_2015)
+data(recs_2020)
 ```
 :::
 
@@ -573,7 +573,7 @@ fay_des <- dat %>%
 
 #### Example {-} 
 
-The 2015 RECS [@recs-2015-micro] uses Fay's BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 - BRRWT96 with $\rho=0.5$. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGOINC is the Census region. We have already read in the RECS data and created a dataset called `recs_2015_in` above in the prerequisites.
+The 2015 RECS [@recs-2015-micro] uses Fay's BRR weights with the final weight as NWEIGHT and replicate weights as BRRWT1 - BRRWT96 with $\rho=0.5$. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We have already read in the RECS data and created a dataset called `recs_2015` above in the prerequisites.
 
 To specify this design, use the following syntax:
 
@@ -583,14 +583,14 @@ To specify this design, use the following syntax:
 #| warning: FALSE
 #| message: FALSE
 #| cache: TRUE
-recs_2015_in <- read_osf("recs_2015.rds")
+data(recs_2015)
 ```
 
 
 ```{r}
 #| label: samp-des-recs-des
 #| eval: TRUE
-recs_2015_des <- recs_2015_in %>%
+recs_2015_des <- recs_2015 %>%
   as_survey_rep(weights = NWEIGHT,
                 repweights = BRRWT1:BRRWT96,
                 type = "Fay",
@@ -649,12 +649,13 @@ jkn_des <- dat %>%
 
 #### Example {-}
 
-The 2020 RECS [@recs-2020-micro] uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 - NWEIGHT60 with a scale of $(R-1)/R=59/60$. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGOINC is the Census region. We have already read in the RECS data and created a dataset called `recs_in` above in the prerequisites.
+The 2020 RECS [@recs-2020-micro] uses jackknife weights with the final weight as NWEIGHT and replicate weights as NWEIGHT1 - NWEIGHT60 with a scale of $(R-1)/R=59/60$. On the file, DOEID is a unique identifier for each respondent, TOTALDOL is the total cost of energy, TOTSQFT_EN is the total square footage of the residence, and REGIONC is the Census region. We have already read in the RECS data and created a dataset called `recs_2020` above in the prerequisites.
 
 To specify this design, use the following syntax:
 
 ```{r}
-recs_des <- recs_in %>%
+#| label: samp-des-recs2020-des
+recs_des <- recs_2020 %>%
   as_survey_rep(
     weights = NWEIGHT,
     repweights = NWEIGHT1:NWEIGHT60,
@@ -673,7 +674,7 @@ summary(recs_des)
 #| label: samp-des-recs-des-full
 #| echo: FALSE
 # This is just for later use in book
-recs_des <- recs_in %>%
+recs_des <- recs_2020 %>%
   as_survey_rep(
     weights = NWEIGHT,
     repweights = NWEIGHT1:NWEIGHT60,
diff --git a/04-understanding-survey-data-documentation.Rmd b/04-understanding-survey-data-documentation.Rmd
index 67fc09dd..b2821541 100644
--- a/04-understanding-survey-data-documentation.Rmd
+++ b/04-understanding-survey-data-documentation.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey)
 library(srvyr)
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(censusapi)
 ```
 
@@ -23,7 +22,7 @@ We will be using data from ANES. Here is the code to read in the data.
 ```{r}
 #| label: understand-anes-c04
 #| eval: FALSE
-anes_in <- read_osf("anes_2020.rds")
+data(anes_2020)
 ```
 :::
 
@@ -250,7 +249,7 @@ The target population in 2020 is `r scales::comma(targetpop)`. This information
 
 ```{r}
 #| label: understand-read-anes
-anes_adjwgt <- anes_in %>%
+anes_adjwgt <- anes_2020 %>%
   mutate(Weight = V200010b / sum(V200010b) * targetpop) 
 ```
 
diff --git a/05-descriptive-analysis.Rmd b/05-descriptive-analysis.Rmd
index 3696a7e1..0c862ae9 100644
--- a/05-descriptive-analysis.Rmd
+++ b/05-descriptive-analysis.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey)
 library(srvyr)
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(broom)
 ```
 
@@ -40,10 +39,10 @@ We will be using data from ANES and RECS. Here is the code to create the design
 ```{r}
 #| label: desc-anes-des
 #| eval: FALSE
-anes_in <- read_osf("anes_2020.rds")
 targetpop <- 231592693
+data(anes_2020)
 
-anes_adjwgt <- anes_in %>%
+anes_adjwgt <- anes_2020 %>%
   mutate(Weight = Weight / sum(Weight) * targetpop)
 
 anes_des <- anes_adjwgt %>%
@@ -60,9 +59,9 @@ For RECS, details are included in the RECS documentation and Chapter \@ref(c03-s
 ```{r}
 #| label: desc-recs-des
 #| eval: FALSE
-recs_in <- read_osf("recs_2020.rds")
+data(recs_2020)
 
-recs_des <- recs_in %>%
+recs_des <- recs_2020 %>%
   as_survey_rep(
     weights = NWEIGHT,
     repweights = NWEIGHT1:NWEIGHT60,
@@ -978,7 +977,7 @@ It is estimated that American residential households spent an average of `r .elb
 
 Briefly, we mentioned using `filter()` to subset a survey object for analysis. This operation should be done after creating the design object. In rare circumstances, subsetting data before creating the object can lead to incorrect variability estimates. This can occur if subsetting removes an entire PSU.
 
-Suppose we wanted estimates of the average amount spent on natural gas among housing units that use natural gas using the variable `BTUNG`^[`BTUNG` is derived from the supplier side component of the survey where `BTUNG` represents the natural gas consumption in British thermal units (BTUs) in a year]. This could be obtained by first filtering records to only include records where `BTUNG > 0` and then finding the average amount of money spent.
+Suppose we wanted estimates of the average amount spent on natural gas among housing units that use natural gas using the variable `BTUNG`^[`BTUNG` is derived from the supplier side component of the survey where `BTUNG` represents the natural gas consumption in British thermal units (Btus) in a year]. This could be obtained by first filtering records to only include records where `BTUNG > 0` and then finding the average amount of money spent.
 
 ```{r}
 #| label: desc-subpop
diff --git a/06-statistical-testing.Rmd b/06-statistical-testing.Rmd
index 90e75478..2f49d5e5 100644
--- a/06-statistical-testing.Rmd
+++ b/06-statistical-testing.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey) 
 library(srvyr) 
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(broom)
 library(gt)
 ```
@@ -24,10 +23,10 @@ We will be using data from ANES and RECS. Here is the code to create the design
 ```{r}
 #| label: stattest-anes-des
 #| eval: FALSE
-anes_in <- read_osf("anes_2020.rds")
 targetpop <- 231592693
+data(anes_2020)
 
-anes_adjwgt <- anes_in %>%
+anes_adjwgt <- anes_2020 %>%
   mutate(Weight = Weight / sum(Weight) * targetpop)
 
 anes_des <- anes_adjwgt %>%
@@ -44,9 +43,9 @@ For RECS, details are included in the RECS documentation and Chapter \@ref(c03-s
 ```{r}
 #| label: stattest-recs-des
 #| eval: FALSE
-recs_in <- read_osf("recs_2020.rds")
+data(recs_2020)
 
-recs_des <- recs_in %>%
+recs_des <- recs_2020 %>%
   as_survey_rep(
     weights = NWEIGHT,
     repweights = NWEIGHT1:NWEIGHT60,
diff --git a/07-modeling.Rmd b/07-modeling.Rmd
index 04dabc1c..a1e19a18 100644
--- a/07-modeling.Rmd
+++ b/07-modeling.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey) 
 library(srvyr) 
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(broom)
 ```
 
@@ -23,10 +22,10 @@ We will be using data from ANES and RECS. Here is the code to create the design
 ```{r}
 #| label: model-anes-des
 #| eval: FALSE
-anes_in <- read_osf("anes_2020.rds") 
 targetpop <- 231592693
+data(anes_2020)
 
-anes_adjwgt <- anes_in %>%
+anes_adjwgt <- anes_2020 %>%
   mutate(Weight = Weight / sum(Weight) * targetpop)
 
 anes_des <- anes_adjwgt %>%
@@ -41,9 +40,7 @@ For RECS, details are included in the RECS documentation and Chapter \@ref(c03-s
 ```{r}
 #| label: model-recs-des
 #| eval: FALSE
-recs_in <- read_osf("recs_2020.rds")
-
-recs_des <- recs_in %>%
+recs_des <- recs_2020 %>%
   as_survey_rep(
     weights = NWEIGHT,
     repweights = NWEIGHT1:NWEIGHT60,
@@ -215,7 +212,7 @@ On RECS, we can obtain information on the square footage of homes and the electr
 #| fig.alt: Hex chart where each hexagon represents a number of housing units at a point. x-axis is 'Total square footage' ranging from 0 to 7,500 and y-axis is 'Amount spent on electricity' ranging from $0 to 8,000. The trend is relatively linear and positve. A high concentration of points have square footage between 0 and 2,500 square feet as well as between electricity expenditure between $0 and 2,000
 #| echo: FALSE
 #| warning: FALSE
-recs_in %>%
+recs_2020 %>%
   ggplot(aes(
     x = TOTSQFT_EN,
     y = DOLLAREL,
@@ -311,7 +308,7 @@ Additionally, `augment()` can be used to predict outcomes for data not used in m
 ```{r}
 #| label: model-predict-new-dat
 add_data <-
-  recs_in %>% select(DOEID,
+  recs_2020 %>% select(DOEID,
                      Region,
                      Urbanicity,
                      TOTSQFT_EN,
@@ -649,7 +646,7 @@ tidy(earlyvote_mod) %>% arrange(p.value)
 
 ```{r}
 #| label: model-ex-logistic-2
-add_vote_dat <- anes_in %>%
+add_vote_dat <- anes_2020 %>%
   select(EarlyVote2020, Age, Education, PartyID) %>%
   rbind(tibble(
     EarlyVote2020 = NA,
diff --git a/08-communicating-results.Rmd b/08-communicating-results.Rmd
index d43ec848..546cddb5 100644
--- a/08-communicating-results.Rmd
+++ b/08-communicating-results.Rmd
@@ -14,8 +14,7 @@ For this chapter, load the following packages and the helper function:
 library(tidyverse)
 library(survey) 
 library(srvyr) 
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(gt)
 library(gtsummary)
 ```
@@ -25,10 +24,10 @@ We will be using data from ANES. Here is the code to create the ANES design obje
 ```{r}
 #| label: results-anes-des
 #| eval: FALSE
-anes_in <- read_osf("anes_2020.rds") 
 targetpop <- 231592693
+data(anes_2020)
 
-anes_adjwgt <- anes_in %>%
+anes_adjwgt <- anes_2020 %>%
   mutate(Weight = Weight / sum(Weight) * targetpop)
 
 anes_des <- anes_adjwgt %>%
diff --git a/09-ncvs-vignette.Rmd b/09-ncvs-vignette.Rmd
index c1801d13..bb558a0b 100644
--- a/09-ncvs-vignette.Rmd
+++ b/09-ncvs-vignette.Rmd
@@ -13,8 +13,7 @@ For this chapter, load the following packages and the helper function:
 #| message: FALSE
 library(tidyverse)
 library(srvyr) 
-library(osfr)
-source("helper-fun/helper-function.R")
+library(srvyr.data)
 library(gt)
 ```
 
@@ -22,9 +21,9 @@ We will be using data from NCVS. Here is the code to read in the three datasets
 ```{r}
 #| label: ncvs-data
 #| cache: TRUE
-inc_in <- read_osf("ncvs_2021_incident.rds")
-hh_in <- read_osf("ncvs_2021_household.rds")
-pers_in <- read_osf("ncvs_2021_person.rds")
+data(ncvs_2021_incident)
+data(ncvs_2021_household)
+data(ncvs_2021_person)
 ```
 :::
 
@@ -119,7 +118,7 @@ We want to create four variables to indicate if an incident is a series crime.
 #| label: ncvs-vign-incfile
 #| message: false
 #| cache: TRUE
-inc_series <- inc_in %>%
+inc_series <- ncvs_2021_incident %>%
   mutate(
     series = case_when(V4017 %in% c(1, 8) ~ 1,
                        V4018 %in% c(2, 8) ~ 1,
@@ -314,12 +313,12 @@ hh_z_list <- rep(0, ncol(inc_hh_sums) - 3) %>% as.list() %>%
 pers_z_list <- rep(0, ncol(inc_pers_sums) - 4) %>% as.list() %>%
   setNames(names(inc_pers_sums)[-(1:4)])
 
-hh_vsum <- hh_in %>%
+hh_vsum <- ncvs_2021_household %>%
   full_join(inc_hh_sums, by = c("YEARQ", "IDHH")) %>%
   replace_na(hh_z_list) %>%
   mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTHHCY))
 
-pers_vsum <- pers_in %>%
+pers_vsum <- ncvs_2021_person %>%
   full_join(inc_pers_sums, by = c("YEARQ", "IDHH", "IDPER")) %>%
   replace_na(pers_z_list) %>%
   mutate(ADJINC_WT = if_else(is.na(WGTVICCY), 0, WGTVICCY / WGTPERCY))
diff --git a/90-AppendixA.Rmd b/90-AppendixA.Rmd
index 1c627c1d..9b2c74c7 100644
--- a/90-AppendixA.Rmd
+++ b/90-AppendixA.Rmd
@@ -15,7 +15,7 @@ library(janitor)
 library(kableExtra)
 library(knitr)
 
-anes_2020 <- anes_in
+data(anes_2020)
 
 attrlist <- map(anes_2020, attributes)
 
@@ -27,7 +27,6 @@ NULL_to_NA <- function(x){
   }
 }
 
-
 anes_var_info <- tibble(
   Vars=names(attrlist),
   Section=map_chr(attrlist, "Section") %>% unname(),
@@ -45,8 +44,6 @@ anes_var_info <- tibble(
   ) %>%
   ungroup()
 
-
-
 cb_count <- function(dat, var){
   t <- dat %>%
     count(.data[[var]]) %>%
diff --git a/91-AppendixB.Rmd b/91-AppendixB.Rmd
index dc22defd..35a6c2f5 100644
--- a/91-AppendixB.Rmd
+++ b/91-AppendixB.Rmd
@@ -11,7 +11,7 @@ library(janitor)
 library(kableExtra)
 library(knitr)
 
-recs <- recs_in
+recs <- recs_2020
 ```
 
 The full codebook with the original variables is available at [https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata](https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata) - "Variable and response codebook". This codebook includes the variables on the dataset included for download along with this book.
diff --git a/DataCleaningScripts/00_Run.R b/DataCleaningScripts/00_Run.R
index 5c11556d..1ec96c83 100644
--- a/DataCleaningScripts/00_Run.R
+++ b/DataCleaningScripts/00_Run.R
@@ -1,24 +1,4 @@
-rmarkdown::render(
-  input=here::here("DataCleaningScripts", "RECS_2015_DataPrep.Rmd"),
-  envir=new.env()
-)
-
-rmarkdown::render(
-  input=here::here("DataCleaningScripts", "RECS_2020_DataPrep.Rmd"),
-  envir=new.env()
-)
-
-rmarkdown::render(
-  input=here::here("DataCleaningScripts", "ANES_2020_DataPrep.Rmd"),
-  envir=new.env()
-)
-
 rmarkdown::render(
   input=here::here("DataCleaningScripts", "LAPOP_2021_DataPrep.Rmd"),
   envir=new.env()
 )
-
-rmarkdown::render(
-  input=here::here("DataCleaningScripts", "NCVS_2021_DataPrep.Rmd"),
-  envir=new.env()
-)
diff --git a/DataCleaningScripts/ANES_2020_DataPrep.Rmd b/DataCleaningScripts/ANES_2020_DataPrep.Rmd
index ed2a2471..e69de29b 100644
--- a/DataCleaningScripts/ANES_2020_DataPrep.Rmd
+++ b/DataCleaningScripts/ANES_2020_DataPrep.Rmd
@@ -1,395 +0,0 @@
----
-title: "American National Election Studies (ANES) 2020 Time Series Study Data Prep"
-output: 
-  github_document:
-    html_preview: false
----
-
-```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
-```
-
-## Data information
-
-All data and resources were downloaded from https://electionstudies.org/data-center/2020-time-series-study/ on February 28, 2022.
-
-American National Election Studies. 2021. ANES 2020 Time Series Study Full Release [dataset and documentation].  www.electionstudies.org
-```{r loadpackageh, message=FALSE}
-library(here) # easy relative paths
-```
-
-
-
-```{r loadpackages}
-library(tidyverse) # data manipulation
-library(haven) # data import
-library(tidylog) # informative logging messages
-library(osfr)
-```
-## Import data and create derived variables
-
-```{r derivedata}
-anes_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="ANES_2020", pattern="sav") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-anes_in_2020 <- read_sav(pull(anes_file_osf_det, local_path))
-
-unlink(pull(anes_file_osf_det, local_path))
-
-# weight validity for post-election survey
-anes_in_2020 %>%
-   select(V200004, V200010a, V200010b) %>%
-   group_by(V200004) %>% #type of respondent
-   summarise(
-      n=n(),
-      nvalidwt_pre=sum(!is.na(V200010a) & V200010a>0),
-      nvalidwt_post=sum(!is.na(V200010b) & V200010b>0)
-   )
-
-# Are all PSU/Stratum represented in post-weight? If so, we can drop pre-only cases later
-
-anes_in_2020 %>%
-   count(V200010d, V200010c, V200004) %>%
-   group_by(V200010d, V200010c) %>%
-   mutate(
-      Pct=n/sum(n)
-   ) %>%
-   filter(V200004==3) %>%
-   arrange(Pct)
-
-
-anes_2020 <- anes_in_2020 %>%
-  filter(V200004==3) %>%
-  select(
-    V200001,
-    V200001,
-    V200002, # MODE OF INTERVIEW: PRE-ELECTION INTERVIEW
-    V200010b, # FULL SAMPLE POST-ELECTION WEIGHT
-    V200010d, # FULL SAMPLE VARIANCE STRATUM
-    V200010c, # FULL SAMPLE VARIANCE UNIT
-    V201006, # PRE: HOW INTERESTED IN FOLLOWING CAMPAIGNS
-    V201102, # PRE: DID R VOTE FOR PRESIDENT IN 2016
-    V201101, # PRE: DID R VOTE FOR PRESIDENT IN 2016 [REVISED]
-    V201103, # PRE: RECALL OF LAST (2016) PRESIDENTIAL VOTE CHOICE)
-    V201025x, # PRE: SUMMARY: REGISTRATION AND EARLY VOTE STATUS
-    V201228,
-    V201229,
-    V201230,
-    V201231x, # PRE: SUMMARY: PARTY ID
-    V201233, # PRE: HOW OFTEN TRUST GOVERNMENT IN WASHINGTON TO DO WHAT IS RIGHT [REVISED]
-    V201237, # PRE: HOW OFTEN CAN PEOPLE BE TRUSTED
-    V201507x, # PRE: SUMMARY: RESPONDENT AGE
-    V201510, # PRE: HIGHEST LEVEL OF EDUCATION
-    V201546,
-    starts_with("V201547"),
-    V201549x, # PRE: SUMMARY: R SELF-IDENTIFIED RACE/ETHNICITY
-    V201600, # PRE: WHAT IS YOUR (R) SEX? [REVISED]
-    V201607,
-    V201610,
-    V201611,
-    V201613,
-    V201615,
-    V201616,
-    V201617x, # PRE: SUMMARY: TOTAL (FAMILY) INCOME
-    V202066, # POST: DID R VOTE IN NOVEMBER 2020 ELECTION
-    V201024,
-    V202066,
-    V202051,
-    V202109x, # PRE-POST: SUMMARY: VOTER TURNOUT IN 2020
-    V202072, # POST: DID R VOTE FOR PRESIDENT
-    V201029,
-    V202073, # POST: FOR WHOM DID R VOTE FOR PRESIDENT
-    V202110x # PRE-POST: SUMMARY: 2020 PRESIDENTIAL VOTE
-  ) %>%
-  mutate(
-    CaseID=V200001,
-    InterviewMode = fct_recode(as.character(V200002), Video = "1", Telephone = "2", Web = "3"),
-    Weight = V200010b,
-    Stratum = as.factor(V200010d),
-    VarUnit = as.factor(V200010c),
-    Age = if_else(V201507x > 0, as.numeric(V201507x), NA_real_),
-    AgeGroup = cut(Age, c(17, 29, 39, 49, 59, 69, 200),
-                   labels = c("18-29", "30-39", "40-49", "50-59", "60-69", "70 or older")
-    ),
-    Gender = factor(
-      case_when(
-        V201600 == 1 ~ "Male",
-        V201600 == 2 ~ "Female",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Male", "Female")
-    ),
-    RaceEth = factor(
-      case_when(
-        V201549x == 1 ~ "White",
-        V201549x == 2 ~ "Black",
-        V201549x == 3 ~ "Hispanic",
-        V201549x == 4 ~ "Asian, NH/PI",
-        V201549x == 5 ~ "AI/AN",
-        V201549x == 6 ~ "Other/multiple race",
-        TRUE ~ NA_character_
-      ),
-      levels = c("White", "Black", "Hispanic", "Asian, NH/PI", "AI/AN", "Other/multiple race", NA_character_)
-    ),
-    PartyID = factor(
-      case_when(
-        V201231x == 1 ~ "Strong democrat",
-        V201231x == 2 ~ "Not very strong democrat",
-        V201231x == 3 ~ "Independent-democrat",
-        V201231x == 4 ~ "Independent",
-        V201231x == 5 ~ "Independent-republican",
-        V201231x == 6 ~ "Not very strong republican",
-        V201231x == 7 ~ "Strong republican",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Strong democrat", "Not very strong democrat", "Independent-democrat", "Independent", "Independent-republican", "Not very strong republican", "Strong republican")
-    ),
-    Education = factor(
-      case_when(
-        V201510 <= 0 ~ NA_character_,
-        V201510 == 1 ~ "Less than HS",
-        V201510 == 2 ~ "High school",
-        V201510 <= 5 ~ "Post HS",
-        V201510 == 6 ~ "Bachelor's",
-        V201510 <= 8 ~ "Graduate",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Less than HS", "High school", "Post HS", "Bachelor's", "Graduate")
-    ),
-    Income = cut(V201617x, c(-5, 1:22),
-                 labels = c(
-                   "Under $9,999",
-                   "$10,000-14,999",
-                   "$15,000-19,999",
-                   "$20,000-24,999",
-                   "$25,000-29,999",
-                   "$30,000-34,999",
-                   "$35,000-39,999",
-                   "$40,000-44,999",
-                   "$45,000-49,999",
-                   "$50,000-59,999",
-                   "$60,000-64,999",
-                   "$65,000-69,999",
-                   "$70,000-74,999",
-                   "$75,000-79,999",
-                   "$80,000-89,999",
-                   "$90,000-99,999",
-                   "$100,000-109,999",
-                   "$110,000-124,999",
-                   "$125,000-149,999",
-                   "$150,000-174,999",
-                   "$175,000-249,999",
-                   "$250,000 or more"
-                 )
-    ),
-    Income7 = fct_collapse(
-      Income,
-      "Under $20k" = c("Under $9,999", "$10,000-14,999", "$15,000-19,999"),
-      "$20-40k" = c("$20,000-24,999", "$25,000-29,999", "$30,000-34,999", "$35,000-39,999"),
-      "$40-60k" = c("$40,000-44,999", "$45,000-49,999", "$50,000-59,999"),
-      "$60-80k" = c("$60,000-64,999", "$65,000-69,999", "$70,000-74,999", "$75,000-79,999"),
-      "$80-100k" = c("$80,000-89,999", "$90,000-99,999"),
-      "$100-125k" = c("$100,000-109,999", "$110,000-124,999"),
-      "$125k or more" = c("$125,000-149,999", "$150,000-174,999", "$175,000-249,999", "$250,000 or more")
-    ),
-    CampaignInterest = factor(
-      case_when(
-        V201006 == 1 ~ "Very much interested",
-        V201006 == 2 ~ "Somewhat interested",
-        V201006 == 3 ~ "Not much interested",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Very much interested", "Somewhat interested", "Not much interested")
-    ),
-    TrustGovernment = factor(
-      case_when(
-        V201233 == 1 ~ "Always",
-        V201233 == 2 ~ "Most of the time",
-        V201233 == 3 ~ "About half the time",
-        V201233 == 4 ~ "Some of the time",
-        V201233 == 5 ~ "Never",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Always", "Most of the time", "About half the time", "Some of the time", "Never")
-    ),
-    TrustPeople = factor(
-      case_when(
-        V201237 == 1 ~ "Always",
-        V201237 == 2 ~ "Most of the time",
-        V201237 == 3 ~ "About half the time",
-        V201237 == 4 ~ "Some of the time",
-        V201237 == 5 ~ "Never",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Always", "Most of the time", "About half the time", "Some of the time", "Never")
-    ),
-    VotedPres2016 = factor(
-      case_when(
-        V201101 == 1 | V201102 == 1 ~ "Yes",
-        V201101 == 2 | V201102 == 2 ~ "No",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Yes", "No")
-    ),
-    VotedPres2016_selection = factor(
-      case_when(
-        V201103 == 1 ~ "Clinton",
-        V201103 == 2 ~ "Trump",
-        V201103 == 5 ~ "Other",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Clinton", "Trump", "Other")
-    ),
-    VotedPres2020 = factor(
-      case_when(
-        V202072 == 1 ~ "Yes",
-        V202072 == 2 ~ "No",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Yes", "No")
-    ),
-    VotedPres2020_selection = factor(
-      case_when(
-        V202110x == 1 ~ "Biden",
-        V202110x == 2 ~ "Trump",
-        V202110x >= 3 & V202110x <= 5 ~ "Other",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Biden", "Trump", "Other")
-    ),
-    EarlyVote2020 = factor(
-      case_when(
-        V201025x < 0 ~ NA_character_,
-        V201025x == 4 ~ "Yes",
-        VotedPres2020 == "Yes" ~ "No",
-        TRUE ~ NA_character_), 
-      levels = c("Yes", "No")
-    )
-  )
-
-summary(anes_2020)
-```
-
-## Check derived variables for correct coding
-
-```{r checkvars}
-
-anes_2020 %>% count(InterviewMode, V200002)
-
-anes_2020 %>%
-   group_by(AgeGroup) %>%
-   summarise(
-      minAge = min(Age),
-      maxAge = max(Age),
-      minV = min(V201507x),
-      maxV = max(V201507x)
-   )
-
-anes_2020 %>% count(Gender, V201600)
-
-anes_2020 %>% count(RaceEth, V201549x)
-
-anes_2020 %>% count(PartyID, V201231x)
-
-anes_2020 %>% count(Education, V201510)
-
-anes_2020 %>%
-   count(Income, Income7, V201617x) %>%
-   print(n = 30)
-
-anes_2020 %>% count(CampaignInterest, V201006)
-
-anes_2020 %>% count(TrustGovernment, V201233)
-
-anes_2020 %>% count(TrustPeople, V201237)
-
-anes_2020 %>% count(VotedPres2016, V201101, V201102)
-
-anes_2020 %>% count(VotedPres2016_selection, V201103)
-
-anes_2020 %>% count(VotedPres2020, V202072)
-
-anes_2020 %>% count(VotedPres2020_selection, V202110x)
-
-anes_2020 %>% count(EarlyVote2020, V201025x, VotedPres2020)
-
-anes_2020 %>%
-   summarise(WtSum = sum(Weight, na.rm = TRUE)) %>%
-   pull(WtSum)
-```
-
-
-## Label and order data
-
-```{r}
-#| label: label-ord
-
-cb_in <- readxl::read_xlsx(here::here("DataCleaningScripts", "ANES Codebook Metadata.xlsx"))
-
-cb_ord <- cb_in %>%
-  mutate(
-    Type=1,
-    SectNum=case_match(
-      Section,
-      "ADMIN"~1,
-      "WEIGHTS"~2,
-      "PRE-ELECTION SURVEY QUESTIONNAIRE"~3,
-      "POST-ELECTION SURVEY QUESTIONNAIRE"~4
-    )) %>%
-  arrange(SectNum, Variable) %>%
-  mutate(
-    Order=row_number()
-  )
-
-
-
-
-cb_slim <- cb_ord %>%
-  select(Variable=BookDerived, `Description and Labels`, Question, Section, SectNum, Order) %>%
-  filter(!is.na(Variable)) %>%
-  separate_longer_delim(Variable, delim="; ") %>%
-  add_case(Variable="VotedPres2016", `Description and Labels`="PRE: Did R vote for President in 2016", Question="Derived from V201102, V201101", Section="PRE-ELECTION SURVEY QUESTIONNAIRE", SectNum=3, Order=11) %>%
-  add_case(Variable="EarlyVote2020", `Description and Labels`="PRE-POST: Voted early for president", Question="Derived from V201025x, VotedPres2020", Section="POST-ELECTION SURVEY QUESTIONNAIRE", SectNum=4, Order=44) %>%
-  mutate(Type=2) %>%
-  bind_rows(select(cb_ord, -BookDerived)) %>%
-  arrange(SectNum, Order, Type)
-
-names(anes_2020)[!(names(anes_2020) %in% pull(cb_slim, Variable))]
-
-cb_vars <- cb_slim %>%
-  filter(Variable %in% names(anes_2020))
-
-
-anes_ord <- anes_2020 %>%
-  select(all_of(pull(cb_vars, Variable)))
-
-options("tidylog.display" = list())  
-
-for (var in pull(cb_vars, Variable)) {
-  vi <- cb_vars %>% filter(Variable==var)
-  attr(anes_ord[[deparse(as.name(var))]], "format.spss") <- NULL
-  attr(anes_ord[[deparse(as.name(var))]], "display_width") <- NULL
-  attr(anes_ord[[deparse(as.name(var))]], "label") <- pull(vi, `Description and Labels`)
-  attr(anes_ord[[deparse(as.name(var))]], "Section") <- pull(vi, Section) %>% as.character()
-  if (!is.na(pull(vi, Question))) attr(anes_ord[[deparse(as.name(var))]], "Question") <- pull(vi, Question)
-}
-
-options("tidylog.display" = NULL) 
-
-```
-
-
-
-## Save data
-
-```{r savedat}
-summary(anes_ord)
-
-anes_der_tmp_loc <- here("osf_dl", "anes_2020.rds")
-write_rds(anes_ord, anes_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=anes_der_tmp_loc, conflicts="overwrite")
-unlink(anes_der_tmp_loc)
-
-```
diff --git a/DataCleaningScripts/ANES_2020_DataPrep.md b/DataCleaningScripts/ANES_2020_DataPrep.md
index d673ba6d..e69de29b 100644
--- a/DataCleaningScripts/ANES_2020_DataPrep.md
+++ b/DataCleaningScripts/ANES_2020_DataPrep.md
@@ -1,881 +0,0 @@
-American National Election Studies (ANES) 2020 Time Series Study Data
-Prep
-================
-
-## Data information
-
-All data and resources were downloaded from
-<https://electionstudies.org/data-center/2020-time-series-study/> on
-February 28, 2022.
-
-American National Election Studies. 2021. ANES 2020 Time Series Study
-Full Release \[dataset and documentation\]. www.electionstudies.org
-
-``` r
-library(here) # easy relative paths
-```
-
-``` r
-library(tidyverse) # data manipulation
-library(haven) # data import
-library(tidylog) # informative logging messages
-```
-
-    ## 
-    ## Attaching package: 'tidylog'
-
-    ## The following objects are masked from 'package:srvyr':
-    ## 
-    ##     anti_join, drop_na, filter, filter_all, filter_at, filter_if, group_by, group_by_all, group_by_at, group_by_if, mutate, mutate_all, mutate_at, mutate_if, rename,
-    ##     rename_all, rename_at, rename_if, rename_with, select, select_all, select_at, select_if, semi_join, summarise, summarise_all, summarise_at, summarise_if, summarize,
-    ##     summarize_all, summarize_at, summarize_if, transmute, ungroup
-
-    ## The following objects are masked from 'package:dplyr':
-    ## 
-    ##     add_count, add_tally, anti_join, count, distinct, distinct_all, distinct_at, distinct_if, filter, filter_all, filter_at, filter_if, full_join, group_by, group_by_all,
-    ##     group_by_at, group_by_if, inner_join, left_join, mutate, mutate_all, mutate_at, mutate_if, relocate, rename, rename_all, rename_at, rename_if, rename_with, right_join,
-    ##     sample_frac, sample_n, select, select_all, select_at, select_if, semi_join, slice, slice_head, slice_max, slice_min, slice_sample, slice_tail, summarise, summarise_all,
-    ##     summarise_at, summarise_if, summarize, summarize_all, summarize_at, summarize_if, tally, top_frac, top_n, transmute, transmute_all, transmute_at, transmute_if, ungroup
-
-    ## The following objects are masked from 'package:tidyr':
-    ## 
-    ##     drop_na, fill, gather, pivot_longer, pivot_wider, replace_na, spread, uncount
-
-    ## The following object is masked from 'package:stats':
-    ## 
-    ##     filter
-
-``` r
-library(osfr)
-```
-
-## Import data and create derived variables
-
-``` r
-anes_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="ANES_2020", pattern="sav") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-anes_in_2020 <- read_sav(pull(anes_file_osf_det, local_path))
-
-unlink(pull(anes_file_osf_det, local_path))
-
-# weight validity for post-election survey
-anes_in_2020 %>%
-   select(V200004, V200010a, V200010b) %>%
-   group_by(V200004) %>% #type of respondent
-   summarise(
-      n=n(),
-      nvalidwt_pre=sum(!is.na(V200010a) & V200010a>0),
-      nvalidwt_post=sum(!is.na(V200010b) & V200010b>0)
-   )
-```
-
-    ## select: dropped 1,768 variables (version, V200001, V160001_orig, V200002, V200003, …)
-
-    ## group_by: one grouping variable (V200004)
-
-    ## summarise: now 2 rows and 4 columns, ungrouped
-
-    ## # A tibble: 2 × 4
-    ##   V200004                                                     n nvalidwt_pre nvalidwt_post
-    ##   <dbl+lbl>                                               <int>        <int>         <int>
-    ## 1 1 [1. pre-election interview (only) complete]             827          827             0
-    ## 2 3 [3. pre and post-election interviews (both) complete]  7453         7453          7453
-
-``` r
-# Are all PSU/Stratum represented in post-weight? If so, we can drop pre-only cases later
-
-anes_in_2020 %>%
-   count(V200010d, V200010c, V200004) %>%
-   group_by(V200010d, V200010c) %>%
-   mutate(
-      Pct=n/sum(n)
-   ) %>%
-   filter(V200004==3) %>%
-   arrange(Pct)
-```
-
-    ## count: now 202 rows and 4 columns, ungrouped
-
-    ## group_by: 2 grouping variables (V200010d, V200010c)
-
-    ## mutate (grouped): new variable 'Pct' (double) with 168 unique values and 0% NA
-
-    ## filter (grouped): removed 101 rows (50%), 101 rows remaining
-
-    ## # A tibble: 101 × 5
-    ## # Groups:   V200010d, V200010c [101]
-    ##    V200010d V200010c V200004                                                     n   Pct
-    ##       <dbl>    <dbl> <dbl+lbl>                                               <int> <dbl>
-    ##  1       32        1 3 [3. pre and post-election interviews (both) complete]    63 0.797
-    ##  2       33        2 3 [3. pre and post-election interviews (both) complete]    67 0.798
-    ##  3       45        1 3 [3. pre and post-election interviews (both) complete]    60 0.8  
-    ##  4        8        1 3 [3. pre and post-election interviews (both) complete]    72 0.828
-    ##  5       38        2 3 [3. pre and post-election interviews (both) complete]    71 0.835
-    ##  6       49        2 3 [3. pre and post-election interviews (both) complete]    67 0.838
-    ##  7       36        1 3 [3. pre and post-election interviews (both) complete]    68 0.85 
-    ##  8       47        2 3 [3. pre and post-election interviews (both) complete]    69 0.852
-    ##  9       50        2 3 [3. pre and post-election interviews (both) complete]    69 0.852
-    ## 10        4        1 3 [3. pre and post-election interviews (both) complete]    76 0.854
-    ## # ℹ 91 more rows
-
-``` r
-anes_2020 <- anes_in_2020 %>%
-  filter(V200004==3) %>%
-  select(
-    V200001,
-    V200001,
-    V200002, # MODE OF INTERVIEW: PRE-ELECTION INTERVIEW
-    V200010b, # FULL SAMPLE POST-ELECTION WEIGHT
-    V200010d, # FULL SAMPLE VARIANCE STRATUM
-    V200010c, # FULL SAMPLE VARIANCE UNIT
-    V201006, # PRE: HOW INTERESTED IN FOLLOWING CAMPAIGNS
-    V201102, # PRE: DID R VOTE FOR PRESIDENT IN 2016
-    V201101, # PRE: DID R VOTE FOR PRESIDENT IN 2016 [REVISED]
-    V201103, # PRE: RECALL OF LAST (2016) PRESIDENTIAL VOTE CHOICE
-    V201025x, # PRE: SUMMARY: REGISTRATION AND EARLY VOTE STATUS
-    V201228,
-    V201229,
-    V201230,
-    V201231x, # PRE: SUMMARY: PARTY ID
-    V201233, # PRE: HOW OFTEN TRUST GOVERNMENT IN WASHINGTON TO DO WHAT IS RIGHT [REVISED]
-    V201237, # PRE: HOW OFTEN CAN PEOPLE BE TRUSTED
-    V201507x, # PRE: SUMMARY: RESPONDENT AGE
-    V201510, # PRE: HIGHEST LEVEL OF EDUCATION
-    V201546,
-    starts_with("V201547"),
-    V201549x, # PRE: SUMMARY: R SELF-IDENTIFIED RACE/ETHNICITY
-    V201600, # PRE: WHAT IS YOUR (R) SEX? [REVISED]
-    V201607,
-    V201610,
-    V201611,
-    V201613,
-    V201615,
-    V201616,
-    V201617x, # PRE: SUMMARY: TOTAL (FAMILY) INCOME
-    V202066, # POST: DID R VOTE IN NOVEMBER 2020 ELECTION
-    V201024,
-    V202066,
-    V202051,
-    V202109x, # PRE-POST: SUMMARY: VOTER TURNOUT IN 2020
-    V202072, # POST: DID R VOTE FOR PRESIDENT
-    V201029,
-    V202073, # POST: FOR WHOM DID R VOTE FOR PRESIDENT
-    V202110x # PRE-POST: SUMMARY: 2020 PRESIDENTIAL VOTE
-  ) %>%
-  mutate(
-    CaseID=V200001,
-    InterviewMode = fct_recode(as.character(V200002), Video = "1", Telephone = "2", Web = "3"),
-    Weight = V200010b,
-    Stratum = as.factor(V200010d),
-    VarUnit = as.factor(V200010c),
-    Age = if_else(V201507x > 0, as.numeric(V201507x), NA_real_),
-    AgeGroup = cut(Age, c(17, 29, 39, 49, 59, 69, 200),
-                   labels = c("18-29", "30-39", "40-49", "50-59", "60-69", "70 or older")
-    ),
-    Gender = factor(
-      case_when(
-        V201600 == 1 ~ "Male",
-        V201600 == 2 ~ "Female",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Male", "Female")
-    ),
-    RaceEth = factor(
-      case_when(
-        V201549x == 1 ~ "White",
-        V201549x == 2 ~ "Black",
-        V201549x == 3 ~ "Hispanic",
-        V201549x == 4 ~ "Asian, NH/PI",
-        V201549x == 5 ~ "AI/AN",
-        V201549x == 6 ~ "Other/multiple race",
-        TRUE ~ NA_character_
-      ),
-      levels = c("White", "Black", "Hispanic", "Asian, NH/PI", "AI/AN", "Other/multiple race", NA_character_)
-    ),
-    PartyID = factor(
-      case_when(
-        V201231x == 1 ~ "Strong democrat",
-        V201231x == 2 ~ "Not very strong democrat",
-        V201231x == 3 ~ "Independent-democrat",
-        V201231x == 4 ~ "Independent",
-        V201231x == 5 ~ "Independent-republican",
-        V201231x == 6 ~ "Not very strong republican",
-        V201231x == 7 ~ "Strong republican",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Strong democrat", "Not very strong democrat", "Independent-democrat", "Independent", "Independent-republican", "Not very strong republican", "Strong republican")
-    ),
-    Education = factor(
-      case_when(
-        V201510 <= 0 ~ NA_character_,
-        V201510 == 1 ~ "Less than HS",
-        V201510 == 2 ~ "High school",
-        V201510 <= 5 ~ "Post HS",
-        V201510 == 6 ~ "Bachelor's",
-        V201510 <= 8 ~ "Graduate",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Less than HS", "High school", "Post HS", "Bachelor's", "Graduate")
-    ),
-    Income = cut(V201617x, c(-5, 1:22),
-                 labels = c(
-                   "Under $9,999",
-                   "$10,000-14,999",
-                   "$15,000-19,999",
-                   "$20,000-24,999",
-                   "$25,000-29,999",
-                   "$30,000-34,999",
-                   "$35,000-39,999",
-                   "$40,000-44,999",
-                   "$45,000-49,999",
-                   "$50,000-59,999",
-                   "$60,000-64,999",
-                   "$65,000-69,999",
-                   "$70,000-74,999",
-                   "$75,000-79,999",
-                   "$80,000-89,999",
-                   "$90,000-99,999",
-                   "$100,000-109,999",
-                   "$110,000-124,999",
-                   "$125,000-149,999",
-                   "$150,000-174,999",
-                   "$175,000-249,999",
-                   "$250,000 or more"
-                 )
-    ),
-    Income7 = fct_collapse(
-      Income,
-      "Under $20k" = c("Under $9,999", "$10,000-14,999", "$15,000-19,999"),
-      "$20-40k" = c("$20,000-24,999", "$25,000-29,999", "$30,000-34,999", "$35,000-39,999"),
-      "$40-60k" = c("$40,000-44,999", "$45,000-49,999", "$50,000-59,999"),
-      "$60-80k" = c("$60,000-64,999", "$65,000-69,999", "$70,000-74,999", "$75,000-79,999"),
-      "$80-100k" = c("$80,000-89,999", "$90,000-99,999"),
-      "$100-125k" = c("$100,000-109,999", "$110,000-124,999"),
-      "$125k or more" = c("$125,000-149,999", "$150,000-174,999", "$175,000-249,999", "$250,000 or more")
-    ),
-    CampaignInterest = factor(
-      case_when(
-        V201006 == 1 ~ "Very much interested",
-        V201006 == 2 ~ "Somewhat interested",
-        V201006 == 3 ~ "Not much interested",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Very much interested", "Somewhat interested", "Not much interested")
-    ),
-    TrustGovernment = factor(
-      case_when(
-        V201233 == 1 ~ "Always",
-        V201233 == 2 ~ "Most of the time",
-        V201233 == 3 ~ "About half the time",
-        V201233 == 4 ~ "Some of the time",
-        V201233 == 5 ~ "Never",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Always", "Most of the time", "About half the time", "Some of the time", "Never")
-    ),
-    TrustPeople = factor(
-      case_when(
-        V201237 == 1 ~ "Always",
-        V201237 == 2 ~ "Most of the time",
-        V201237 == 3 ~ "About half the time",
-        V201237 == 4 ~ "Some of the time",
-        V201237 == 5 ~ "Never",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Always", "Most of the time", "About half the time", "Some of the time", "Never")
-    ),
-    VotedPres2016 = factor(
-      case_when(
-        V201101 == 1 | V201102 == 1 ~ "Yes",
-        V201101 == 2 | V201102 == 2 ~ "No",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Yes", "No")
-    ),
-    VotedPres2016_selection = factor(
-      case_when(
-        V201103 == 1 ~ "Clinton",
-        V201103 == 2 ~ "Trump",
-        V201103 == 5 ~ "Other",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Clinton", "Trump", "Other")
-    ),
-    VotedPres2020 = factor(
-      case_when(
-        V202072 == 1 ~ "Yes",
-        V202072 == 2 ~ "No",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Yes", "No")
-    ),
-    VotedPres2020_selection = factor(
-      case_when(
-        V202110x == 1 ~ "Biden",
-        V202110x == 2 ~ "Trump",
-        V202110x >= 3 & V202110x <= 5 ~ "Other",
-        TRUE ~ NA_character_
-      ),
-      levels = c("Biden", "Trump", "Other")
-    ),
-    EarlyVote2020 = factor(
-      case_when(
-        V201025x < 0 ~ NA_character_,
-        V201025x == 4 ~ "Yes",
-        VotedPres2020 == "Yes" ~ "No",
-        TRUE ~ NA_character_), 
-      levels = c("Yes", "No")
-    )
-  )
-```
-
-    ## filter: removed 827 rows (10%), 7,453 rows remaining
-
-    ## select: dropped 1,729 variables (version, V160001_orig, V200003, V200004, V200005, …)
-
-    ## mutate: new variable 'CaseID' (double) with 7,453 unique values and 0% NA
-
-    ##         new variable 'InterviewMode' (factor) with 3 unique values and 0% NA
-
-    ##         new variable 'Weight' (double) with 7,195 unique values and 0% NA
-
-    ##         new variable 'Stratum' (factor) with 50 unique values and 0% NA
-
-    ##         new variable 'VarUnit' (factor) with 3 unique values and 0% NA
-
-    ##         new variable 'Age' (double) with 64 unique values and 4% NA
-
-    ##         new variable 'AgeGroup' (factor) with 7 unique values and 4% NA
-
-    ##         new variable 'Gender' (factor) with 3 unique values and 1% NA
-
-    ##         new variable 'RaceEth' (factor) with 7 unique values and 1% NA
-
-    ##         new variable 'PartyID' (factor) with 8 unique values and <1% NA
-
-    ##         new variable 'Education' (factor) with 6 unique values and 2% NA
-
-    ##         new variable 'Income' (factor) with 23 unique values and 7% NA
-
-    ##         new variable 'Income7' (factor) with 8 unique values and 7% NA
-
-    ##         new variable 'CampaignInterest' (factor) with 4 unique values and <1% NA
-
-    ##         new variable 'TrustGovernment' (factor) with 6 unique values and <1% NA
-
-    ##         new variable 'TrustPeople' (factor) with 6 unique values and <1% NA
-
-    ##         new variable 'VotedPres2016' (factor) with 3 unique values and <1% NA
-
-    ##         new variable 'VotedPres2016_selection' (factor) with 4 unique values and 23% NA
-
-    ##         new variable 'VotedPres2020' (factor) with 3 unique values and 19% NA
-
-    ##         new variable 'VotedPres2020_selection' (factor) with 4 unique values and 16% NA
-
-    ##         new variable 'EarlyVote2020' (factor) with 3 unique values and 15% NA
-
-``` r
-summary(anes_2020)
-```
-
-    ##     V200001          V200002         V200010b           V200010d        V200010c        V201006          V201102           V201101            V201103          V201025x         V201228     
-    ##  Min.   :200015   Min.   :1.000   Min.   :0.008262   Min.   : 1.00   Min.   :1.000   Min.   :-9.000   Min.   :-9.0000   Min.   :-9.00000   Min.   :-9.000   Min.   :-4.000   Min.   :-9.00  
-    ##  1st Qu.:225427   1st Qu.:3.000   1st Qu.:0.386263   1st Qu.:12.00   1st Qu.:1.000   1st Qu.: 1.000   1st Qu.:-1.0000   1st Qu.:-1.00000   1st Qu.: 1.000   1st Qu.: 3.000   1st Qu.: 1.00  
-    ##  Median :335416   Median :3.000   Median :0.686301   Median :24.00   Median :2.000   Median : 1.000   Median : 1.0000   Median :-1.00000   Median : 1.000   Median : 3.000   Median : 2.00  
-    ##  Mean   :336416   Mean   :2.911   Mean   :1.000000   Mean   :24.63   Mean   :1.507   Mean   : 1.596   Mean   : 0.1048   Mean   : 0.08493   Mean   : 1.042   Mean   : 2.919   Mean   : 1.99  
-    ##  3rd Qu.:427865   3rd Qu.:3.000   3rd Qu.:1.211032   3rd Qu.:37.00   3rd Qu.:2.000   3rd Qu.: 2.000   3rd Qu.: 1.0000   3rd Qu.: 1.00000   3rd Qu.: 2.000   3rd Qu.: 3.000   3rd Qu.: 3.00  
-    ##  Max.   :535469   Max.   :3.000   Max.   :6.650665   Max.   :50.00   Max.   :3.000   Max.   : 3.000   Max.   : 2.0000   Max.   : 2.00000   Max.   : 5.000   Max.   : 4.000   Max.   : 5.00  
-    ##                                                                                                                                                                                             
-    ##     V201229           V201230            V201231x         V201233          V201237         V201507x        V201510          V201546          V201547a     V201547b     V201547c     V201547d 
-    ##  Min.   :-9.0000   Min.   :-9.00000   Min.   :-9.000   Min.   :-9.000   Min.   :-9.00   Min.   :-9.00   Min.   :-9.000   Min.   :-9.000   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3  
-    ##  1st Qu.:-1.0000   1st Qu.:-1.00000   1st Qu.: 2.000   1st Qu.: 3.000   1st Qu.: 2.00   1st Qu.:35.00   1st Qu.: 3.000   1st Qu.: 2.000   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3  
-    ##  Median : 1.0000   Median :-1.00000   Median : 4.000   Median : 4.000   Median : 3.00   Median :51.00   Median : 5.000   Median : 2.000   Median :-3   Median :-3   Median :-3   Median :-3  
-    ##  Mean   : 0.5154   Mean   : 0.01302   Mean   : 3.834   Mean   : 3.429   Mean   : 2.78   Mean   :49.43   Mean   : 5.621   Mean   : 1.841   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3  
-    ##  3rd Qu.: 1.0000   3rd Qu.: 1.00000   3rd Qu.: 6.000   3rd Qu.: 4.000   3rd Qu.: 3.00   3rd Qu.:66.00   3rd Qu.: 6.000   3rd Qu.: 2.000   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3  
-    ##  Max.   : 2.0000   Max.   : 3.00000   Max.   : 7.000   Max.   : 5.000   Max.   : 5.00   Max.   :80.00   Max.   :95.000   Max.   : 2.000   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3  
-    ##                                                                                                                                                                                              
-    ##     V201547e     V201547z     V201549x         V201600          V201607      V201610      V201611      V201613      V201615      V201616      V201617x        V202066          V201024       
-    ##  Min.   :-3   Min.   :-3   Min.   :-9.000   Min.   :-9.000   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-9.00   Min.   :-9.000   Min.   :-9.0000  
-    ##  1st Qu.:-3   1st Qu.:-3   1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.: 4.00   1st Qu.: 4.000   1st Qu.:-1.0000  
-    ##  Median :-3   Median :-3   Median : 1.000   Median : 2.000   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median :11.00   Median : 4.000   Median :-1.0000  
-    ##  Mean   :-3   Mean   :-3   Mean   : 1.499   Mean   : 1.472   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :10.36   Mean   : 3.402   Mean   :-0.8595  
-    ##  3rd Qu.:-3   3rd Qu.:-3   3rd Qu.: 2.000   3rd Qu.: 2.000   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:17.00   3rd Qu.: 4.000   3rd Qu.:-1.0000  
-    ##  Max.   :-3   Max.   :-3   Max.   : 6.000   Max.   : 2.000   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :22.00   Max.   : 4.000   Max.   : 4.0000  
-    ##                                                                                                                                                                                              
-    ##     V202051           V202109x          V202072           V201029           V202073           V202110x           CaseID         InterviewMode      Weight            Stratum     VarUnit 
-    ##  Min.   :-9.0000   Min.   :-2.0000   Min.   :-9.0000   Min.   :-9.0000   Min.   :-9.0000   Min.   :-9.0000   Min.   :200015   Video    : 274   Min.   :0.008262   12     : 179   1:3689  
-    ##  1st Qu.:-1.0000   1st Qu.: 1.0000   1st Qu.: 1.0000   1st Qu.:-1.0000   1st Qu.: 1.0000   1st Qu.: 1.0000   1st Qu.:225427   Telephone: 115   1st Qu.:0.386263   6      : 172   2:3750  
-    ##  Median :-1.0000   Median : 1.0000   Median : 1.0000   Median :-1.0000   Median : 1.0000   Median : 1.0000   Median :335416   Web      :7064   Median :0.686301   27     : 172   3:  14  
-    ##  Mean   :-0.7259   Mean   : 0.8578   Mean   : 0.6234   Mean   :-0.8967   Mean   : 0.9415   Mean   : 0.9902   Mean   :336416                    Mean   :1.000000   21     : 170           
-    ##  3rd Qu.:-1.0000   3rd Qu.: 1.0000   3rd Qu.: 1.0000   3rd Qu.:-1.0000   3rd Qu.: 2.0000   3rd Qu.: 2.0000   3rd Qu.:427865                    3rd Qu.:1.211032   25     : 169           
-    ##  Max.   : 3.0000   Max.   : 1.0000   Max.   : 2.0000   Max.   :12.0000   Max.   :12.0000   Max.   : 5.0000   Max.   :535469                    Max.   :6.650665   1      : 167           
-    ##                                                                                                                                                                   (Other):6424           
-    ##       Age               AgeGroup       Gender                    RaceEth                         PartyID            Education                 Income              Income7    
-    ##  Min.   :18.00   18-29      : 871   Male  :3375   White              :5420   Strong democrat         :1796   Less than HS: 312   Under $9,999    : 647   $125k or more:1468  
-    ##  1st Qu.:37.00   30-39      :1241   Female:4027   Black              : 650   Strong republican       :1545   High school :1160   $50,000-59,999  : 485   Under $20k   :1076  
-    ##  Median :53.00   40-49      :1081   NA's  :  51   Hispanic           : 662   Independent-democrat    : 881   Post HS     :2514   $100,000-109,999: 451   $20-40k      :1051  
-    ##  Mean   :51.83   50-59      :1200                 Asian, NH/PI       : 248   Independent             : 876   Bachelor's  :1877   $250,000 or more: 405   $40-60k      : 984  
-    ##  3rd Qu.:66.00   60-69      :1436                 AI/AN              : 155   Not very strong democrat: 790   Graduate    :1474   $80,000-89,999  : 383   $60-80k      : 920  
-    ##  Max.   :80.00   70 or older:1330                 Other/multiple race: 237   (Other)                 :1540   NA's        : 116   (Other)         :4565   (Other)      :1437  
-    ##  NA's   :294     NA's       : 294                 NA's               :  81   NA's                    :  25                       NA's            : 517   NA's         : 517  
-    ##              CampaignInterest            TrustGovernment              TrustPeople   VotedPres2016 VotedPres2016_selection VotedPres2020 VotedPres2020_selection EarlyVote2020
-    ##  Very much interested:3940    Always             :  80   Always             :  48   Yes :5810     Clinton:2911            Yes :5952     Biden:3509              Yes : 371    
-    ##  Somewhat interested :2569    Most of the time   :1016   Most of the time   :3511   No  :1622     Trump  :2466            No  :  77     Trump:2567              No  :5949    
-    ##  Not much interested : 943    About half the time:2313   About half the time:2020   NA's:  21     Other  : 390            NA's:1424     Other: 158              NA's:1133    
-    ##  NA's                :   1    Some of the time   :3313   Some of the time   :1597                 NA's   :1686                          NA's :1219                           
-    ##                               Never              : 702   Never              : 264                                                                                            
-    ##                               NA's               :  29   NA's               :  13                                                                                            
-    ## 
-
-## Check derived variables for correct coding
-
-``` r
-anes_2020 %>% count(InterviewMode, V200002)
-```
-
-    ## count: now 3 rows and 3 columns, ungrouped
-
-    ## # A tibble: 3 × 3
-    ##   InterviewMode V200002              n
-    ##   <fct>         <dbl+lbl>        <int>
-    ## 1 Video         1 [1. Video]       274
-    ## 2 Telephone     2 [2. Telephone]   115
-    ## 3 Web           3 [3. Web]        7064
-
-``` r
-anes_2020 %>%
-   group_by(AgeGroup) %>%
-   summarise(
-      minAge = min(Age),
-      maxAge = max(Age),
-      minV = min(V201507x),
-      maxV = max(V201507x)
-   )
-```
-
-    ## group_by: one grouping variable (AgeGroup)
-
-    ## summarise: now 7 rows and 5 columns, ungrouped
-
-    ## # A tibble: 7 × 5
-    ##   AgeGroup    minAge maxAge minV             maxV                    
-    ##   <fct>        <dbl>  <dbl> <dbl+lbl>        <dbl+lbl>               
-    ## 1 18-29           18     29 18               29                      
-    ## 2 30-39           30     39 30               39                      
-    ## 3 40-49           40     49 40               49                      
-    ## 4 50-59           50     59 50               59                      
-    ## 5 60-69           60     69 60               69                      
-    ## 6 70 or older     70     80 70               80 [80. Age 80 or older]
-    ## 7 <NA>            NA     NA -9 [-9. Refused] -9 [-9. Refused]
-
-``` r
-anes_2020 %>% count(Gender, V201600)
-```
-
-    ## count: now 3 rows and 3 columns, ungrouped
-
-    ## # A tibble: 3 × 3
-    ##   Gender V201600              n
-    ##   <fct>  <dbl+lbl>        <int>
-    ## 1 Male    1 [1. Male]      3375
-    ## 2 Female  2 [2. Female]    4027
-    ## 3 <NA>   -9 [-9. Refused]    51
-
-``` r
-anes_2020 %>% count(RaceEth, V201549x)
-```
-
-    ## count: now 8 rows and 3 columns, ungrouped
-
-    ## # A tibble: 8 × 3
-    ##   RaceEth             V201549x                                                                        n
-    ##   <fct>               <dbl+lbl>                                                                   <int>
-    ## 1 White                1 [1. White, non-Hispanic]                                                  5420
-    ## 2 Black                2 [2. Black, non-Hispanic]                                                   650
-    ## 3 Hispanic             3 [3. Hispanic]                                                              662
-    ## 4 Asian, NH/PI         4 [4. Asian or Native Hawaiian/other Pacific Islander, non-Hispanic alone]   248
-    ## 5 AI/AN                5 [5. Native American/Alaska Native or other race, non-Hispanic alone]       155
-    ## 6 Other/multiple race  6 [6. Multiple races, non-Hispanic]                                          237
-    ## 7 <NA>                -9 [-9. Refused]                                                               75
-    ## 8 <NA>                -8 [-8. Don't know]                                                             6
-
-``` r
-anes_2020 %>% count(PartyID, V201231x)
-```
-
-    ## count: now 9 rows and 3 columns, ungrouped
-
-    ## # A tibble: 9 × 3
-    ##   PartyID                    V201231x                               n
-    ##   <fct>                      <dbl+lbl>                          <int>
-    ## 1 Strong democrat             1 [1. Strong Democrat]             1796
-    ## 2 Not very strong democrat    2 [2. Not very strong Democrat]     790
-    ## 3 Independent-democrat        3 [3. Independent-Democrat]         881
-    ## 4 Independent                 4 [4. Independent]                  876
-    ## 5 Independent-republican      5 [5. Independent-Republican]       782
-    ## 6 Not very strong republican  6 [6. Not very strong Republican]   758
-    ## 7 Strong republican           7 [7. Strong Republican]           1545
-    ## 8 <NA>                       -9 [-9. Refused]                      23
-    ## 9 <NA>                       -8 [-8. Don't know]                    2
-
-``` r
-anes_2020 %>% count(Education, V201510)
-```
-
-    ## count: now 11 rows and 3 columns, ungrouped
-
-    ## # A tibble: 11 × 3
-    ##    Education    V201510                                                                                             n
-    ##    <fct>        <dbl+lbl>                                                                                       <int>
-    ##  1 Less than HS  1 [1. Less than high school credential]                                                          312
-    ##  2 High school   2 [2.  High school graduate - High school diploma or equivalent (e.g. GED)]                     1160
-    ##  3 Post HS       3 [3. Some college but no degree]                                                               1519
-    ##  4 Post HS       4 [4. Associate degree in college - occupational/vocational]                                     550
-    ##  5 Post HS       5 [5. Associate degree in college - academic]                                                    445
-    ##  6 Bachelor's    6 [6. Bachelor's degree (e.g. BA, AB, BS)]                                                      1877
-    ##  7 Graduate      7 [7. Master's degree (e.g. MA, MS, MEng, MEd, MSW, MBA)]                                       1092
-    ##  8 Graduate      8 [8. Professional school degree (e.g. MD, DDS, DVM, LLB, JD)/Doctoral degree (e.g. PHD, EDD)]   382
-    ##  9 <NA>         -9 [-9. Refused]                                                                                   25
-    ## 10 <NA>         -8 [-8. Don't know]                                                                                 1
-    ## 11 <NA>         95 [95. Other {SPECIFY}]                                                                           90
-
-``` r
-anes_2020 %>%
-   count(Income, Income7, V201617x) %>%
-   print(n = 30)
-```
-
-    ## count: now 24 rows and 4 columns, ungrouped
-
-    ## # A tibble: 24 × 4
-    ##    Income           Income7       V201617x                                                n
-    ##    <fct>            <fct>         <dbl+lbl>                                           <int>
-    ##  1 Under $9,999     Under $20k     1 [1. Under $9,999]                                  647
-    ##  2 $10,000-14,999   Under $20k     2 [2. $10,000-14,999]                                244
-    ##  3 $15,000-19,999   Under $20k     3 [3. $15,000-19,999]                                185
-    ##  4 $20,000-24,999   $20-40k        4 [4. $20,000-24,999]                                301
-    ##  5 $25,000-29,999   $20-40k        5 [5. $25,000-29,999]                                228
-    ##  6 $30,000-34,999   $20-40k        6 [6. $30,000-34,999]                                296
-    ##  7 $35,000-39,999   $20-40k        7 [7. $35,000-39,999]                                226
-    ##  8 $40,000-44,999   $40-60k        8 [8. $40,000-44,999]                                286
-    ##  9 $45,000-49,999   $40-60k        9 [9. $45,000-49,999]                                213
-    ## 10 $50,000-59,999   $40-60k       10 [10. $50,000-59,999]                               485
-    ## 11 $60,000-64,999   $60-80k       11 [11. $60,000-64,999]                               294
-    ## 12 $65,000-69,999   $60-80k       12 [12. $65,000-69,999]                               168
-    ## 13 $70,000-74,999   $60-80k       13 [13. $70,000-74,999]                               243
-    ## 14 $75,000-79,999   $60-80k       14 [14. $75,000-79,999]                               215
-    ## 15 $80,000-89,999   $80-100k      15 [15. $80,000-89,999]                               383
-    ## 16 $90,000-99,999   $80-100k      16 [16. $90,000-99,999]                               291
-    ## 17 $100,000-109,999 $100-125k     17 [17. $100,000-109,999]                             451
-    ## 18 $110,000-124,999 $100-125k     18 [18. $110,000-124,999]                             312
-    ## 19 $125,000-149,999 $125k or more 19 [19. $125,000-149,999]                             323
-    ## 20 $150,000-174,999 $125k or more 20 [20. $150,000-174,999]                             366
-    ## 21 $175,000-249,999 $125k or more 21 [21. $175,000-249,999]                             374
-    ## 22 $250,000 or more $125k or more 22 [22. $250,000 or more]                             405
-    ## 23 <NA>             <NA>          -9 [-9. Refused]                                      502
-    ## 24 <NA>             <NA>          -5 [-5. Interview breakoff (sufficient partial IW)]    15
-
-``` r
-anes_2020 %>% count(CampaignInterest, V201006)
-```
-
-    ## count: now 4 rows and 3 columns, ungrouped
-
-    ## # A tibble: 4 × 3
-    ##   CampaignInterest     V201006                          n
-    ##   <fct>                <dbl+lbl>                    <int>
-    ## 1 Very much interested  1 [1. Very much interested]  3940
-    ## 2 Somewhat interested   2 [2. Somewhat interested]   2569
-    ## 3 Not much interested   3 [3. Not much interested]    943
-    ## 4 <NA>                 -9 [-9. Refused]                 1
-
-``` r
-anes_2020 %>% count(TrustGovernment, V201233)
-```
-
-    ## count: now 7 rows and 3 columns, ungrouped
-
-    ## # A tibble: 7 × 3
-    ##   TrustGovernment     V201233                         n
-    ##   <fct>               <dbl+lbl>                   <int>
-    ## 1 Always               1 [1. Always]                 80
-    ## 2 Most of the time     2 [2. Most of the time]     1016
-    ## 3 About half the time  3 [3. About half the time]  2313
-    ## 4 Some of the time     4 [4. Some of the time]     3313
-    ## 5 Never                5 [5. Never]                 702
-    ## 6 <NA>                -9 [-9. Refused]               26
-    ## 7 <NA>                -8 [-8. Don't know]             3
-
-``` r
-anes_2020 %>% count(TrustPeople, V201237)
-```
-
-    ## count: now 7 rows and 3 columns, ungrouped
-
-    ## # A tibble: 7 × 3
-    ##   TrustPeople         V201237                         n
-    ##   <fct>               <dbl+lbl>                   <int>
-    ## 1 Always               1 [1. Always]                 48
-    ## 2 Most of the time     2 [2. Most of the time]     3511
-    ## 3 About half the time  3 [3. About half the time]  2020
-    ## 4 Some of the time     4 [4. Some of the time]     1597
-    ## 5 Never                5 [5. Never]                 264
-    ## 6 <NA>                -9 [-9. Refused]               12
-    ## 7 <NA>                -8 [-8. Don't know]             1
-
-``` r
-anes_2020 %>% count(VotedPres2016, V201101, V201102)
-```
-
-    ## count: now 8 rows and 4 columns, ungrouped
-
-    ## # A tibble: 8 × 4
-    ##   VotedPres2016 V201101                 V201102                     n
-    ##   <fct>         <dbl+lbl>               <dbl+lbl>               <int>
-    ## 1 Yes           -1 [-1. Inapplicable]    1 [1. Yes, voted]       3030
-    ## 2 Yes            1 [1. Yes, voted]      -1 [-1. Inapplicable]    2780
-    ## 3 No            -1 [-1. Inapplicable]    2 [2. No, didn't vote]   743
-    ## 4 No             2 [2. No, didn't vote] -1 [-1. Inapplicable]     879
-    ## 5 <NA>          -9 [-9. Refused]        -1 [-1. Inapplicable]      13
-    ## 6 <NA>          -8 [-8. Don't know]     -1 [-1. Inapplicable]       1
-    ## 7 <NA>          -1 [-1. Inapplicable]   -9 [-9. Refused]            6
-    ## 8 <NA>          -1 [-1. Inapplicable]   -8 [-8. Don't know]         1
-
-``` r
-anes_2020 %>% count(VotedPres2016_selection, V201103)
-```
-
-    ## count: now 6 rows and 3 columns, ungrouped
-
-    ## # A tibble: 6 × 3
-    ##   VotedPres2016_selection V201103                     n
-    ##   <fct>                   <dbl+lbl>               <int>
-    ## 1 Clinton                  1 [1. Hillary Clinton]  2911
-    ## 2 Trump                    2 [2. Donald Trump]     2466
-    ## 3 Other                    5 [5. Other {SPECIFY}]   390
-    ## 4 <NA>                    -9 [-9. Refused]           41
-    ## 5 <NA>                    -8 [-8. Don't know]         2
-    ## 6 <NA>                    -1 [-1. Inapplicable]    1643
-
-``` r
-anes_2020 %>% count(VotedPres2020, V202072)
-```
-
-    ## count: now 5 rows and 3 columns, ungrouped
-
-    ## # A tibble: 5 × 3
-    ##   VotedPres2020 V202072                                   n
-    ##   <fct>         <dbl+lbl>                             <int>
-    ## 1 Yes            1 [1. Yes, voted for President]       5952
-    ## 2 No             2 [2. No, didn't vote for President]    77
-    ## 3 <NA>          -9 [-9. Refused]                          2
-    ## 4 <NA>          -6 [-6. No post-election interview]       4
-    ## 5 <NA>          -1 [-1. Inapplicable]                  1418
-
-``` r
-anes_2020 %>% count(VotedPres2020_selection, V202110x)
-```
-
-    ## count: now 8 rows and 3 columns, ungrouped
-
-    ## # A tibble: 8 × 3
-    ##   VotedPres2020_selection V202110x                              n
-    ##   <fct>                   <dbl+lbl>                         <int>
-    ## 1 Biden                    1 [1. Joe Biden]                  3509
-    ## 2 Trump                    2 [2. Donald Trump]               2567
-    ## 3 Other                    3 [3. Jo Jorgensen]                 74
-    ## 4 Other                    4 [4. Howie Hawkins]                24
-    ## 5 Other                    5 [5. Other candidate {SPECIFY}]    60
-    ## 6 <NA>                    -9 [-9. Refused]                     81
-    ## 7 <NA>                    -8 [-8. Don't know]                   2
-    ## 8 <NA>                    -1 [-1. Inapplicable]              1136
-
-``` r
-anes_2020 %>% count(EarlyVote2020, V201025x, VotedPres2020)
-```
-
-    ## count: now 12 rows and 4 columns, ungrouped
-
-    ## # A tibble: 12 × 4
-    ##    EarlyVote2020 V201025x                                                                         VotedPres2020     n
-    ##    <fct>         <dbl+lbl>                                                                        <fct>         <int>
-    ##  1 Yes            4 [4. Registered and voted early]                                               Yes               2
-    ##  2 Yes            4 [4. Registered and voted early]                                               <NA>            369
-    ##  3 No             1 [1. Not registered (or DK/RF), does not intend to register (or DK/RF intent)] Yes              32
-    ##  4 No             2 [2. Not registered (or DK/RF), intends to register]                           Yes             105
-    ##  5 No             3 [3. Registered but did not vote early (or DK/RF)]                             Yes            5812
-    ##  6 <NA>          -4 [-4. Technical error]                                                         Yes               1
-    ##  7 <NA>           1 [1. Not registered (or DK/RF), does not intend to register (or DK/RF intent)] No                2
-    ##  8 <NA>           1 [1. Not registered (or DK/RF), does not intend to register (or DK/RF intent)] <NA>            305
-    ##  9 <NA>           2 [2. Not registered (or DK/RF), intends to register]                           No                1
-    ## 10 <NA>           2 [2. Not registered (or DK/RF), intends to register]                           <NA>            184
-    ## 11 <NA>           3 [3. Registered but did not vote early (or DK/RF)]                             No               74
-    ## 12 <NA>           3 [3. Registered but did not vote early (or DK/RF)]                             <NA>            566
-
-``` r
-anes_2020 %>%
-   summarise(WtSum = sum(Weight, na.rm = TRUE)) %>%
-   pull(WtSum)
-```
-
-    ## summarise: now one row and one column, ungrouped
-
-    ## [1] 7453
-
-## Label and order data
-
-``` r
-#label: label-ord
-
-cb_in <- readxl::read_xlsx(here::here("DataCleaningScripts", "ANES Codebook Metadata.xlsx"))
-
-cb_ord <- cb_in %>%
-  mutate(
-    Type=1,
-    SectNum=case_match(
-      Section,
-      "ADMIN"~1,
-      "WEIGHTS"~2,
-      "PRE-ELECTION SURVEY QUESTIONNAIRE"~3,
-      "POST-ELECTION SURVEY QUESTIONNAIRE"~4
-    )) %>%
-  arrange(SectNum, Variable) %>%
-  mutate(
-    Order=row_number()
-  )
-```
-
-    ## mutate: new variable 'Type' (double) with one unique value and 0% NA
-
-    ##         new variable 'SectNum' (double) with 4 unique values and 0% NA
-
-    ## mutate: new variable 'Order' (integer) with 42 unique values and 0% NA
-
-``` r
-cb_slim <- cb_ord %>%
-  select(Variable=BookDerived, `Description and Labels`, Question, Section, SectNum, Order) %>%
-  filter(!is.na(Variable)) %>%
-  separate_longer_delim(Variable, delim="; ") %>%
-  add_case(Variable="VotedPres2016", `Description and Labels`="PRE: Did R vote for President in 2016", Question="Derived from V201102, V201101", Section="PRE-ELECTION SURVEY QUESTIONNAIRE", SectNum=3, Order=11) %>%
-    add_case(Variable="EarlyVote2020", `Description and Labels`="PRE-POST: Voted early for president", Question="Derived from V201025x, VotedPres2020", Section="POST-ELECTION SURVEY QUESTIONNAIRE", SectNum=4, Order=44) %>%
-  mutate(Type=2) %>%
-  bind_rows(select(cb_ord, -BookDerived)) %>%
-  arrange(SectNum, Order, Type)
-```
-
-    ## select: dropped 2 variables (BookDerived, Type)
-
-    ## filter: removed 25 rows (60%), 17 rows remaining
-
-    ## mutate: new variable 'Type' (double) with one unique value and 0% NA
-
-    ## select: dropped one variable (BookDerived)
-
-``` r
-names(anes_2020)[!(names(anes_2020) %in% pull(cb_slim, Variable))]
-```
-
-    ## character(0)
-
-``` r
-cb_vars <- cb_slim %>%
-  filter(Variable %in% names(anes_2020))
-```
-
-    ## filter: no rows removed
-
-``` r
-anes_ord <- anes_2020 %>%
-  select(all_of(pull(cb_vars, Variable)))
-```
-
-    ## select: columns reordered (V200001, CaseID, V200002, InterviewMode, V200010b, …)
-
-``` r
-options("tidylog.display" = list())  
-
-for (var in pull(cb_vars, Variable)) {
-  vi <- cb_vars %>% filter(Variable==var)
-  attr(anes_ord[[deparse(as.name(var))]], "format.spss") <- NULL
-  attr(anes_ord[[deparse(as.name(var))]], "display_width") <- NULL
-  attr(anes_ord[[deparse(as.name(var))]], "label") <- pull(vi, `Description and Labels`)
-  attr(anes_ord[[deparse(as.name(var))]], "Section") <- pull(vi, Section) %>% as.character()
-  if (!is.na(pull(vi, Question))) attr(anes_ord[[deparse(as.name(var))]], "Question") <- pull(vi, Question)
-}
-
-options("tidylog.display" = NULL) 
-```
-
-## Save data
-
-``` r
-summary(anes_ord)
-```
-
-    ##     V200001           CaseID          V200002        InterviewMode     V200010b            Weight            V200010c     VarUnit     V200010d        Stratum        V201006      
-    ##  Min.   :200015   Min.   :200015   Min.   :1.000   Video    : 274   Min.   :0.008262   Min.   :0.008262   Min.   :1.000   1:3689   Min.   : 1.00   12     : 179   Min.   :-9.000  
-    ##  1st Qu.:225427   1st Qu.:225427   1st Qu.:3.000   Telephone: 115   1st Qu.:0.386263   1st Qu.:0.386263   1st Qu.:1.000   2:3750   1st Qu.:12.00   6      : 172   1st Qu.: 1.000  
-    ##  Median :335416   Median :335416   Median :3.000   Web      :7064   Median :0.686301   Median :0.686301   Median :2.000   3:  14   Median :24.00   27     : 172   Median : 1.000  
-    ##  Mean   :336416   Mean   :336416   Mean   :2.911                    Mean   :1.000000   Mean   :1.000000   Mean   :1.507            Mean   :24.63   21     : 170   Mean   : 1.596  
-    ##  3rd Qu.:427865   3rd Qu.:427865   3rd Qu.:3.000                    3rd Qu.:1.211032   3rd Qu.:1.211032   3rd Qu.:2.000            3rd Qu.:37.00   25     : 169   3rd Qu.: 2.000  
-    ##  Max.   :535469   Max.   :535469   Max.   :3.000                    Max.   :6.650665   Max.   :6.650665   Max.   :3.000            Max.   :50.00   1      : 167   Max.   : 3.000  
-    ##                                                                                                                                                    (Other):6424                   
-    ##              CampaignInterest    V201024           V201025x         V201029           V201101            V201102        VotedPres2016    V201103       VotedPres2016_selection    V201228     
-    ##  Very much interested:3940    Min.   :-9.0000   Min.   :-4.000   Min.   :-9.0000   Min.   :-9.00000   Min.   :-9.0000   Yes :5810     Min.   :-9.000   Clinton:2911            Min.   :-9.00  
-    ##  Somewhat interested :2569    1st Qu.:-1.0000   1st Qu.: 3.000   1st Qu.:-1.0000   1st Qu.:-1.00000   1st Qu.:-1.0000   No  :1622     1st Qu.: 1.000   Trump  :2466            1st Qu.: 1.00  
-    ##  Not much interested : 943    Median :-1.0000   Median : 3.000   Median :-1.0000   Median :-1.00000   Median : 1.0000   NA's:  21     Median : 1.000   Other  : 390            Median : 2.00  
-    ##  NA's                :   1    Mean   :-0.8595   Mean   : 2.919   Mean   :-0.8967   Mean   : 0.08493   Mean   : 0.1048                 Mean   : 1.042   NA's   :1686            Mean   : 1.99  
-    ##                               3rd Qu.:-1.0000   3rd Qu.: 3.000   3rd Qu.:-1.0000   3rd Qu.: 1.00000   3rd Qu.: 1.0000                 3rd Qu.: 2.000                           3rd Qu.: 3.00  
-    ##                               Max.   : 4.0000   Max.   : 4.000   Max.   :12.0000   Max.   : 2.00000   Max.   : 2.0000                 Max.   : 5.000                           Max.   : 5.00  
-    ##                                                                                                                                                                                               
-    ##     V201229           V201230            V201231x                          PartyID        V201233                  TrustGovernment    V201237                   TrustPeople      V201507x    
-    ##  Min.   :-9.0000   Min.   :-9.00000   Min.   :-9.000   Strong democrat         :1796   Min.   :-9.000   Always             :  80   Min.   :-9.00   Always             :  48   Min.   :-9.00  
-    ##  1st Qu.:-1.0000   1st Qu.:-1.00000   1st Qu.: 2.000   Strong republican       :1545   1st Qu.: 3.000   Most of the time   :1016   1st Qu.: 2.00   Most of the time   :3511   1st Qu.:35.00  
-    ##  Median : 1.0000   Median :-1.00000   Median : 4.000   Independent-democrat    : 881   Median : 4.000   About half the time:2313   Median : 3.00   About half the time:2020   Median :51.00  
-    ##  Mean   : 0.5154   Mean   : 0.01302   Mean   : 3.834   Independent             : 876   Mean   : 3.429   Some of the time   :3313   Mean   : 2.78   Some of the time   :1597   Mean   :49.43  
-    ##  3rd Qu.: 1.0000   3rd Qu.: 1.00000   3rd Qu.: 6.000   Not very strong democrat: 790   3rd Qu.: 4.000   Never              : 702   3rd Qu.: 3.00   Never              : 264   3rd Qu.:66.00  
-    ##  Max.   : 2.0000   Max.   : 3.00000   Max.   : 7.000   (Other)                 :1540   Max.   : 5.000   NA's               :  29   Max.   : 5.00   NA's               :  13   Max.   :80.00  
-    ##                                                        NA's                    :  25                                                                                                         
-    ##       Age               AgeGroup       V201510              Education       V201546          V201547a     V201547b     V201547c     V201547d     V201547e     V201547z     V201549x     
-    ##  Min.   :18.00   18-29      : 871   Min.   :-9.000   Less than HS: 312   Min.   :-9.000   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-9.000  
-    ##  1st Qu.:37.00   30-39      :1241   1st Qu.: 3.000   High school :1160   1st Qu.: 2.000   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.: 1.000  
-    ##  Median :53.00   40-49      :1081   Median : 5.000   Post HS     :2514   Median : 2.000   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median : 1.000  
-    ##  Mean   :51.83   50-59      :1200   Mean   : 5.621   Bachelor's  :1877   Mean   : 1.841   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   : 1.499  
-    ##  3rd Qu.:66.00   60-69      :1436   3rd Qu.: 6.000   Graduate    :1474   3rd Qu.: 2.000   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.: 2.000  
-    ##  Max.   :80.00   70 or older:1330   Max.   :95.000   NA's        : 116   Max.   : 2.000   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   : 6.000  
-    ##  NA's   :294     NA's       : 294                                                                                                                                                       
-    ##                 RaceEth        V201600          Gender        V201607      V201610      V201611      V201613      V201615      V201616      V201617x                  Income    
-    ##  White              :5420   Min.   :-9.000   Male  :3375   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-3   Min.   :-9.00   Under $9,999    : 647  
-    ##  Black              : 650   1st Qu.: 1.000   Female:4027   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.:-3   1st Qu.: 4.00   $50,000-59,999  : 485  
-    ##  Hispanic           : 662   Median : 2.000   NA's  :  51   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median :-3   Median :11.00   $100,000-109,999: 451  
-    ##  Asian, NH/PI       : 248   Mean   : 1.472                 Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :-3   Mean   :10.36   $250,000 or more: 405  
-    ##  AI/AN              : 155   3rd Qu.: 2.000                 3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:-3   3rd Qu.:17.00   $80,000-89,999  : 383  
-    ##  Other/multiple race: 237   Max.   : 2.000                 Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :-3   Max.   :22.00   (Other)         :4565  
-    ##  NA's               :  81                                                                                                                                NA's            : 517  
-    ##           Income7        V202051           V202066          V202072        VotedPres2020    V202073           V202109x          V202110x       VotedPres2020_selection EarlyVote2020
-    ##  $125k or more:1468   Min.   :-9.0000   Min.   :-9.000   Min.   :-9.0000   Yes :5952     Min.   :-9.0000   Min.   :-2.0000   Min.   :-9.0000   Biden:3509              Yes : 371    
-    ##  Under $20k   :1076   1st Qu.:-1.0000   1st Qu.: 4.000   1st Qu.: 1.0000   No  :  77     1st Qu.: 1.0000   1st Qu.: 1.0000   1st Qu.: 1.0000   Trump:2567              No  :5949    
-    ##  $20-40k      :1051   Median :-1.0000   Median : 4.000   Median : 1.0000   NA's:1424     Median : 1.0000   Median : 1.0000   Median : 1.0000   Other: 158              NA's:1133    
-    ##  $40-60k      : 984   Mean   :-0.7259   Mean   : 3.402   Mean   : 0.6234                 Mean   : 0.9415   Mean   : 0.8578   Mean   : 0.9902   NA's :1219                           
-    ##  $60-80k      : 920   3rd Qu.:-1.0000   3rd Qu.: 4.000   3rd Qu.: 1.0000                 3rd Qu.: 2.0000   3rd Qu.: 1.0000   3rd Qu.: 2.0000                                        
-    ##  (Other)      :1437   Max.   : 3.0000   Max.   : 4.000   Max.   : 2.0000                 Max.   :12.0000   Max.   : 1.0000   Max.   : 5.0000                                        
-    ##  NA's         : 517
-
-``` r
-anes_der_tmp_loc <- here("osf_dl", "anes_2020.rds")
-write_rds(anes_ord, anes_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=anes_der_tmp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name          id                       meta            
-    ##   <chr>         <chr>                    <list>          
-    ## 1 anes_2020.rds 647d2affa8dbe909c6cb5482 <named list [3]>
-
-``` r
-unlink(anes_der_tmp_loc)
-```
diff --git a/DataCleaningScripts/LAPOP_2021_DataPrep.Rmd b/DataCleaningScripts/LAPOP_2021_DataPrep.Rmd
index 609720b0..8457d6ab 100644
--- a/DataCleaningScripts/LAPOP_2021_DataPrep.Rmd
+++ b/DataCleaningScripts/LAPOP_2021_DataPrep.Rmd
@@ -13,13 +13,6 @@ knitr::opts_chunk$set(echo = TRUE)
 
 All data and resources were downloaded from http://datasets.americasbarometer.org/database/ on May 7, 2023.
 
-```{r}
-#| label: loadpackageh
-#| message: FALSE
-
-library(here) #easy relative paths
-```
-
 ```{r}
 #| label: loadpackages
 
@@ -39,7 +32,7 @@ stata_files <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
 
 read_stata_unlabeled <- function(osf_tbl_i){
   filedet <- osf_tbl_i %>%
-    osf_download(conflicts="overwrite", path=here("osf_dl"))
+    osf_download(conflicts="overwrite", path=here::here("osf_dl"))
   
   tibin <- filedet %>%
     pull(local_path) %>%
@@ -77,9 +70,9 @@ lapop <- lapop_in %>%
 
 summary(lapop)
 
-dir.create(here("osf_dl", "LAPOP_2021"))
+dir.create(here::here("osf_dl", "LAPOP_2021"))
 
-lapop_temp_loc <- here("osf_dl", "LAPOP_2021", "lapop_2021.rds")
+lapop_temp_loc <- here::here("osf_dl", "LAPOP_2021", "lapop_2021.rds")
 
 write_rds(lapop, lapop_temp_loc)
 
@@ -87,7 +80,7 @@ write_rds(lapop, lapop_temp_loc)
 
 target_dir <- osf_retrieve_node("https://osf.io/z5c3m/")
 
-osf_upload(target_dir, path=here("osf_dl", "LAPOP_2021"), conflicts="overwrite")
+osf_upload(target_dir, path=here::here("osf_dl", "LAPOP_2021"), conflicts="overwrite")
 
 unlink(lapop_temp_loc)
 ```
diff --git a/DataCleaningScripts/LAPOP_2021_DataPrep.md b/DataCleaningScripts/LAPOP_2021_DataPrep.md
index 5bc6db9f..41347bba 100644
--- a/DataCleaningScripts/LAPOP_2021_DataPrep.md
+++ b/DataCleaningScripts/LAPOP_2021_DataPrep.md
@@ -6,10 +6,6 @@ AmericasBarometer 2021
 All data and resources were downloaded from
 <http://datasets.americasbarometer.org/database/> on May 7, 2023.
 
-``` r
-library(here) #easy relative paths
-```
-
 ``` r
 library(tidyverse) #data manipulation
 library(haven) #data import
@@ -25,7 +21,7 @@ stata_files <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
 
 read_stata_unlabeled <- function(osf_tbl_i){
   filedet <- osf_tbl_i %>%
-    osf_download(conflicts="overwrite", path=here("osf_dl"))
+    osf_download(conflicts="overwrite", path=here::here("osf_dl"))
   
   tibin <- filedet %>%
     pull(local_path) %>%
@@ -61,47 +57,64 @@ lapop <- lapop_in %>%
 summary(lapop)
 ```
 
-    ##       pais           strata               upm              weight1500       core_a_core_b            q2              q1tb          covid2at    
-    ##  Min.   : 1.00   Min.   :1.000e+08   Min.   :1.001e+07   Min.   :0.004136   Length:64352       Min.   : 16.00   Min.   :1.000   Min.   :1.000  
-    ##  1st Qu.: 6.00   1st Qu.:6.000e+08   1st Qu.:6.153e+07   1st Qu.:0.251556   Class :character   1st Qu.: 27.00   1st Qu.:1.000   1st Qu.:1.000  
-    ##  Median :11.00   Median :1.100e+09   Median :1.202e+08   Median :0.417251   Mode  :character   Median : 36.00   Median :2.000   Median :2.000  
-    ##  Mean   :13.03   Mean   :1.303e+09   Mean   :1.666e+08   Mean   :0.512805                      Mean   : 38.86   Mean   :1.521   Mean   :2.076  
-    ##  3rd Qu.:17.00   3rd Qu.:1.700e+09   3rd Qu.:2.105e+08   3rd Qu.:0.674477                      3rd Qu.: 49.00   3rd Qu.:2.000   3rd Qu.:3.000  
-    ##  Max.   :41.00   Max.   :4.100e+09   Max.   :1.135e+09   Max.   :7.024495                      Max.   :121.00   Max.   :3.000   Max.   :4.000  
-    ##                                                                                                NA's   :90       NA's   :90      NA's   :6686   
-    ##        a4             idio2          idio2cov          it1             jc13             m1            mil10a          mil10e          ccch1      
-    ##  Min.   :  1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :1.00   
-    ##  1st Qu.:  3.00   1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:1.00    1st Qu.:2.00    1st Qu.:2.00    1st Qu.:2.00    1st Qu.:1.00   
-    ##  Median : 22.00   Median :3.000   Median :1.000   Median :2.000   Median :2.00    Median :3.00    Median :3.00    Median :2.00    Median :1.00   
-    ##  Mean   : 36.73   Mean   :2.439   Mean   :1.242   Mean   :2.275   Mean   :1.62    Mean   :2.98    Mean   :2.72    Mean   :2.39    Mean   :1.78   
-    ##  3rd Qu.: 71.00   3rd Qu.:3.000   3rd Qu.:1.000   3rd Qu.:3.000   3rd Qu.:2.00    3rd Qu.:4.00    3rd Qu.:3.00    3rd Qu.:3.00    3rd Qu.:2.00   
-    ##  Max.   :865.00   Max.   :3.000   Max.   :2.000   Max.   :4.000   Max.   :2.00    Max.   :5.00    Max.   :4.00    Max.   :4.00    Max.   :4.00   
-    ##  NA's   :4965     NA's   :2766    NA's   :31580   NA's   :3631    NA's   :50827   NA's   :33238   NA's   :49939   NA's   :44021   NA's   :50535  
-    ##      ccch3           ccus1           ccus3            edr            ocup4a           q14             q11n            q12c            q12bn       
-    ##  Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :0.000   Min.   :1.000   Min.   :1.0     Min.   :1.000   Min.   : 1.000   Min.   : 0.000  
-    ##  1st Qu.:1.00    1st Qu.:1.00    1st Qu.:1.00    1st Qu.:2.000   1st Qu.:1.000   1st Qu.:1.0     1st Qu.:1.000   1st Qu.: 3.000   1st Qu.: 0.000  
-    ##  Median :2.00    Median :1.00    Median :2.00    Median :2.000   Median :1.000   Median :2.0     Median :2.000   Median : 4.000   Median : 1.000  
-    ##  Mean   :1.82    Mean   :1.58    Mean   :1.76    Mean   :2.192   Mean   :2.627   Mean   :1.6     Mean   :2.214   Mean   : 4.036   Mean   : 1.001  
-    ##  3rd Qu.:2.00    3rd Qu.:2.00    3rd Qu.:2.00    3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:2.0     3rd Qu.:3.000   3rd Qu.: 5.000   3rd Qu.: 2.000  
-    ##  Max.   :3.00    Max.   :4.00    Max.   :3.00    Max.   :3.000   Max.   :7.000   Max.   :2.0     Max.   :7.000   Max.   :20.000   Max.   :16.000  
-    ##  NA's   :51961   NA's   :50028   NA's   :51226   NA's   :4114    NA's   :29505   NA's   :44130   NA's   :31198   NA's   :29144    NA's   :29449   
-    ##   covidedu1_1     covidedu1_2     covidedu1_3     covidedu1_4     covidedu1_5         gi0n            r15             r18n            r18       
-    ##  Min.   :0.00    Min.   :0.00    Min.   :0.00    Min.   :0.00    Min.   :0.00    Min.   :1.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
-    ##  1st Qu.:0.00    1st Qu.:0.00    1st Qu.:0.00    1st Qu.:0.00    1st Qu.:0.00    1st Qu.:1.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000  
-    ##  Median :0.00    Median :0.00    Median :1.00    Median :0.00    Median :0.00    Median :1.000   Median :1.000   Median :1.000   Median :1.000  
-    ##  Mean   :0.17    Mean   :0.07    Mean   :0.62    Mean   :0.12    Mean   :0.08    Mean   :1.646   Mean   :0.513   Mean   :0.537   Mean   :0.815  
-    ##  3rd Qu.:0.00    3rd Qu.:0.00    3rd Qu.:1.00    3rd Qu.:0.00    3rd Qu.:0.00    3rd Qu.:2.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
-    ##  Max.   :1.00    Max.   :1.00    Max.   :1.00    Max.   :1.00    Max.   :1.00    Max.   :5.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
-    ##  NA's   :51297   NA's   :51297   NA's   :51297   NA's   :51297   NA's   :51297   NA's   :1240    NA's   :4118    NA's   :4386    NA's   :4249
+    ##       pais           strata               upm              weight1500       core_a_core_b     
+    ##  Min.   : 1.00   Min.   :1.000e+08   Min.   :1.001e+07   Min.   :0.004136   Length:64352      
+    ##  1st Qu.: 6.00   1st Qu.:6.000e+08   1st Qu.:6.153e+07   1st Qu.:0.251556   Class :character  
+    ##  Median :11.00   Median :1.100e+09   Median :1.202e+08   Median :0.417251   Mode  :character  
+    ##  Mean   :13.03   Mean   :1.303e+09   Mean   :1.666e+08   Mean   :0.512805                     
+    ##  3rd Qu.:17.00   3rd Qu.:1.700e+09   3rd Qu.:2.105e+08   3rd Qu.:0.674477                     
+    ##  Max.   :41.00   Max.   :4.100e+09   Max.   :1.135e+09   Max.   :7.024495                     
+    ##                                                                                               
+    ##        q2              q1tb          covid2at           a4             idio2          idio2cov    
+    ##  Min.   : 16.00   Min.   :1.000   Min.   :1.000   Min.   :  1.00   Min.   :1.000   Min.   :1.000  
+    ##  1st Qu.: 27.00   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:  3.00   1st Qu.:2.000   1st Qu.:1.000  
+    ##  Median : 36.00   Median :2.000   Median :2.000   Median : 22.00   Median :3.000   Median :1.000  
+    ##  Mean   : 38.86   Mean   :1.521   Mean   :2.076   Mean   : 36.73   Mean   :2.439   Mean   :1.242  
+    ##  3rd Qu.: 49.00   3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.: 71.00   3rd Qu.:3.000   3rd Qu.:1.000  
+    ##  Max.   :121.00   Max.   :3.000   Max.   :4.000   Max.   :865.00   Max.   :3.000   Max.   :2.000  
+    ##  NA's   :90       NA's   :90      NA's   :6686    NA's   :4965     NA's   :2766    NA's   :31580  
+    ##       it1             jc13             m1            mil10a          mil10e          ccch1      
+    ##  Min.   :1.000   Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :1.00   
+    ##  1st Qu.:2.000   1st Qu.:1.00    1st Qu.:2.00    1st Qu.:2.00    1st Qu.:2.00    1st Qu.:1.00   
+    ##  Median :2.000   Median :2.00    Median :3.00    Median :3.00    Median :2.00    Median :1.00   
+    ##  Mean   :2.275   Mean   :1.62    Mean   :2.98    Mean   :2.72    Mean   :2.39    Mean   :1.78   
+    ##  3rd Qu.:3.000   3rd Qu.:2.00    3rd Qu.:4.00    3rd Qu.:3.00    3rd Qu.:3.00    3rd Qu.:2.00   
+    ##  Max.   :4.000   Max.   :2.00    Max.   :5.00    Max.   :4.00    Max.   :4.00    Max.   :4.00   
+    ##  NA's   :3631    NA's   :50827   NA's   :33238   NA's   :49939   NA's   :44021   NA's   :50535  
+    ##      ccch3           ccus1           ccus3            edr            ocup4a           q14       
+    ##  Min.   :1.00    Min.   :1.00    Min.   :1.00    Min.   :0.000   Min.   :1.000   Min.   :1.0    
+    ##  1st Qu.:1.00    1st Qu.:1.00    1st Qu.:1.00    1st Qu.:2.000   1st Qu.:1.000   1st Qu.:1.0    
+    ##  Median :2.00    Median :1.00    Median :2.00    Median :2.000   Median :1.000   Median :2.0    
+    ##  Mean   :1.82    Mean   :1.58    Mean   :1.76    Mean   :2.192   Mean   :2.627   Mean   :1.6    
+    ##  3rd Qu.:2.00    3rd Qu.:2.00    3rd Qu.:2.00    3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:2.0    
+    ##  Max.   :3.00    Max.   :4.00    Max.   :3.00    Max.   :3.000   Max.   :7.000   Max.   :2.0    
+    ##  NA's   :51961   NA's   :50028   NA's   :51226   NA's   :4114    NA's   :29505   NA's   :44130  
+    ##       q11n            q12c            q12bn         covidedu1_1     covidedu1_2     covidedu1_3   
+    ##  Min.   :1.000   Min.   : 1.000   Min.   : 0.000   Min.   :0.00    Min.   :0.00    Min.   :0.00   
+    ##  1st Qu.:1.000   1st Qu.: 3.000   1st Qu.: 0.000   1st Qu.:0.00    1st Qu.:0.00    1st Qu.:0.00   
+    ##  Median :2.000   Median : 4.000   Median : 1.000   Median :0.00    Median :0.00    Median :1.00   
+    ##  Mean   :2.214   Mean   : 4.036   Mean   : 1.001   Mean   :0.17    Mean   :0.07    Mean   :0.62   
+    ##  3rd Qu.:3.000   3rd Qu.: 5.000   3rd Qu.: 2.000   3rd Qu.:0.00    3rd Qu.:0.00    3rd Qu.:1.00   
+    ##  Max.   :7.000   Max.   :20.000   Max.   :16.000   Max.   :1.00    Max.   :1.00    Max.   :1.00   
+    ##  NA's   :31198   NA's   :29144    NA's   :29449    NA's   :51297   NA's   :51297   NA's   :51297  
+    ##   covidedu1_4     covidedu1_5         gi0n            r15             r18n            r18       
+    ##  Min.   :0.00    Min.   :0.00    Min.   :1.000   Min.   :0.000   Min.   :0.000   Min.   :0.000  
+    ##  1st Qu.:0.00    1st Qu.:0.00    1st Qu.:1.000   1st Qu.:0.000   1st Qu.:0.000   1st Qu.:1.000  
+    ##  Median :0.00    Median :0.00    Median :1.000   Median :1.000   Median :1.000   Median :1.000  
+    ##  Mean   :0.12    Mean   :0.08    Mean   :1.646   Mean   :0.513   Mean   :0.537   Mean   :0.815  
+    ##  3rd Qu.:0.00    3rd Qu.:0.00    3rd Qu.:2.000   3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:1.000  
+    ##  Max.   :1.00    Max.   :1.00    Max.   :5.000   Max.   :1.000   Max.   :1.000   Max.   :1.000  
+    ##  NA's   :51297   NA's   :51297   NA's   :1240    NA's   :4118    NA's   :4386    NA's   :4249
 
 ``` r
-dir.create(here("osf_dl", "LAPOP_2021"))
+dir.create(here::here("osf_dl", "LAPOP_2021"))
 ```
 
-    ## Warning in dir.create(here("osf_dl", "LAPOP_2021")): 'C:\Users\steph\Documents\GitHub\tidy-survey-book\osf_dl\LAPOP_2021' already exists
+    ## Warning in dir.create(here::here("osf_dl", "LAPOP_2021")):
+    ## 'C:\Users\steph\Documents\GitHub\tidy-survey-book\osf_dl\LAPOP_2021' already exists
 
 ``` r
-lapop_temp_loc <- here("osf_dl", "LAPOP_2021", "lapop_2021.rds")
+lapop_temp_loc <- here::here("osf_dl", "LAPOP_2021", "lapop_2021.rds")
 
 write_rds(lapop, lapop_temp_loc)
 
@@ -109,7 +122,7 @@ write_rds(lapop, lapop_temp_loc)
 
 target_dir <- osf_retrieve_node("https://osf.io/z5c3m/")
 
-osf_upload(target_dir, path=here("osf_dl", "LAPOP_2021"), conflicts="overwrite")
+osf_upload(target_dir, path=here::here("osf_dl", "LAPOP_2021"), conflicts="overwrite")
 ```
 
     ## Searching for conflicting files on OSF
diff --git a/DataCleaningScripts/NCVS_2021_DataPrep.Rmd b/DataCleaningScripts/NCVS_2021_DataPrep.Rmd
deleted file mode 100644
index c1cbe78d..00000000
--- a/DataCleaningScripts/NCVS_2021_DataPrep.Rmd
+++ /dev/null
@@ -1,158 +0,0 @@
----
-title: "National Crime Victimization Survey (NCVS) 2021 Data Prep"
-output: 
-  github_document:
-    html_preview: false
-bibliography: ../book.bib
----
-
-```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
-```
-
-## Data information
-
-Complete data is not stored on this repository but can be obtained on [ICPSR](https://www.icpsr.umich.edu/web/ICPSR/studies/38429) by downloading the R version of data files (@ncvs_data_2021). The files used here are from Version 1 and were downloaded on March 11, 2023.
-
-This script selects a subset of columns of several files and only retains those on this repository.
-
-```{r}
-#| label: loadpackageh
-#| message: FALSE
-library(here) #easy relative paths
-```
-
-```{r}
-#| label: loadpackages
-library(tidyverse) #data manipulation
-library(tidylog) #informative logging messages
-library(osfr)
-```
-
-## Incident data file
-
-```{r}
-#| label: incfile
-
-inc_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0004") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-incfiles <- load(pull(inc_file_osf_det, local_path), verbose=TRUE)
-
-inc_in <- get(incfiles) %>%
-  as_tibble()
-
-unlink(pull(inc_file_osf_det, local_path))
-
-make_num_fact <- function(x){
-  xchar <- sub("^\\(0*([0-9]+)\\).+$", "\\1", x)
-  xnum <- as.numeric(xchar)
-  fct_reorder(xchar, xnum, .na_rm = TRUE)
-}
-
-inc_slim <- inc_in %>%
-  select(
-    YEARQ, IDHH, IDPER, V4012, WGTVICCY, # identifiers and weight
-    num_range("V", 4016:4019), # series crime information
-    V4021B, V4022, V4024, # time of incident, location of incident (macro and micro)
-    num_range("V", 4049:4058), #weapon type
-    V4234, V4235, num_range("V", 4241:4245), V4248, num_range("V", 4256:4278), starts_with("V4277"), # victim-offender relationship
-    V4399, # report to police
-    V4529 # type of crime
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    IDPER=as.character(IDPER),
-    across(where(is.factor), make_num_fact)
-  )
-
-summary(inc_slim)
-
-inc_temp_loc <- here("osf_dl", "ncvs_2021_incident.rds")
-write_rds(inc_slim, inc_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=inc_temp_loc, conflicts="overwrite")
-unlink(inc_temp_loc)
-```
-
-
-## Person data file
-
-```{r}
-#| label: persfile
-
-pers_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0003") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-persfiles <- load(pull(pers_file_osf_det, local_path), verbose=TRUE)
-
-pers_in <- get(persfiles) %>%
-  as_tibble()
-
-unlink(pull(pers_file_osf_det, local_path))
-
-pers_slim <- pers_in %>%
-  select(
-    YEARQ, IDHH, IDPER, WGTPERCY, # identifiers and weight
-    V3014, V3015, V3018, V3023A, V3024, V3084, V3086 
-    # age, marital status, sex, race, hispanic origin, gender, sexual orientation
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    IDPER=as.character(IDPER),
-    across(where(is.factor), make_num_fact)
-  )
-
-summary(pers_slim)
-
-pers_temp_loc <- here("osf_dl", "ncvs_2021_person.rds")
-write_rds(pers_slim, pers_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=pers_temp_loc, conflicts="overwrite")
-unlink(pers_temp_loc)
-
-```
-
-## Household data file
-
-
-```{r}
-#| label: hhfile
-
-hh_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0002") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-hhfiles <- load(pull(hh_file_osf_det, local_path), verbose=TRUE)
-
-hh_in <- get(hhfiles) %>%
-  as_tibble()
-
-unlink(pull(hh_file_osf_det, local_path))
-
-hh_slim <- hh_in %>%
-  select(
-    YEARQ, IDHH, WGTHHCY, V2117, V2118, # identifiers, weight, design
-    V2015, V2143, SC214A, V2122, V2126B, V2127B, V2129
-    # tenure, urbanicity, income, family structure, place size, region, msa status
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    across(where(is.factor), make_num_fact)
-  )
-
-summary(hh_slim)
-
-hh_temp_loc <- here("osf_dl", "ncvs_2021_household.rds")
-write_rds(hh_slim, hh_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=hh_temp_loc, conflicts="overwrite")
-unlink(hh_temp_loc)
-```
-
-## Resources
-
-- [USER’S GUIDE TO NATIONAL CRIME VICTIMIZATION SURVEY (NCVS) DIRECT VARIANCE ESTIMATION](https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf)
--[Appendix C: Examples in SAS](https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/variance_guide_appendix_c_sas.pdf)
\ No newline at end of file
diff --git a/DataCleaningScripts/NCVS_2021_DataPrep.md b/DataCleaningScripts/NCVS_2021_DataPrep.md
deleted file mode 100644
index 42c839d1..00000000
--- a/DataCleaningScripts/NCVS_2021_DataPrep.md
+++ /dev/null
@@ -1,456 +0,0 @@
-National Crime Victimization Survey (NCVS) 2021 Data Prep
-================
-
-## Data information
-
-Complete data is not stored on this repository but can be obtained on
-[ICPSR](https://www.icpsr.umich.edu/web/ICPSR/studies/38429) by
-downloading the R version of data files (United States. Bureau of
-Justice Statistics (2022)). The files used here are from Version 1 and
-were downloaded on March 11, 2023.
-
-This script selects a subset of columns of several files and only
-retains those on this repository.
-
-``` r
-library(here) #easy relative paths
-```
-
-``` r
-library(tidyverse) #data manipulation
-library(tidylog) #informative logging messages
-library(osfr)
-```
-
-## Incident data file
-
-``` r
-inc_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0004") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-incfiles <- load(pull(inc_file_osf_det, local_path), verbose=TRUE)
-```
-
-    ## Loading objects:
-    ##   da38429.0004
-
-``` r
-inc_in <- get(incfiles) %>%
-  as_tibble()
-
-unlink(pull(inc_file_osf_det, local_path))
-
-make_num_fact <- function(x){
-  xchar <- sub("^\\(0*([0-9]+)\\).+$", "\\1", x)
-  xnum <- as.numeric(xchar)
-  fct_reorder(xchar, xnum, .na_rm = TRUE)
-}
-
-inc_slim <- inc_in %>%
-  select(
-    YEARQ, IDHH, IDPER, V4012, WGTVICCY, # identifiers and weight
-    num_range("V", 4016:4019), # series crime information
-    V4021B, V4022, V4024, # time of incident, location of incident (macro and micro)
-    num_range("V", 4049:4058), #weapon type
-    V4234, V4235, num_range("V", 4241:4245), V4248, num_range("V", 4256:4278), starts_with("V4277"), # victim-offender relationship
-    V4399, # report to police
-    V4529 # type of crime
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    IDPER=as.character(IDPER),
-    across(where(is.factor), make_num_fact)
-  )
-```
-
-    ## select: dropped 1,201 variables (V4001, V4002, V4003, V4004, V4005, …)
-
-    ## mutate: converted 'IDHH' from factor to character (0 new NA)
-
-    ##         converted 'IDPER' from factor to character (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4017' (0 new NA)
-
-    ##         changed 157 values (2%) of 'V4018' (0 new NA)
-
-    ##         changed 153 values (2%) of 'V4019' (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4021B' (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4022' (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4024' (0 new NA)
-
-    ##         changed 2,737 values (30%) of 'V4049' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4050' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4051' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4052' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4053' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4054' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4055' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4056' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4057' (0 new NA)
-
-    ##         changed 409 values (5%) of 'V4058' (0 new NA)
-
-    ##         changed 2,827 values (31%) of 'V4234' (0 new NA)
-
-    ##         changed 398 values (4%) of 'V4235' (0 new NA)
-
-    ##         changed 2,096 values (23%) of 'V4241' (0 new NA)
-
-    ##         changed 920 values (10%) of 'V4242' (0 new NA)
-
-    ##         changed 1,246 values (14%) of 'V4243' (0 new NA)
-
-    ##         changed 831 values (9%) of 'V4244' (0 new NA)
-
-    ##         changed 1,075 values (12%) of 'V4245' (0 new NA)
-
-    ##         changed 353 values (4%) of 'V4256' (0 new NA)
-
-    ##         changed 231 values (3%) of 'V4257' (0 new NA)
-
-    ##         changed 139 values (2%) of 'V4258' (0 new NA)
-
-    ##         changed 139 values (2%) of 'V4259' (0 new NA)
-
-    ##         changed 139 values (2%) of 'V4260' (0 new NA)
-
-    ##         changed 139 values (2%) of 'V4261' (0 new NA)
-
-    ##         changed 139 values (2%) of 'V4262' (0 new NA)
-
-    ##         changed 181 values (2%) of 'V4263' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4264' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4265' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4266' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4267' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4268' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4269' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4270' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4271' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4272' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4273' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4274' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4275' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4276' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4278' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277A' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277B' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277C' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277D' (0 new NA)
-
-    ##         changed 104 values (1%) of 'V4277E' (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4399' (0 new NA)
-
-    ##         changed 8,982 values (100%) of 'V4529' (0 new NA)
-
-``` r
-summary(inc_slim)
-```
-
-    ##      YEARQ          IDHH              IDPER               V4012          WGTVICCY           V4016        
-    ##  Min.   :2021   Length:8982        Length:8982        Min.   :1.000   Min.   :  221.6   Min.   :  1.000  
-    ##  1st Qu.:2021   Class :character   Class :character   1st Qu.:1.000   1st Qu.:  867.3   1st Qu.:  1.000  
-    ##  Median :2021   Mode  :character   Mode  :character   Median :1.000   Median : 1352.3   Median :  1.000  
-    ##  Mean   :2021                                         Mean   :1.179   Mean   : 1674.9   Mean   :  4.324  
-    ##  3rd Qu.:2021                                         3rd Qu.:1.000   3rd Qu.: 2217.4   3rd Qu.:  1.000  
-    ##  Max.   :2021                                         Max.   :7.000   Max.   :10106.2   Max.   :998.000  
-    ##                                                                                                          
-    ##  V4017     V4018       V4019          V4021B     V4022        V4024       V4049       V4050       V4051     
-    ##  1:8825   1   : 127   1   :  10   7      :1855   1:  34   5      :3210   1   : 409   1   : 380   0   : 278  
-    ##  2: 131   2   :   4   2   : 117   9      :1217   2:  65   1      :1481   2   :1803   3   :  26   1   : 131  
-    ##  8:  26   8   :  26   8   :  26   2      :1145   3:7697   7      : 727   3   : 525   7   :   3   NA's:8573  
-    ##           NA's:8825   NA's:8829   3      : 940   4:1143   21     : 453   NA's:6245   NA's:8573              
-    ##                                   8      : 856   5:  39   16     : 449                                      
-    ##                                   4      : 833   8:   4   6      : 429                                      
-    ##                                   (Other):2136            (Other):2233                                      
-    ##   V4052       V4053       V4054       V4055       V4056       V4057       V4058       V4234       V4235     
-    ##  0   : 390   0   : 334   0   : 394   0   : 302   0   : 360   0   : 406   0   : 380   1   :2076   1   :  20  
-    ##  1   :  19   1   :  75   1   :  15   1   : 107   1   :  49   1   :   3   8   :  29   2   : 353   2   : 291  
-    ##  NA's:8573   NA's:8573   NA's:8573   NA's:8573   NA's:8573   NA's:8573   NA's:8573   3   : 311   8   :  87  
-    ##                                                                                      8   :  87   NA's:8584  
-    ##                                                                                      NA's:6155              
-    ##                                                                                                             
-    ##                                                                                                             
-    ##   V4241       V4242       V4243       V4244          V4245          V4248         V4256       V4257     
-    ##  1   :1176   1   : 326   1   : 171   1   : 307   7      : 149   Min.   : 2.000   1   :  83   1   :  65  
-    ##  2   : 793   2   : 240   2   : 292   2   : 424   8      : 139   1st Qu.: 2.000   2   :  37   2   :  63  
-    ##  3   :  57   3   : 271   3   : 701   3   :   4   11     : 137   Median : 2.000   3   : 194   3   :  85  
-    ##  8   :  70   8   :  83   6   :   1   8   :  96   13     : 114   Mean   : 7.992   4   :  20   8   :  18  
-    ##  NA's:6886   NA's:8062   8   :  81   NA's:8151   98     :  85   3rd Qu.: 3.000   6   :   2   NA's:8751  
-    ##                          NA's:7736               (Other): 451   Max.   :98.000   8   :  17              
-    ##                                                  NA's   :7907   NA's   :8629     NA's:8629              
-    ##   V4258       V4259       V4260       V4261       V4262       V4263       V4264       V4265       V4266     
-    ##  1   : 122   0   :  85   0   :  76   0   :  77   0   : 122   1   :  65   1   :  87   0   :  87   0   :  83  
-    ##  8   :  17   1   :  37   1   :  46   1   :  45   8   :  17   2   :  98   8   :  17   8   :  17   1   :   4  
-    ##  NA's:8843   8   :  17   8   :  17   8   :  17   NA's:8843   8   :  18   NA's:8878   NA's:8878   8   :  17  
-    ##              NA's:8843   NA's:8843   NA's:8843               NA's:8801                           NA's:8878  
-    ##                                                                                                             
-    ##                                                                                                             
-    ##                                                                                                             
-    ##   V4267       V4268       V4269       V4270       V4271       V4272       V4273       V4274       V4275     
-    ##  0   :  82   0   :  84   0   :  84   0   :  83   0   :  84   0   :  66   0   :  81   0   :  84   0   :  64  
-    ##  1   :   5   1   :   3   1   :   3   1   :   4   1   :   3   1   :  21   1   :   6   1   :   3   1   :  23  
-    ##  8   :  17   8   :  17   8   :  17   8   :  17   8   :  17   8   :  17   8   :  17   8   :  17   8   :  17  
-    ##  NA's:8878   NA's:8878   NA's:8878   NA's:8878   NA's:8878   NA's:8878   NA's:8878   NA's:8878   NA's:8878  
-    ##                                                                                                             
-    ##                                                                                                             
-    ##                                                                                                             
-    ##   V4276       V4277       V4278       V4277A      V4277B      V4277C      V4277D      V4277E     V4399   
-    ##  0   :  85   0   :  62   0   :  85   0   :  87   0   :  87   0   :  87   0   :  84   0   :  87   1:3175  
-    ##  1   :   2   1   :  25   8   :  19   8   :  17   8   :  17   8   :  17   1   :   3   8   :  17   2:5692  
-    ##  8   :  17   8   :  17   NA's:8878   NA's:8878   NA's:8878   NA's:8878   8   :  17   NA's:8878   3: 103  
-    ##  NA's:8878   NA's:8878                                                   NA's:8878               8:  12  
-    ##                                                                                                          
-    ##                                                                                                          
-    ##                                                                                                          
-    ##      V4529     
-    ##  56     :1689  
-    ##  57     :1431  
-    ##  55     :1011  
-    ##  58     : 799  
-    ##  32     : 637  
-    ##  20     : 609  
-    ##  (Other):2806
-
-``` r
-inc_temp_loc <- here("osf_dl", "ncvs_2021_incident.rds")
-write_rds(inc_slim, inc_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=inc_temp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name                   id                       meta            
-    ##   <chr>                  <chr>                    <list>          
-    ## 1 ncvs_2021_incident.rds 647cfbcd85df4808fa7753f2 <named list [3]>
-
-``` r
-unlink(inc_temp_loc)
-```
-
-## Person data file
-
-``` r
-pers_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0003") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-persfiles <- load(pull(pers_file_osf_det, local_path), verbose=TRUE)
-```
-
-    ## Loading objects:
-    ##   da38429.0003
-
-``` r
-pers_in <- get(persfiles) %>%
-  as_tibble()
-
-unlink(pull(pers_file_osf_det, local_path))
-
-pers_slim <- pers_in %>%
-  select(
-    YEARQ, IDHH, IDPER, WGTPERCY, # identifiers and weight
-    V3014, V3015, V3018, V3023A, V3024, V3084, V3086 
-    # age, marital status, sex, race, hispanic origin, gender, sexual orientation
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    IDPER=as.character(IDPER),
-    across(where(is.factor), make_num_fact)
-  )
-```
-
-    ## select: dropped 418 variables (V3001, V3002, V3003, V3004, V3005, …)
-
-    ## mutate: converted 'IDHH' from factor to character (0 new NA)
-
-    ##         converted 'IDPER' from factor to character (0 new NA)
-
-    ##         changed 291,878 values (100%) of 'V3015' (0 new NA)
-
-    ##         changed 291,878 values (100%) of 'V3018' (0 new NA)
-
-    ##         changed 291,878 values (100%) of 'V3023A' (0 new NA)
-
-    ##         changed 291,878 values (100%) of 'V3024' (0 new NA)
-
-    ##         changed 216,287 values (74%) of 'V3084' (0 new NA)
-
-    ##         changed 216,287 values (74%) of 'V3086' (0 new NA)
-
-``` r
-summary(pers_slim)
-```
-
-    ##      YEARQ          IDHH              IDPER              WGTPERCY           V3014       V3015      V3018     
-    ##  Min.   :2021   Length:291878      Length:291878      Min.   :    0.0   Min.   :12.00   1:148131   1:140922  
-    ##  1st Qu.:2021   Class :character   Class :character   1st Qu.:  432.2   1st Qu.:31.00   2: 17668   2:150956  
-    ##  Median :2021   Mode  :character   Mode  :character   Median :  791.5   Median :48.00   3: 28596             
-    ##  Mean   :2021                                         Mean   :  956.5   Mean   :47.57   4:  4524             
-    ##  3rd Qu.:2021                                         3rd Qu.: 1397.4   3rd Qu.:64.00   5: 90425             
-    ##  Max.   :2021                                         Max.   :10691.5   Max.   :90.00   8:  2534             
-    ##                                                                                                              
-    ##      V3023A       V3024          V3084         V3086       
-    ##  1      :236785   1: 41450   8      :151725   1   : 29733  
-    ##  2      : 30972   2:249306   2      : 61108   2   : 34489  
-    ##  4      : 16337   8:  1122   6      :  1477   3   :    56  
-    ##  3      :  1776              1      :   924   4   :   115  
-    ##  6      :  1590              3      :   611   8   :151894  
-    ##  7      :  1465              (Other):   442   NA's: 75591  
-    ##  (Other):  2953              NA's   : 75591
-
-``` r
-pers_temp_loc <- here("osf_dl", "ncvs_2021_person.rds")
-write_rds(pers_slim, pers_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=pers_temp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name                 id                       meta            
-    ##   <chr>                <chr>                    <list>          
-    ## 1 ncvs_2021_person.rds 647cfe9ba8dbe909bacb51bf <named list [3]>
-
-``` r
-unlink(pers_temp_loc)
-```
-
-## Household data file
-
-``` r
-hh_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="NCVS_2021/DS0002") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-hhfiles <- load(pull(hh_file_osf_det, local_path), verbose=TRUE)
-```
-
-    ## Loading objects:
-    ##   da38429.0002
-
-``` r
-hh_in <- get(hhfiles) %>%
-  as_tibble()
-
-unlink(pull(hh_file_osf_det, local_path))
-
-hh_slim <- hh_in %>%
-  select(
-    YEARQ, IDHH, WGTHHCY, V2117, V2118, # identifiers, weight, design
-    V2015, V2143, SC214A, V2122, V2126B, V2127B, V2129
-    # tenure, urbanicity, income, family structure, place size, region, msa status
-  ) %>%
-  mutate(
-    IDHH=as.character(IDHH),
-    across(where(is.factor), make_num_fact)
-  )
-```
-
-    ## select: dropped 440 variables (V2001, V2002, V2003, V2004, V2005, …)
-
-    ## mutate: converted 'IDHH' from factor to character (0 new NA)
-
-    ##         changed 150,138 values (59%) of 'V2015' (0 new NA)
-
-    ##         changed 256,460 values (100%) of 'V2143' (0 new NA)
-
-    ##         changed 253,779 values (99%) of 'SC214A' (0 new NA)
-
-    ##         changed 256,460 values (100%) of 'V2122' (0 new NA)
-
-    ##         changed 256,460 values (100%) of 'V2126B' (0 new NA)
-
-    ##         changed 256,460 values (100%) of 'V2127B' (0 new NA)
-
-    ##         changed 256,460 values (100%) of 'V2129' (0 new NA)
-
-``` r
-summary(hh_slim)
-```
-
-    ##      YEARQ          IDHH              WGTHHCY           V2117            V2118        V2015        V2143     
-    ##  Min.   :2021   Length:256460      Min.   :   0.0   Min.   :  1.00   Min.   :1.000   1   :101944   1: 26878  
-    ##  1st Qu.:2021   Class :character   1st Qu.:   0.0   1st Qu.: 24.00   1st Qu.:1.000   2   : 46269   2:173491  
-    ##  Median :2021   Mode  :character   Median : 399.4   Median : 48.00   Median :2.000   3   :  1925   3: 56091  
-    ##  Mean   :2021                      Mean   : 504.2   Mean   : 58.85   Mean   :1.526   NA's:106322             
-    ##  3rd Qu.:2021                      3rd Qu.: 829.1   3rd Qu.: 88.00   3rd Qu.:2.000                           
-    ##  Max.   :2021                      Max.   :4515.8   Max.   :160.00   Max.   :3.000                           
-    ##                                                                                                              
-    ##      SC214A           V2122            V2126B      V2127B    V2129     
-    ##  13     : 44601   33     :106322   0      :69484   1:41585   1: 80895  
-    ##  16     : 34287   32     : 28306   16     :53002   2:74666   2:135438  
-    ##  15     : 33353   16     : 23617   13     :39873   3:87783   3: 40127  
-    ##  12     : 23282   8      : 21383   17     :27205   4:52426             
-    ##  18     : 16892   24     : 15629   18     :24461                       
-    ##  (Other):101364   4      : 10477   20     :15194                       
-    ##  NA's   :  2681   (Other): 50726   (Other):27241
-
-``` r
-hh_temp_loc <- here("osf_dl", "ncvs_2021_household.rds")
-write_rds(hh_slim, hh_temp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=hh_temp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name                    id                       meta            
-    ##   <chr>                   <chr>                    <list>          
-    ## 1 ncvs_2021_household.rds 647cfe323c3a380880a046d8 <named list [3]>
-
-``` r
-unlink(hh_temp_loc)
-```
-
-## Resources
-
-- [USER’S GUIDE TO NATIONAL CRIME VICTIMIZATION SURVEY (NCVS) DIRECT
-  VARIANCE
-  ESTIMATION](https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/ncvs_variance_user_guide_11.06.14.pdf)
-  - [Appendix C: Examples in
-  SAS](https://bjs.ojp.gov/sites/g/files/xyckuh236/files/media/document/variance_guide_appendix_c_sas.pdf)
-
-<div id="refs" class="references csl-bib-body hanging-indent">
-
-<div id="ref-ncvs_data_2021" class="csl-entry">
-
-United States. Bureau of Justice Statistics. 2022. “National Crime
-Victimization Survey, \[United States\], 2021.” Inter-university
-Consortium for Political and Social Research \[distributor\].
-<https://doi.org/10.3886/ICPSR38429.v1>.
-
-</div>
-
-</div>
diff --git a/DataCleaningScripts/RECS_2015_DataPrep.Rmd b/DataCleaningScripts/RECS_2015_DataPrep.Rmd
deleted file mode 100644
index 046fe65a..00000000
--- a/DataCleaningScripts/RECS_2015_DataPrep.Rmd
+++ /dev/null
@@ -1,162 +0,0 @@
----
-title: "Residential Energy Consumption Survey (RECS) 2015 Data Prep"
-output: 
-  github_document:
-      html_preview: false
----
-
-```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
-```
-
-## Data information
-
-All data and resources were downloaded from https://www.eia.gov/consumption/residential/data/2015/index.php?view=microdata on March 3, 2021.
-
-```{r loadpackageh, message=FALSE}
-library(here) #easy relative paths
-```
-
-```{r loadpackages}
-library(tidyverse) #data manipulation
-library(haven) #data import
-library(tidylog) #informative logging messages
-library(osfr)
-```
-## Import data and create derived variables
-
-```{r derivedata}
-recs_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="RECS_2015", pattern="csv") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-recs_in <- read_csv(pull(recs_file_osf_det, local_path))
-
-unlink(pull(recs_file_osf_det, local_path))
-
-recs <- recs_in %>%
-   select(DOEID, REGIONC, DIVISION, METROMICRO, UATYP10, TYPEHUQ, YEARMADERANGE, HEATHOME, EQUIPMUSE, TEMPHOME, TEMPGONE, TEMPNITE, AIRCOND, USECENAC, TEMPHOMEAC, TEMPGONEAC, TEMPNITEAC, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, TOTUCSQFT, TOTUSQFT, NWEIGHT, starts_with("BRRWT"), CDD30YR, CDD65, CDD80, CLIMATE_REGION_PUB, IECC_CLIMATE_PUB, HDD30YR, HDD65, HDD50, GNDHDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD=WOODBTU, BTUPELLET=PELLETBTU ) %>%
-   mutate(
-      Region=parse_factor(
-         case_when(
-            REGIONC==1~"Northeast",
-            REGIONC==2~"Midwest",
-            REGIONC==3~"South",
-            REGIONC==4~"West",
-      ), levels=c("Northeast", "Midwest", "South", "West")),
-      Division=parse_factor(
-         case_when(
-            DIVISION==1~"New England",
-            DIVISION==2~"Middle Atlantic",
-            DIVISION==3~"East North Central",
-            DIVISION==4~"West North Central",
-            DIVISION==5~"South Atlantic",
-            DIVISION==6~"East South Central",
-            DIVISION==7~"West South Central",
-            DIVISION==8~"Mountain North",
-            DIVISION==9~"Mountain South",
-            DIVISION==10~"Pacific",
-      ), levels=c("New England", "Middle Atlantic", "East North Central", "West North Central", "South Atlantic", "East South Central", "West South Central", "Mountain North", "Mountain South", "Pacific")),
-      MSAStatus=fct_recode(METROMICRO, "Metropolitan Statistical Area"="METRO", "Micropolitan Statistical Area"="MICRO", "None"="NONE"),
-      Urbanicity=parse_factor(
-         case_when(
-            UATYP10=="U"~"Urban Area",
-            UATYP10=="C"~"Urban Cluster",
-            UATYP10=="R"~"Rural"
-         ),
-         levels=c("Urban Area", "Urban Cluster", "Rural")
-      ),
-      HousingUnitType=parse_factor(
-         case_when(
-            TYPEHUQ==1~"Mobile home",
-            TYPEHUQ==2~"Single-family detached",
-            TYPEHUQ==3~"Single-family attached",
-            TYPEHUQ==4~"Apartment: 2-4 Units",
-            TYPEHUQ==5~"Apartment: 5 or more units",
-      ), levels=c("Mobile home", "Single-family detached", "Single-family attached", "Apartment: 2-4 Units", "Apartment: 5 or more units")),
-      YearMade=parse_factor(
-         case_when(
-            YEARMADERANGE==1~"Before 1950",
-            YEARMADERANGE==2~"1950-1959",
-            YEARMADERANGE==3~"1960-1969",
-            YEARMADERANGE==4~"1970-1979",
-            YEARMADERANGE==5~"1980-1989",
-            YEARMADERANGE==6~"1990-1999",
-            YEARMADERANGE==7~"2000-2009",
-            YEARMADERANGE==8~"2010-2015",
-         ),
-         levels=c("Before 1950", "1950-1959", "1960-1969", "1970-1979", "1980-1989", "1990-1999", "2000-2009", "2010-2015"),
-         ordered = TRUE
-      ),
-      SpaceHeatingUsed=as.logical(HEATHOME),
-      HeatingBehavior=parse_factor(
-         case_when(
-            EQUIPMUSE==1~"Set one temp and leave it",
-            EQUIPMUSE==2~"Manually adjust at night/no one home",
-            EQUIPMUSE==3~"Program thermostat to change at certain times",
-            EQUIPMUSE==4~"Turn on or off as needed",
-            EQUIPMUSE==5~"No control",
-            EQUIPMUSE==9~"Other",
-            EQUIPMUSE==-2~NA_character_), # -2 is the missing code present in the data
-         levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Program thermostat to change at certain times", "Turn on or off as needed", "No control", "Other")
-      ),
-      WinterTempDay=if_else(TEMPHOME>0, TEMPHOME, NA_real_),
-      WinterTempAway=if_else(TEMPGONE>0, TEMPGONE, NA_real_),
-      WinterTempNight=if_else(TEMPNITE>0, TEMPNITE, NA_real_),
-      ACUsed=as.logical(AIRCOND),
-      ACBehavior=parse_factor(
-         case_when(
-            USECENAC==1~"Set one temp and leave it",
-            USECENAC==2~"Manually adjust at night/no one home",
-            USECENAC==3~"Program thermostat to change at certain times",
-            USECENAC==4~"Turn on or off as needed",
-            USECENAC==5~"No control",
-            USECENAC==-2~NA_character_), # -2 is the missing code present in the data
-         levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Program thermostat to change at certain times", "Turn on or off as needed", "No control")
-      ),
-      SummerTempDay=if_else(TEMPHOMEAC>0, TEMPHOMEAC, NA_real_),
-      SummerTempAway=if_else(TEMPGONEAC>0, TEMPGONEAC, NA_real_),
-      SummerTempNight=if_else(TEMPNITEAC>0, TEMPNITEAC, NA_real_),
-      ClimateRegion_BA=parse_factor(CLIMATE_REGION_PUB),
-      ClimateRegion_IECC=factor(IECC_CLIMATE_PUB)
-      
-   )
-
-```
-
-
-## Check derived variables for correct coding
-
-```{r checkvars}
-recs %>% count(Region, REGIONC)
-recs %>% count(Division, DIVISION)
-recs %>% count(MSAStatus, METROMICRO)
-recs %>% count(Urbanicity, UATYP10)
-recs %>% count(HousingUnitType, TYPEHUQ)
-recs %>% count(YearMade, YEARMADERANGE)
-recs %>% count(SpaceHeatingUsed, HEATHOME)
-recs %>% count(HeatingBehavior, EQUIPMUSE)
-recs %>% count(ACUsed, AIRCOND)
-recs %>% count(ACBehavior, USECENAC)
-recs %>% count(ClimateRegion_BA, CLIMATE_REGION_PUB)
-recs %>% count(ClimateRegion_IECC, IECC_CLIMATE_PUB)
-
-```
-
-## Save data
-
-```{r savedat}
-recs_out <- recs %>%
-   select(DOEID, REGIONC, Region, Division, MSAStatus, Urbanicity, HousingUnitType, YearMade, SpaceHeatingUsed, HeatingBehavior, WinterTempDay, WinterTempAway, WinterTempNight, ACUsed, ACBehavior, SummerTempDay, SummerTempAway, SummerTempNight, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, TOTUCSQFT, TOTUSQFT, NWEIGHT, starts_with("BRRWT"), CDD30YR, CDD65, CDD80, ClimateRegion_BA, ClimateRegion_IECC, HDD30YR, HDD65, HDD50, GNDHDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD, BTUPELLET)
-
-summary(recs_out)
-
-
-recs_der_tmp_loc <- here("osf_dl", "recs_2015.rds")
-write_rds(recs_out, recs_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=recs_der_tmp_loc, conflicts="overwrite")
-unlink(recs_der_tmp_loc)
-
-```
-
diff --git a/DataCleaningScripts/RECS_2015_DataPrep.md b/DataCleaningScripts/RECS_2015_DataPrep.md
deleted file mode 100644
index 863a88b8..00000000
--- a/DataCleaningScripts/RECS_2015_DataPrep.md
+++ /dev/null
@@ -1,617 +0,0 @@
-Residential Energy Consumption Survey (RECS) 2015 Data Prep
-================
-
-## Data information
-
-All data and resources were downloaded from
-<https://www.eia.gov/consumption/residential/data/2015/index.php?view=microdata>
-on March 3, 2021.
-
-``` r
-library(here) #easy relative paths
-```
-
-``` r
-library(tidyverse) #data manipulation
-library(haven) #data import
-library(tidylog) #informative logging messages
-```
-
-    ## 
-    ## Attaching package: 'tidylog'
-
-    ## The following objects are masked from 'package:srvyr':
-    ## 
-    ##     anti_join, drop_na, filter, filter_all, filter_at, filter_if, group_by, group_by_all,
-    ##     group_by_at, group_by_if, mutate, mutate_all, mutate_at, mutate_if, rename, rename_all,
-    ##     rename_at, rename_if, rename_with, select, select_all, select_at, select_if, semi_join,
-    ##     summarise, summarise_all, summarise_at, summarise_if, summarize, summarize_all, summarize_at,
-    ##     summarize_if, transmute, ungroup
-
-    ## The following objects are masked from 'package:dplyr':
-    ## 
-    ##     add_count, add_tally, anti_join, count, distinct, distinct_all, distinct_at, distinct_if,
-    ##     filter, filter_all, filter_at, filter_if, full_join, group_by, group_by_all, group_by_at,
-    ##     group_by_if, inner_join, left_join, mutate, mutate_all, mutate_at, mutate_if, relocate,
-    ##     rename, rename_all, rename_at, rename_if, rename_with, right_join, sample_frac, sample_n,
-    ##     select, select_all, select_at, select_if, semi_join, slice, slice_head, slice_max, slice_min,
-    ##     slice_sample, slice_tail, summarise, summarise_all, summarise_at, summarise_if, summarize,
-    ##     summarize_all, summarize_at, summarize_if, tally, top_frac, top_n, transmute, transmute_all,
-    ##     transmute_at, transmute_if, ungroup
-
-    ## The following objects are masked from 'package:tidyr':
-    ## 
-    ##     drop_na, fill, gather, pivot_longer, pivot_wider, replace_na, spread, uncount
-
-    ## The following object is masked from 'package:stats':
-    ## 
-    ##     filter
-
-``` r
-library(osfr)
-```
-
-## Import data and create derived variables
-
-``` r
-recs_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="RECS_2015", pattern="csv") %>%
-  osf_download(conflicts="overwrite", path=here("osf_dl"))
-
-recs_in <- read_csv(pull(recs_file_osf_det, local_path))
-```
-
-    ## Rows: 5686 Columns: 759
-    ## ── Column specification ───────────────────────────────────────────────────────────────────────────────────────
-    ## Delimiter: ","
-    ## chr   (4): METROMICRO, UATYP10, CLIMATE_REGION_PUB, IECC_CLIMATE_PUB
-    ## dbl (755): DOEID, REGIONC, DIVISION, TYPEHUQ, ZTYPEHUQ, CELLAR, ZCELLAR, BASEFIN, ZBASEFIN, ATTIC, ZATTIC, ...
-    ## 
-    ## ℹ Use `spec()` to retrieve the full column specification for this data.
-    ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
-
-``` r
-unlink(pull(recs_file_osf_det, local_path))
-
-recs <- recs_in %>%
-   select(DOEID, REGIONC, DIVISION, METROMICRO, UATYP10, TYPEHUQ, YEARMADERANGE, HEATHOME, EQUIPMUSE, TEMPHOME, TEMPGONE, TEMPNITE, AIRCOND, USECENAC, TEMPHOMEAC, TEMPGONEAC, TEMPNITEAC, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, TOTUCSQFT, TOTUSQFT, NWEIGHT, starts_with("BRRWT"), CDD30YR, CDD65, CDD80, CLIMATE_REGION_PUB, IECC_CLIMATE_PUB, HDD30YR, HDD65, HDD50, GNDHDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD=WOODBTU, BTUPELLET=PELLETBTU ) %>%
-   mutate(
-      Region=parse_factor(
-         case_when(
-            REGIONC==1~"Northeast",
-            REGIONC==2~"Midwest",
-            REGIONC==3~"South",
-            REGIONC==4~"West",
-      ), levels=c("Northeast", "Midwest", "South", "West")),
-      Division=parse_factor(
-         case_when(
-            DIVISION==1~"New England",
-            DIVISION==2~"Middle Atlantic",
-            DIVISION==3~"East North Central",
-            DIVISION==4~"West North Central",
-            DIVISION==5~"South Atlantic",
-            DIVISION==6~"East South Central",
-            DIVISION==7~"West South Central",
-            DIVISION==8~"Mountain North",
-            DIVISION==9~"Mountain South",
-            DIVISION==10~"Pacific",
-      ), levels=c("New England", "Middle Atlantic", "East North Central", "West North Central", "South Atlantic", "East South Central", "West South Central", "Mountain North", "Mountain South", "Pacific")),
-      MSAStatus=fct_recode(METROMICRO, "Metropolitan Statistical Area"="METRO", "Micropolitan Statistical Area"="MICRO", "None"="NONE"),
-      Urbanicity=parse_factor(
-         case_when(
-            UATYP10=="U"~"Urban Area",
-            UATYP10=="C"~"Urban Cluster",
-            UATYP10=="R"~"Rural"
-         ),
-         levels=c("Urban Area", "Urban Cluster", "Rural")
-      ),
-      HousingUnitType=parse_factor(
-         case_when(
-            TYPEHUQ==1~"Mobile home",
-            TYPEHUQ==2~"Single-family detached",
-            TYPEHUQ==3~"Single-family attached",
-            TYPEHUQ==4~"Apartment: 2-4 Units",
-            TYPEHUQ==5~"Apartment: 5 or more units",
-      ), levels=c("Mobile home", "Single-family detached", "Single-family attached", "Apartment: 2-4 Units", "Apartment: 5 or more units")),
-      YearMade=parse_factor(
-         case_when(
-            YEARMADERANGE==1~"Before 1950",
-            YEARMADERANGE==2~"1950-1959",
-            YEARMADERANGE==3~"1960-1969",
-            YEARMADERANGE==4~"1970-1979",
-            YEARMADERANGE==5~"1980-1989",
-            YEARMADERANGE==6~"1990-1999",
-            YEARMADERANGE==7~"2000-2009",
-            YEARMADERANGE==8~"2010-2015",
-         ),
-         levels=c("Before 1950", "1950-1959", "1960-1969", "1970-1979", "1980-1989", "1990-1999", "2000-2009", "2010-2015"),
-         ordered = TRUE
-      ),
-      SpaceHeatingUsed=as.logical(HEATHOME),
-      HeatingBehavior=parse_factor(
-         case_when(
-            EQUIPMUSE==1~"Set one temp and leave it",
-            EQUIPMUSE==2~"Manually adjust at night/no one home",
-            EQUIPMUSE==3~"Program thermostat to change at certain times",
-            EQUIPMUSE==4~"Turn on or off as needed",
-            EQUIPMUSE==5~"No control",
-            EQUIPMUSE==9~"Other",
-            EQUIPMUSE==-2~NA_character_), # -2 is the missing code present in the data
-         levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Program thermostat to change at certain times", "Turn on or off as needed", "No control", "Other")
-      ),
-      WinterTempDay=if_else(TEMPHOME>0, TEMPHOME, NA_real_),
-      WinterTempAway=if_else(TEMPGONE>0, TEMPGONE, NA_real_),
-      WinterTempNight=if_else(TEMPNITE>0, TEMPNITE, NA_real_),
-      ACUsed=as.logical(AIRCOND),
-      ACBehavior=parse_factor(
-         case_when(
-            USECENAC==1~"Set one temp and leave it",
-            USECENAC==2~"Manually adjust at night/no one home",
-            USECENAC==3~"Program thermostat to change at certain times",
-            USECENAC==4~"Turn on or off as needed",
-            USECENAC==5~"No control",
-            USECENAC==-2~NA_character_), # -2 is the missing code present in the data
-         levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Program thermostat to change at certain times", "Turn on or off as needed", "No control")
-      ),
-      SummerTempDay=if_else(TEMPHOMEAC>0, TEMPHOMEAC, NA_real_),
-      SummerTempAway=if_else(TEMPGONEAC>0, TEMPGONEAC, NA_real_),
-      SummerTempNight=if_else(TEMPNITEAC>0, TEMPNITEAC, NA_real_),
-      ClimateRegion_BA=parse_factor(CLIMATE_REGION_PUB),
-      ClimateRegion_IECC=factor(IECC_CLIMATE_PUB)
-      
-   )
-```
-
-    ## select: renamed 2 variables (BTUWOOD, BTUPELLET) and dropped 619 variables
-    ## mutate: new variable 'Region' (factor) with 4 unique values and 0% NA
-    ##         new variable 'Division' (factor) with 10 unique values and 0% NA
-    ##         new variable 'MSAStatus' (factor) with 3 unique values and 0% NA
-    ##         new variable 'Urbanicity' (factor) with 3 unique values and 0% NA
-    ##         new variable 'HousingUnitType' (factor) with 5 unique values and 0% NA
-    ##         new variable 'YearMade' (ordered factor) with 8 unique values and 0% NA
-    ##         new variable 'SpaceHeatingUsed' (logical) with 2 unique values and 0% NA
-    ##         new variable 'HeatingBehavior' (factor) with 7 unique values and 0% NA
-    ##         new variable 'WinterTempDay' (double) with 35 unique values and 5% NA
-    ##         new variable 'WinterTempAway' (double) with 37 unique values and 5% NA
-    ##         new variable 'WinterTempNight' (double) with 38 unique values and 5% NA
-    ##         new variable 'ACUsed' (logical) with 2 unique values and 0% NA
-    ##         new variable 'ACBehavior' (factor) with 6 unique values and 0% NA
-    ##         new variable 'SummerTempDay' (double) with 38 unique values and 13% NA
-    ##         new variable 'SummerTempAway' (double) with 35 unique values and 13% NA
-    ##         new variable 'SummerTempNight' (double) with 36 unique values and 13% NA
-    ##         new variable 'ClimateRegion_BA' (factor) with 5 unique values and 0% NA
-    ##         new variable 'ClimateRegion_IECC' (factor) with 11 unique values and 0% NA
-
-## Check derived variables for correct coding
-
-``` r
-recs %>% count(Region, REGIONC)
-```
-
-    ## count: now 4 rows and 3 columns, ungrouped
-
-    ## # A tibble: 4 × 3
-    ##   Region    REGIONC     n
-    ##   <fct>       <dbl> <int>
-    ## 1 Northeast       1   794
-    ## 2 Midwest         2  1327
-    ## 3 South           3  2010
-    ## 4 West            4  1555
-
-``` r
-recs %>% count(Division, DIVISION)
-```
-
-    ## count: now 10 rows and 3 columns, ungrouped
-
-    ## # A tibble: 10 × 3
-    ##    Division           DIVISION     n
-    ##    <fct>                 <dbl> <int>
-    ##  1 New England               1   253
-    ##  2 Middle Atlantic           2   541
-    ##  3 East North Central        3   836
-    ##  4 West North Central        4   491
-    ##  5 South Atlantic            5  1058
-    ##  6 East South Central        6   372
-    ##  7 West South Central        7   580
-    ##  8 Mountain North            8   228
-    ##  9 Mountain South            9   242
-    ## 10 Pacific                  10  1085
-
-``` r
-recs %>% count(MSAStatus, METROMICRO)
-```
-
-    ## count: now 3 rows and 3 columns, ungrouped
-
-    ## # A tibble: 3 × 3
-    ##   MSAStatus                     METROMICRO     n
-    ##   <fct>                         <chr>      <int>
-    ## 1 Metropolitan Statistical Area METRO       4745
-    ## 2 Micropolitan Statistical Area MICRO        584
-    ## 3 None                          NONE         357
-
-``` r
-recs %>% count(Urbanicity, UATYP10)
-```
-
-    ## count: now 3 rows and 3 columns, ungrouped
-
-    ## # A tibble: 3 × 3
-    ##   Urbanicity    UATYP10     n
-    ##   <fct>         <chr>   <int>
-    ## 1 Urban Area    U        3928
-    ## 2 Urban Cluster C         598
-    ## 3 Rural         R        1160
-
-``` r
-recs %>% count(HousingUnitType, TYPEHUQ)
-```
-
-    ## count: now 5 rows and 3 columns, ungrouped
-
-    ## # A tibble: 5 × 3
-    ##   HousingUnitType            TYPEHUQ     n
-    ##   <fct>                        <dbl> <int>
-    ## 1 Mobile home                      1   286
-    ## 2 Single-family detached           2  3752
-    ## 3 Single-family attached           3   479
-    ## 4 Apartment: 2-4 Units             4   311
-    ## 5 Apartment: 5 or more units       5   858
-
-``` r
-recs %>% count(YearMade, YEARMADERANGE)
-```
-
-    ## count: now 8 rows and 3 columns, ungrouped
-
-    ## # A tibble: 8 × 3
-    ##   YearMade    YEARMADERANGE     n
-    ##   <ord>               <dbl> <int>
-    ## 1 Before 1950             1   858
-    ## 2 1950-1959               2   544
-    ## 3 1960-1969               3   565
-    ## 4 1970-1979               4   928
-    ## 5 1980-1989               5   874
-    ## 6 1990-1999               6   786
-    ## 7 2000-2009               7   901
-    ## 8 2010-2015               8   230
-
-``` r
-recs %>% count(SpaceHeatingUsed, HEATHOME)
-```
-
-    ## count: now 2 rows and 3 columns, ungrouped
-
-    ## # A tibble: 2 × 3
-    ##   SpaceHeatingUsed HEATHOME     n
-    ##   <lgl>               <dbl> <int>
-    ## 1 FALSE                   0   258
-    ## 2 TRUE                    1  5428
-
-``` r
-recs %>% count(HeatingBehavior, EQUIPMUSE)
-```
-
-    ## count: now 7 rows and 3 columns, ungrouped
-
-    ## # A tibble: 7 × 3
-    ##   HeatingBehavior                               EQUIPMUSE     n
-    ##   <fct>                                             <dbl> <int>
-    ## 1 Set one temp and leave it                             1  2156
-    ## 2 Manually adjust at night/no one home                  2  1414
-    ## 3 Program thermostat to change at certain times         3   972
-    ## 4 Turn on or off as needed                              4   761
-    ## 5 No control                                            5   114
-    ## 6 Other                                                 9    11
-    ## 7 <NA>                                                 -2   258
-
-``` r
-recs %>% count(ACUsed, AIRCOND)
-```
-
-    ## count: now 2 rows and 3 columns, ungrouped
-
-    ## # A tibble: 2 × 3
-    ##   ACUsed AIRCOND     n
-    ##   <lgl>    <dbl> <int>
-    ## 1 FALSE        0   737
-    ## 2 TRUE         1  4949
-
-``` r
-recs %>% count(ACBehavior, USECENAC)
-```
-
-    ## count: now 6 rows and 3 columns, ungrouped
-
-    ## # A tibble: 6 × 3
-    ##   ACBehavior                                    USECENAC     n
-    ##   <fct>                                            <dbl> <int>
-    ## 1 Set one temp and leave it                            1  1661
-    ## 2 Manually adjust at night/no one home                 2   984
-    ## 3 Program thermostat to change at certain times        3   727
-    ## 4 Turn on or off as needed                             4   438
-    ## 5 No control                                           5     2
-    ## 6 <NA>                                                -2  1874
-
-``` r
-recs %>% count(ClimateRegion_BA, CLIMATE_REGION_PUB)
-```
-
-    ## count: now 5 rows and 3 columns, ungrouped
-
-    ## # A tibble: 5 × 3
-    ##   ClimateRegion_BA  CLIMATE_REGION_PUB     n
-    ##   <fct>             <chr>              <int>
-    ## 1 Hot-Dry/Mixed-Dry Hot-Dry/Mixed-Dry    750
-    ## 2 Hot-Humid         Hot-Humid           1036
-    ## 3 Mixed-Humid       Mixed-Humid         1468
-    ## 4 Cold/Very Cold    Cold/Very Cold      2008
-    ## 5 Marine            Marine               424
-
-``` r
-recs %>% count(ClimateRegion_IECC, IECC_CLIMATE_PUB)
-```
-
-    ## count: now 11 rows and 3 columns, ungrouped
-
-    ## # A tibble: 11 × 3
-    ##    ClimateRegion_IECC IECC_CLIMATE_PUB     n
-    ##    <fct>              <chr>            <int>
-    ##  1 1A-2A              1A-2A              846
-    ##  2 2B                 2B                 106
-    ##  3 3A                 3A                 637
-    ##  4 3B-4B              3B-4B              644
-    ##  5 3C                 3C                 209
-    ##  6 4A                 4A                1021
-    ##  7 4C                 4C                 215
-    ##  8 5A                 5A                1240
-    ##  9 5B-5C              5B-5C              332
-    ## 10 6A-6B              6A-6B              376
-    ## 11 7A-7B-7AK-8AK      7A-7B-7AK-8AK       60
-
-## Save data
-
-``` r
-recs_out <- recs %>%
-   select(DOEID, REGIONC, Region, Division, MSAStatus, Urbanicity, HousingUnitType, YearMade, SpaceHeatingUsed, HeatingBehavior, WinterTempDay, WinterTempAway, WinterTempNight, ACUsed, ACBehavior, SummerTempDay, SummerTempAway, SummerTempNight, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, TOTUCSQFT, TOTUSQFT, NWEIGHT, starts_with("BRRWT"), CDD30YR, CDD65, CDD80, ClimateRegion_BA, ClimateRegion_IECC, HDD30YR, HDD65, HDD50, GNDHDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD, BTUPELLET)
-```
-
-    ## select: dropped 17 variables (DIVISION, METROMICRO, UATYP10, TYPEHUQ, YEARMADERANGE, …)
-
-``` r
-summary(recs_out)
-```
-
-    ##      DOEID          REGIONC            Region                   Division   
-    ##  Min.   :10001   Min.   :1.000   Northeast: 794   Pacific           :1085  
-    ##  1st Qu.:11422   1st Qu.:2.000   Midwest  :1327   South Atlantic    :1058  
-    ##  Median :12844   Median :3.000   South    :2010   East North Central: 836  
-    ##  Mean   :12844   Mean   :2.761   West     :1555   West South Central: 580  
-    ##  3rd Qu.:14265   3rd Qu.:4.000                    Middle Atlantic   : 541  
-    ##  Max.   :15686   Max.   :4.000                    West North Central: 491  
-    ##                                                   (Other)           :1095  
-    ##                          MSAStatus            Urbanicity                     HousingUnitType        YearMade  
-    ##  Metropolitan Statistical Area:4745   Urban Area   :3928   Mobile home               : 286   1970-1979  :928  
-    ##  Micropolitan Statistical Area: 584   Urban Cluster: 598   Single-family detached    :3752   2000-2009  :901  
-    ##  None                         : 357   Rural        :1160   Single-family attached    : 479   1980-1989  :874  
-    ##                                                            Apartment: 2-4 Units      : 311   Before 1950:858  
-    ##                                                            Apartment: 5 or more units: 858   1990-1999  :786  
-    ##                                                                                              1960-1969  :565  
-    ##                                                                                              (Other)    :774  
-    ##  SpaceHeatingUsed                                      HeatingBehavior WinterTempDay   WinterTempAway 
-    ##  Mode :logical    Set one temp and leave it                    :2156   Min.   :50.00   Min.   :50.00  
-    ##  FALSE:258        Manually adjust at night/no one home         :1414   1st Qu.:68.00   1st Qu.:65.00  
-    ##  TRUE :5428       Program thermostat to change at certain times: 972   Median :70.00   Median :68.00  
-    ##                   Turn on or off as needed                     : 761   Mean   :70.06   Mean   :67.12  
-    ##                   No control                                   : 114   3rd Qu.:72.00   3rd Qu.:70.00  
-    ##                   Other                                        :  11   Max.   :90.00   Max.   :90.00  
-    ##                   NA                                           : 258   NA's   :258     NA's   :258    
-    ##  WinterTempNight   ACUsed                                                ACBehavior   SummerTempDay  
-    ##  Min.   :50.00   Mode :logical   Set one temp and leave it                    :1661   Min.   :50.00  
-    ##  1st Qu.:65.00   FALSE:737       Manually adjust at night/no one home         : 984   1st Qu.:70.00  
-    ##  Median :68.00   TRUE :4949      Program thermostat to change at certain times: 727   Median :72.00  
-    ##  Mean   :68.06                   Turn on or off as needed                     : 438   Mean   :72.66  
-    ##  3rd Qu.:70.00                   No control                                   :   2   3rd Qu.:76.00  
-    ##  Max.   :90.00                   NA                                           :1874   Max.   :90.00  
-    ##  NA's   :258                                                                          NA's   :737    
-    ##  SummerTempAway  SummerTempNight    TOTCSQFT         TOTHSQFT      TOTSQFT_EN     TOTUCSQFT     
-    ##  Min.   :50.00   Min.   :50.00   Min.   :   0.0   Min.   :   0   Min.   : 221   Min.   :   0.0  
-    ##  1st Qu.:71.00   1st Qu.:70.00   1st Qu.: 466.2   1st Qu.:1008   1st Qu.:1100   1st Qu.:   0.0  
-    ##  Median :75.00   Median :72.00   Median :1218.5   Median :1559   Median :1774   Median : 400.0  
-    ##  Mean   :74.63   Mean   :71.82   Mean   :1454.5   Mean   :1816   Mean   :2081   Mean   : 793.9  
-    ##  3rd Qu.:78.00   3rd Qu.:75.00   3rd Qu.:2094.0   3rd Qu.:2400   3rd Qu.:2766   3rd Qu.:1150.0  
-    ##  Max.   :90.00   Max.   :90.00   Max.   :8066.0   Max.   :8066   Max.   :8501   Max.   :7986.0  
-    ##  NA's   :737     NA's   :737                                                                    
-    ##     TOTUSQFT         NWEIGHT           BRRWT1           BRRWT2             BRRWT3             BRRWT4        
-    ##  Min.   :   0.0   Min.   :  1236   Min.   :  1836   Min.   :   685.9   Min.   :   543.9   Min.   :   699.7  
-    ##  1st Qu.:   0.0   1st Qu.: 13874   1st Qu.:  9859   1st Qu.:  9733.0   1st Qu.:  9575.3   1st Qu.:  9518.5  
-    ##  Median : 250.0   Median : 18510   Median : 16942   Median : 16993.7   Median : 16698.7   Median : 17034.2  
-    ##  Mean   : 432.6   Mean   : 20789   Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 569.8   3rd Qu.: 24840   3rd Qu.: 27219   3rd Qu.: 27825.1   3rd Qu.: 27941.8   3rd Qu.: 27931.5  
-    ##  Max.   :6660.0   Max.   :139307   Max.   :203902   Max.   :189788.1   Max.   :180155.3   Max.   :159902.6  
-    ##                                                                                                             
-    ##      BRRWT5             BRRWT6             BRRWT7             BRRWT8           BRRWT9        
-    ##  Min.   :   649.3   Min.   :   638.7   Min.   :   564.1   Min.   :   591   Min.   :   545.2  
-    ##  1st Qu.:  9598.5   1st Qu.:  9501.7   1st Qu.:  9534.4   1st Qu.:  9653   1st Qu.:  9595.0  
-    ##  Median : 16487.5   Median : 16150.6   Median : 16332.5   Median : 16802   Median : 17352.7  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3  
-    ##  3rd Qu.: 27856.7   3rd Qu.: 28092.8   3rd Qu.: 27992.5   3rd Qu.: 27926   3rd Qu.: 27753.7  
-    ##  Max.   :141796.4   Max.   :189031.8   Max.   :192311.7   Max.   :195071   Max.   :117167.3  
-    ##                                                                                              
-    ##     BRRWT10            BRRWT11            BRRWT12            BRRWT13          BRRWT14        
-    ##  Min.   :   732.5   Min.   :   586.1   Min.   :   549.8   Min.   :   668   Min.   :   544.5  
-    ##  1st Qu.:  9077.6   1st Qu.:  9448.5   1st Qu.:  9388.2   1st Qu.:  9757   1st Qu.:  9491.8  
-    ##  Median : 16601.9   Median : 16172.3   Median : 16167.4   Median : 16584   Median : 17028.9  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3  
-    ##  3rd Qu.: 28089.9   3rd Qu.: 28022.1   3rd Qu.: 28075.4   3rd Qu.: 27455   3rd Qu.: 27975.3  
-    ##  Max.   :183073.4   Max.   :195408.4   Max.   :197373.3   Max.   :182228   Max.   :173341.2  
-    ##                                                                                              
-    ##     BRRWT15            BRRWT16            BRRWT17            BRRWT18            BRRWT19      
-    ##  Min.   :   671.4   Min.   :   603.4   Min.   :   563.3   Min.   :   517.2   Min.   :   657  
-    ##  1st Qu.:  9341.8   1st Qu.:  9804.6   1st Qu.:  9593.2   1st Qu.:  9839.6   1st Qu.:  9776  
-    ##  Median : 15996.8   Median : 16562.6   Median : 16750.8   Median : 16560.5   Median : 16779  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789  
-    ##  3rd Qu.: 28117.5   3rd Qu.: 27322.1   3rd Qu.: 27458.0   3rd Qu.: 27636.2   3rd Qu.: 27986  
-    ##  Max.   :179152.7   Max.   :210507.2   Max.   :195346.9   Max.   :158094.9   Max.   :197236  
-    ##                                                                                              
-    ##     BRRWT20            BRRWT21            BRRWT22            BRRWT23            BRRWT24        
-    ##  Min.   :   682.2   Min.   :   689.4   Min.   :   581.3   Min.   :   658.4   Min.   :   698.7  
-    ##  1st Qu.:  9569.2   1st Qu.:  9663.9   1st Qu.:  9805.3   1st Qu.:  9597.1   1st Qu.:  9387.9  
-    ##  Median : 16881.2   Median : 16503.8   Median : 16711.4   Median : 16205.0   Median : 16398.2  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27467.7   3rd Qu.: 27863.0   3rd Qu.: 27503.4   3rd Qu.: 27855.2   3rd Qu.: 27791.0  
-    ##  Max.   :146347.4   Max.   :181583.8   Max.   :173557.2   Max.   :182366.0   Max.   :170970.0  
-    ##                                                                                                
-    ##     BRRWT25            BRRWT26            BRRWT27          BRRWT28            BRRWT29          BRRWT30        
-    ##  Min.   :   541.3   Min.   :   832.9   Min.   :  1372   Min.   :   764.7   Min.   :   854   Min.   :   680.6  
-    ##  1st Qu.:  9502.9   1st Qu.:  9593.2   1st Qu.:  9333   1st Qu.:  9358.0   1st Qu.:  9596   1st Qu.:  9689.3  
-    ##  Median : 17120.6   Median : 16642.2   Median : 16671   Median : 16663.4   Median : 16336   Median : 16683.8  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3  
-    ##  3rd Qu.: 28108.8   3rd Qu.: 28018.5   3rd Qu.: 27832   3rd Qu.: 28065.9   3rd Qu.: 27506   3rd Qu.: 27613.1  
-    ##  Max.   :128220.6   Max.   :176770.0   Max.   :176453   Max.   :210413.6   Max.   :194434   Max.   :118557.6  
-    ##                                                                                                               
-    ##     BRRWT31            BRRWT32            BRRWT33            BRRWT34          BRRWT35        
-    ##  Min.   :   868.4   Min.   :   645.1   Min.   :   714.2   Min.   :  1880   Min.   :   629.3  
-    ##  1st Qu.:  9493.1   1st Qu.:  9370.6   1st Qu.:  9530.8   1st Qu.:  9703   1st Qu.:  9842.0  
-    ##  Median : 16876.0   Median : 16594.5   Median : 16839.7   Median : 16380   Median : 17204.4  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3  
-    ##  3rd Qu.: 27807.8   3rd Qu.: 28250.9   3rd Qu.: 27610.2   3rd Qu.: 27846   3rd Qu.: 27533.4  
-    ##  Max.   :197960.8   Max.   :182658.3   Max.   :183414.8   Max.   :130246   Max.   :125674.9  
-    ##                                                                                              
-    ##     BRRWT36            BRRWT37            BRRWT38            BRRWT39            BRRWT40          BRRWT41      
-    ##  Min.   :   980.2   Min.   :   634.6   Min.   :   738.1   Min.   :   684.5   Min.   :  1531   Min.   :  1406  
-    ##  1st Qu.:  9439.6   1st Qu.:  9276.7   1st Qu.:  9737.9   1st Qu.:  9389.5   1st Qu.:  9624   1st Qu.:  9776  
-    ##  Median : 16440.6   Median : 16620.9   Median : 16862.8   Median : 16797.7   Median : 16644   Median : 16910  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789  
-    ##  3rd Qu.: 28354.2   3rd Qu.: 27754.3   3rd Qu.: 27710.0   3rd Qu.: 27850.3   3rd Qu.: 27858   3rd Qu.: 27616  
-    ##  Max.   :171375.9   Max.   :209103.9   Max.   :187208.7   Max.   :136106.4   Max.   :165612   Max.   :145467  
-    ##                                                                                                               
-    ##     BRRWT42            BRRWT43            BRRWT44            BRRWT45          BRRWT46            BRRWT47      
-    ##  Min.   :   943.8   Min.   :   683.3   Min.   :   866.4   Min.   :  1105   Min.   :   750.7   Min.   :  1230  
-    ##  1st Qu.:  9446.7   1st Qu.:  9563.6   1st Qu.:  9595.5   1st Qu.:  9563   1st Qu.:  9616.2   1st Qu.:  9362  
-    ##  Median : 16177.2   Median : 16999.1   Median : 17034.6   Median : 16629   Median : 16821.6   Median : 16243  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789  
-    ##  3rd Qu.: 28089.3   3rd Qu.: 27724.1   3rd Qu.: 27593.8   3rd Qu.: 27773   3rd Qu.: 27563.3   3rd Qu.: 27547  
-    ##  Max.   :189726.6   Max.   :192302.9   Max.   :190671.5   Max.   :160108   Max.   :183963.8   Max.   :196001  
-    ##                                                                                                               
-    ##     BRRWT48            BRRWT49            BRRWT50          BRRWT51            BRRWT52        
-    ##  Min.   :   684.4   Min.   :   627.1   Min.   :  1638   Min.   :   922.9   Min.   :   749.9  
-    ##  1st Qu.:  9383.9   1st Qu.:  9489.0   1st Qu.:  9601   1st Qu.:  9704.7   1st Qu.:  9496.9  
-    ##  Median : 16720.3   Median : 17068.6   Median : 16788   Median : 16706.2   Median : 16442.9  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27965.8   3rd Qu.: 27829.1   3rd Qu.: 27667   3rd Qu.: 27755.8   3rd Qu.: 27621.2  
-    ##  Max.   :199079.7   Max.   :203407.7   Max.   :223546   Max.   :161561.8   Max.   :146056.0  
-    ##                                                                                              
-    ##     BRRWT53            BRRWT54            BRRWT55          BRRWT56            BRRWT57        
-    ##  Min.   :   871.8   Min.   :   687.9   Min.   :  2056   Min.   :   623.7   Min.   :   713.4  
-    ##  1st Qu.:  9489.1   1st Qu.:  9623.3   1st Qu.:  9595   1st Qu.:  9798.4   1st Qu.:  9393.8  
-    ##  Median : 16494.9   Median : 16662.9   Median : 16589   Median : 16624.8   Median : 17198.4  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 28075.0   3rd Qu.: 27612.8   3rd Qu.: 27857   3rd Qu.: 27650.0   3rd Qu.: 27964.1  
-    ##  Max.   :143796.6   Max.   :174657.5   Max.   :206797   Max.   :226169.8   Max.   :162193.6  
-    ##                                                                                              
-    ##     BRRWT58            BRRWT59            BRRWT60          BRRWT61            BRRWT62        
-    ##  Min.   :   905.5   Min.   :   630.7   Min.   :  1275   Min.   :   546.4   Min.   :   739.7  
-    ##  1st Qu.:  9559.2   1st Qu.:  9623.7   1st Qu.:  9577   1st Qu.:  9387.4   1st Qu.:  9643.5  
-    ##  Median : 16540.0   Median : 16656.6   Median : 16197   Median : 16376.3   Median : 17067.2  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27780.9   3rd Qu.: 27577.8   3rd Qu.: 27781   3rd Qu.: 28016.5   3rd Qu.: 27540.6  
-    ##  Max.   :211170.6   Max.   :206702.7   Max.   :169387   Max.   :122260.9   Max.   :158200.9  
-    ##                                                                                              
-    ##     BRRWT63            BRRWT64            BRRWT65          BRRWT66          BRRWT67            BRRWT68      
-    ##  Min.   :   671.5   Min.   :   926.4   Min.   :  1144   Min.   :  1264   Min.   :   684.8   Min.   :  1053  
-    ##  1st Qu.:  9455.3   1st Qu.:  9400.5   1st Qu.:  9597   1st Qu.:  9758   1st Qu.:  9588.0   1st Qu.:  9245  
-    ##  Median : 16632.1   Median : 16508.1   Median : 16442   Median : 16565   Median : 16560.8   Median : 16464  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789   Mean   : 20789.3   Mean   : 20789  
-    ##  3rd Qu.: 28020.8   3rd Qu.: 27693.9   3rd Qu.: 27348   3rd Qu.: 27884   3rd Qu.: 27838.7   3rd Qu.: 28108  
-    ##  Max.   :196933.9   Max.   :217490.7   Max.   :239712   Max.   :157193   Max.   :179204.9   Max.   :183266  
-    ##                                                                                                             
-    ##     BRRWT69          BRRWT70            BRRWT71            BRRWT72            BRRWT73          BRRWT74        
-    ##  Min.   :  1676   Min.   :   758.4   Min.   :   892.2   Min.   :   695.5   Min.   :   875   Min.   :   541.6  
-    ##  1st Qu.:  9371   1st Qu.:  9622.5   1st Qu.:  9451.9   1st Qu.:  9516.0   1st Qu.:  9734   1st Qu.:  9503.9  
-    ##  Median : 16682   Median : 16676.4   Median : 16482.8   Median : 16717.8   Median : 16930   Median : 16128.6  
-    ##  Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3  
-    ##  3rd Qu.: 27957   3rd Qu.: 27897.7   3rd Qu.: 27882.7   3rd Qu.: 27611.7   3rd Qu.: 27756   3rd Qu.: 27849.9  
-    ##  Max.   :193274   Max.   :146583.8   Max.   :126528.3   Max.   :196704.6   Max.   :184412   Max.   :125833.8  
-    ##                                                                                                               
-    ##     BRRWT75            BRRWT76          BRRWT77            BRRWT78            BRRWT79        
-    ##  Min.   :   669.7   Min.   :   617   Min.   :   560.5   Min.   :   526.7   Min.   :   651.1  
-    ##  1st Qu.:  9835.9   1st Qu.:  9385   1st Qu.:  9673.8   1st Qu.:  9744.1   1st Qu.:  9549.7  
-    ##  Median : 16921.5   Median : 17000   Median : 16713.6   Median : 17098.9   Median : 16676.0  
-    ##  Mean   : 20789.3   Mean   : 20789   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27352.3   3rd Qu.: 27558   3rd Qu.: 27712.8   3rd Qu.: 27459.8   3rd Qu.: 27857.9  
-    ##  Max.   :194829.8   Max.   :212262   Max.   :234971.4   Max.   :152055.4   Max.   :180157.0  
-    ##                                                                                              
-    ##     BRRWT80            BRRWT81            BRRWT82            BRRWT83            BRRWT84        
-    ##  Min.   :   675.7   Min.   :   681.2   Min.   :   563.6   Min.   :   656.9   Min.   :   652.7  
-    ##  1st Qu.:  9554.4   1st Qu.:  9489.0   1st Qu.:  9216.4   1st Qu.:  9634.4   1st Qu.:  9432.5  
-    ##  Median : 16707.8   Median : 16769.3   Median : 16121.6   Median : 16516.9   Median : 16454.8  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27688.3   3rd Qu.: 27901.5   3rd Qu.: 28253.1   3rd Qu.: 27725.8   3rd Qu.: 28006.4  
-    ##  Max.   :165661.6   Max.   :191740.1   Max.   :171004.8   Max.   :184719.0   Max.   :191550.3  
-    ##                                                                                                
-    ##     BRRWT85            BRRWT86            BRRWT87            BRRWT88            BRRWT89        
-    ##  Min.   :   675.4   Min.   :   680.3   Min.   :   551.7   Min.   :   704.2   Min.   :   644.9  
-    ##  1st Qu.:  9551.2   1st Qu.:  9619.8   1st Qu.:  9436.6   1st Qu.:  9393.1   1st Qu.:  9643.2  
-    ##  Median : 16902.2   Median : 16772.0   Median : 16799.0   Median : 16778.6   Median : 16586.1  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 27325.4   3rd Qu.: 27638.1   3rd Qu.: 28046.3   3rd Qu.: 27789.9   3rd Qu.: 28075.4  
-    ##  Max.   :198238.4   Max.   :232065.5   Max.   :179835.0   Max.   :166866.1   Max.   :144299.3  
-    ##                                                                                                
-    ##     BRRWT90            BRRWT91            BRRWT92            BRRWT93            BRRWT94        
-    ##  Min.   :   649.2   Min.   :   568.2   Min.   :   591.9   Min.   :   545.3   Min.   :   716.2  
-    ##  1st Qu.:  9467.7   1st Qu.:  9506.3   1st Qu.:  9610.6   1st Qu.:  9688.4   1st Qu.:  9561.6  
-    ##  Median : 16212.0   Median : 16781.5   Median : 16524.1   Median : 16258.4   Median : 17099.7  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3   Mean   : 20789.3  
-    ##  3rd Qu.: 28020.8   3rd Qu.: 27876.1   3rd Qu.: 27915.1   3rd Qu.: 27728.8   3rd Qu.: 27853.9  
-    ##  Max.   :175279.5   Max.   :205917.4   Max.   :225638.4   Max.   :117260.5   Max.   :207264.3  
-    ##                                                                                                
-    ##     BRRWT95            BRRWT96            CDD30YR         CDD65          CDD80       
-    ##  Min.   :   566.4   Min.   :   551.1   Min.   :   0   Min.   :   0   Min.   :   0.0  
-    ##  1st Qu.:  9530.2   1st Qu.:  9533.2   1st Qu.: 712   1st Qu.: 793   1st Qu.:  10.0  
-    ##  Median : 16577.2   Median : 16358.9   Median :1150   Median :1378   Median :  60.0  
-    ##  Mean   : 20789.3   Mean   : 20789.3   Mean   :1451   Mean   :1719   Mean   : 174.7  
-    ##  3rd Qu.: 27441.4   3rd Qu.: 27823.1   3rd Qu.:1880   3rd Qu.:2231   3rd Qu.: 208.0  
-    ##  Max.   :205015.8   Max.   :171550.8   Max.   :5792   Max.   :6607   Max.   :2297.0  
-    ##                                                                                      
-    ##           ClimateRegion_BA ClimateRegion_IECC    HDD30YR          HDD65          HDD50         GNDHDD65    
-    ##  Hot-Dry/Mixed-Dry: 750    5A     :1240       Min.   :    0   Min.   :   0   Min.   :   0   Min.   :    0  
-    ##  Hot-Humid        :1036    4A     :1021       1st Qu.: 2102   1st Qu.:1881   1st Qu.: 260   1st Qu.: 1337  
-    ##  Mixed-Humid      :1468    1A-2A  : 846       Median : 4353   Median :3878   Median :1260   Median : 3704  
-    ##  Cold/Very Cold   :2008    3B-4B  : 644       Mean   : 4087   Mean   :3708   Mean   :1486   Mean   : 3578  
-    ##  Marine           : 424    3A     : 637       3rd Qu.: 5967   3rd Qu.:5467   3rd Qu.:2499   3rd Qu.: 5630  
-    ##                            6A-6B  : 376       Max.   :12184   Max.   :9843   Max.   :4956   Max.   :11851  
-    ##                            (Other): 922                                                                    
-    ##      BTUEL             DOLLAREL           BTUNG           DOLLARNG          BTULP           DOLLARLP      
-    ##  Min.   :   201.6   Min.   :  18.72   Min.   :     0   Min.   :   0.0   Min.   :     0   Min.   :   0.00  
-    ##  1st Qu.: 20221.3   1st Qu.: 815.12   1st Qu.:     0   1st Qu.:   0.0   1st Qu.:     0   1st Qu.:   0.00  
-    ##  Median : 32582.4   Median :1253.02   Median : 17961   Median : 231.8   Median :     0   Median :   0.00  
-    ##  Mean   : 37630.7   Mean   :1403.78   Mean   : 33331   Mean   : 346.8   Mean   :  3192   Mean   :  67.72  
-    ##  3rd Qu.: 49670.6   3rd Qu.:1830.83   3rd Qu.: 57126   3rd Qu.: 605.1   3rd Qu.:     0   3rd Qu.:   0.00  
-    ##  Max.   :215695.7   Max.   :8121.56   Max.   :306594   Max.   :2789.8   Max.   :220435   Max.   :5121.27  
-    ##                                                                                                           
-    ##      BTUFO           DOLLARFO          TOTALBTU           TOTALDOL           BTUWOOD         BTUPELLET       
-    ##  Min.   :     0   Min.   :   0.00   Min.   :   201.6   Min.   :   60.46   Min.   :     0   Min.   :     0.0  
-    ##  1st Qu.:     0   1st Qu.:   0.00   1st Qu.: 42655.8   1st Qu.: 1175.49   1st Qu.:     0   1st Qu.:     0.0  
-    ##  Median :     0   Median :   0.00   Median : 68663.3   Median : 1724.60   Median :     0   Median :     0.0  
-    ##  Mean   :  3569   Mean   :  64.08   Mean   : 77722.9   Mean   : 1882.34   Mean   :  4140   Mean   :   197.4  
-    ##  3rd Qu.:     0   3rd Qu.:   0.00   3rd Qu.:103832.9   3rd Qu.: 2385.84   3rd Qu.:     0   3rd Qu.:     0.0  
-    ##  Max.   :273608   Max.   :4700.03   Max.   :490187.4   Max.   :10135.99   Max.   :295476   Max.   :115500.0  
-    ## 
-
-``` r
-recs_der_tmp_loc <- here("osf_dl", "recs_2015.rds")
-write_rds(recs_out, recs_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=recs_der_tmp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name          id                       meta            
-    ##   <chr>         <chr>                    <list>          
-    ## 1 recs_2015.rds 647d2c0e85df48090e7754b2 <named list [3]>
-
-``` r
-unlink(recs_der_tmp_loc)
-```
diff --git a/DataCleaningScripts/RECS_2020_DataPrep.Rmd b/DataCleaningScripts/RECS_2020_DataPrep.Rmd
index bcf9994a..e69de29b 100644
--- a/DataCleaningScripts/RECS_2020_DataPrep.Rmd
+++ b/DataCleaningScripts/RECS_2020_DataPrep.Rmd
@@ -1,230 +0,0 @@
----
-title: "Residential Energy Consumption Survey (RECS) 2020 Data Prep"
-output: 
-  github_document:
-    html_preview: false
----
-
-```{r setup, include=FALSE}
-knitr::opts_chunk$set(echo = TRUE)
-```
-
-## Data information
-
-All data and resources were downloaded from https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata on September 17, 2023.
-
-```{r}
-#| label: loadpackages
-
-library(tidyverse) #data manipulation
-library(haven) #data import
-library(tidylog) #informative logging messages
-library(osfr)
-```
-
-## Import data and create derived variables
-
-```{r}
-#| label: derivedata
-
-recs_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="RECS_2020", pattern="sas7bdat") %>%
-  filter(str_detect(name, "v5")) %>%
-  osf_download(conflicts="overwrite", path=here::here("osf_dl"))
-
-recs_in <- haven::read_sas(pull(recs_file_osf_det, local_path))
-
-unlink(pull(recs_file_osf_det, local_path))
-
-
-# 2015 to 2020 differences
-# Added states!
-# Variables gone: METROMICRO, TOTUCSQFT (uncooled sq ft), TOTUSQFT (unheated sq ft), CDD80, HDD50, GNDHDD65, PELLETBTU
-# HEATCNTL replaces EQUIPMUSE
-# COOLCNTL replaces USECENAC
-# CDD30YR_PUB replaces CDD30YR
-# BA_climate replaces CLIMATE_REGION_PUB 
-# IECC_climate_code replaces IECC_CLIMATE_PUB
-# HDD30YR_PUB replaces HDD30YR
-# BTUWD replaces WOODBTU
-# BRR weights are NWEIGHT
-
-recs <- recs_in %>%
-   select(DOEID, REGIONC, DIVISION, STATE_FIPS, state_postal, state_name, UATYP10, TYPEHUQ, YEARMADERANGE, HEATHOME, HEATCNTL, TEMPHOME, TEMPGONE, TEMPNITE, AIRCOND, COOLCNTL, TEMPHOMEAC, TEMPGONEAC, TEMPNITEAC, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, NWEIGHT, starts_with("NWEIGHT"), CDD30YR=CDD30YR_PUB, CDD65, BA_climate, IECC_climate_code, HDD30YR=HDD30YR_PUB, HDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD=BTUWD) %>%
-  mutate(
-    Region=parse_factor(
-      str_to_title(REGIONC),
-      levels=c("Northeast", "Midwest", "South", "West")),
-    Division=parse_factor(
-      DIVISION, levels=c("New England", "Middle Atlantic", "East North Central", "West North Central", "South Atlantic", "East South Central", "West South Central", "Mountain North", "Mountain South", "Pacific")),
-    Urbanicity=parse_factor(
-      case_when(
-        UATYP10=="U"~"Urban Area",
-        UATYP10=="C"~"Urban Cluster",
-        UATYP10=="R"~"Rural"
-      ),
-      levels=c("Urban Area", "Urban Cluster", "Rural")
-    ),
-    HousingUnitType=parse_factor(
-      case_when(
-        TYPEHUQ==1~"Mobile home",
-        TYPEHUQ==2~"Single-family detached",
-        TYPEHUQ==3~"Single-family attached",
-        TYPEHUQ==4~"Apartment: 2-4 Units",
-        TYPEHUQ==5~"Apartment: 5 or more units",
-      ), levels=c("Mobile home", "Single-family detached", "Single-family attached", "Apartment: 2-4 Units", "Apartment: 5 or more units")),
-    YearMade=parse_factor(
-      case_when(
-        YEARMADERANGE==1~"Before 1950",
-        YEARMADERANGE==2~"1950-1959",
-        YEARMADERANGE==3~"1960-1969",
-        YEARMADERANGE==4~"1970-1979",
-        YEARMADERANGE==5~"1980-1989",
-        YEARMADERANGE==6~"1990-1999",
-        YEARMADERANGE==7~"2000-2009",
-        YEARMADERANGE==8~"2010-2015",
-        YEARMADERANGE==9~"2016-2020"
-      ),
-      levels=c("Before 1950", "1950-1959", "1960-1969", "1970-1979", "1980-1989", "1990-1999", "2000-2009", "2010-2015", "2016-2020"),
-      ordered = TRUE
-    ),
-    SpaceHeatingUsed=as.logical(HEATHOME),
-    HeatingBehavior=parse_factor(
-      case_when(
-        HEATCNTL==1~"Set one temp and leave it",
-        HEATCNTL==2~"Manually adjust at night/no one home",
-        HEATCNTL==3~"Programmable or smart thermostat automatically adjusts the temperature",
-        HEATCNTL==4~"Turn on or off as needed",
-        HEATCNTL==5~"No control",
-        HEATCNTL==99~"Other",
-        HEATCNTL==-2~NA_character_),
-      levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Programmable or smart thermostat automatically adjusts the temperature", "Turn on or off as needed", "No control", "Other")
-    ),
-    WinterTempDay=if_else(TEMPHOME>0, TEMPHOME, NA_real_),
-    WinterTempAway=if_else(TEMPGONE>0, TEMPGONE, NA_real_),
-    WinterTempNight=if_else(TEMPNITE>0, TEMPNITE, NA_real_),
-    ACUsed=as.logical(AIRCOND),
-    ACBehavior=parse_factor(
-      case_when(
-        COOLCNTL==1~"Set one temp and leave it",
-        COOLCNTL==2~"Manually adjust at night/no one home",
-        COOLCNTL==3~"Programmable or smart thermostat automatically adjusts the temperature",
-        COOLCNTL==4~"Turn on or off as needed",
-        COOLCNTL==5~"No control",
-        COOLCNTL==99~"Other",
-        COOLCNTL==-2~NA_character_),
-      levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Programmable or smart thermostat automatically adjusts the temperature", "Turn on or off as needed", "No control", "Other")
-    ),
-    SummerTempDay=if_else(TEMPHOMEAC>0, TEMPHOMEAC, NA_real_),
-    SummerTempAway=if_else(TEMPGONEAC>0, TEMPGONEAC, NA_real_),
-    SummerTempNight=if_else(TEMPNITEAC>0, TEMPNITEAC, NA_real_),
-    ClimateRegion_BA=parse_factor(BA_climate),
-    state_name=factor(state_name),
-    state_postal=fct_reorder(state_postal, as.numeric(state_name))
-    )
-
-```
-
-## Check derived variables for correct coding
-
-```{r}
-#| label: checkvars
-
-
-recs %>% count(Region, REGIONC)
-recs %>% count(Division, DIVISION)
-recs %>% count(Urbanicity, UATYP10)
-recs %>% count(HousingUnitType, TYPEHUQ)
-recs %>% count(YearMade, YEARMADERANGE)
-recs %>% count(SpaceHeatingUsed, HEATHOME)
-recs %>% count(HeatingBehavior, HEATCNTL)
-recs %>% count(ACUsed, AIRCOND)
-recs %>% count(ACBehavior, COOLCNTL)
-recs %>% count(ClimateRegion_BA, BA_climate)
-recs %>% count(state_postal, state_name, STATE_FIPS) %>% print(n=51)
-```
-
-
-## Save data
-
-```{r compare-2015}
-recs_out <- recs %>%
-  select(DOEID, starts_with("NWEIGHT"),
-         REGIONC, Region, Division, starts_with("state"), Urbanicity, 
-         HousingUnitType, YearMade, SpaceHeatingUsed, HeatingBehavior, 
-         WinterTempDay, WinterTempAway, WinterTempNight, ACUsed, 
-         ACBehavior, SummerTempDay, SummerTempAway, SummerTempNight, 
-         TOTCSQFT, TOTHSQFT, TOTSQFT_EN, 
-         CDD30YR, CDD65, ClimateRegion_BA, 
-         HDD30YR, HDD65, BTUEL, 
-         DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, 
-         TOTALBTU, TOTALDOL, BTUWOOD)
-
-
-
-
-source(here::here("helper-fun", "helper-function.R"))
-
-recs_2015 <- read_osf("recs_2015.rds")
-
-setdiff(names(recs_out), names(recs_2015)) #variables in 2020 and not 2015
-setdiff(names(recs_2015), names(recs_out)) #variables in 2015 and not 2020
-
-```
-
-
-```{r}
-#| label: add-question-text
-for (var in colnames(recs_out)) {
-  attr(recs_out[[deparse(as.name(var))]], "format.sas") <- NULL
-}
-
-cb_in <- readxl::read_xlsx(here::here("DataCleaningScripts", "RECS 2020 Codebook Questions.xlsx"), skip=1)
-
-cb_ord <- cb_in %>% 
-  mutate(
-    Order=row_number(),
-    Section=if_else(Section=="End-use Model", "CONSUMPTION AND EXPENDITURE", Section),
-    Section=fct_reorder(Section, Order, min)) 
-
-cb_slim <- cb_ord %>% select(Variable=BookDerived, `Description and Labels`, Question, Section, Order) %>%
-  filter(!is.na(Variable)) %>%
-  bind_rows(select(cb_ord, Variable, `Description and Labels`, Question, Section, Order)) %>%
-  arrange(Section, Order)
-
-names(recs_out)[!(names(recs_out) %in% pull(cb_slim, Variable))]
-
-cb_vars <- cb_slim %>%
-  filter(Variable %in% c(names(recs_out)))
-
-nrow(cb_vars)
-ncol(recs_out)
-
-recs_ord <- recs_out %>% select(all_of(pull(cb_vars, Variable)))
-
-for (var in pull(cb_vars, Variable)) {
-  vi <- cb_vars %>% filter(Variable==var)
-  attr(recs_ord[[deparse(as.name(var))]], "label") <- pull(vi, `Description and Labels`)
-  attr(recs_ord[[deparse(as.name(var))]], "Section") <- pull(vi, Section) %>% as.character()
-  if (!is.na(pull(vi, Question))) attr(recs_ord[[deparse(as.name(var))]], "Question") <- pull(vi, Question)
-}
-
-
-```
-
-
-
-
-
-```{r savedat}
-summary(recs_ord)
-str(recs_ord)
-
-recs_der_tmp_loc <- here::here("osf_dl", "recs_2020.rds")
-write_rds(recs_ord, recs_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=recs_der_tmp_loc, conflicts="overwrite")
-unlink(recs_der_tmp_loc)
-
-```
-
diff --git a/DataCleaningScripts/RECS_2020_DataPrep.md b/DataCleaningScripts/RECS_2020_DataPrep.md
index 45229eef..e69de29b 100644
--- a/DataCleaningScripts/RECS_2020_DataPrep.md
+++ b/DataCleaningScripts/RECS_2020_DataPrep.md
@@ -1,856 +0,0 @@
-Residential Energy Consumption Survey (RECS) 2020 Data Prep
-================
-
-## Data information
-
-All data and resources were downloaded from
-<https://www.eia.gov/consumption/residential/data/2020/index.php?view=microdata>
-on September 17, 2023.
-
-``` r
-library(tidyverse) #data manipulation
-library(haven) #data import
-library(tidylog) #informative logging messages
-library(osfr)
-```
-
-## Import data and create derived variables
-
-``` r
-recs_file_osf_det <- osf_retrieve_node("https://osf.io/z5c3m/") %>%
-  osf_ls_files(path="RECS_2020", pattern="sas7bdat") %>%
-  filter(str_detect(name, "v5")) %>%
-  osf_download(conflicts="overwrite", path=here::here("osf_dl"))
-
-recs_in <- haven::read_sas(pull(recs_file_osf_det, local_path))
-
-unlink(pull(recs_file_osf_det, local_path))
-
-
-# 2015 to 2020 differences
-# Added states!
-# Variables gone: METROMICRO, TOTUCSQFT (uncooled sq ft), TOTUSQFT (unheated sq ft), CDD80, HDD50, GNDHDD65, PELLETBTU
-# HEATCNTL replaces EQUIPMUSE
-# COOLCNTL replaces USECENAC
-# CDD30YR_PUB replaces CDD30YR
-# BA_climate replaces CLIMATE_REGION_PUB 
-# IECC_climate_code replaces IECC_CLIMATE_PUB
-# HDD30YR_PUB replaces HDD30YR
-# BTUWD replaces WOODBTU
-# BRR weights are NWEIGHT
-
-recs <- recs_in %>%
-   select(DOEID, REGIONC, DIVISION, STATE_FIPS, state_postal, state_name, UATYP10, TYPEHUQ, YEARMADERANGE, HEATHOME, HEATCNTL, TEMPHOME, TEMPGONE, TEMPNITE, AIRCOND, COOLCNTL, TEMPHOMEAC, TEMPGONEAC, TEMPNITEAC, TOTCSQFT, TOTHSQFT, TOTSQFT_EN, NWEIGHT, starts_with("NWEIGHT"), CDD30YR=CDD30YR_PUB, CDD65, BA_climate, IECC_climate_code, HDD30YR=HDD30YR_PUB, HDD65, BTUEL, DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, TOTALBTU, TOTALDOL, BTUWOOD=BTUWD) %>%
-  mutate(
-    Region=parse_factor(
-      str_to_title(REGIONC),
-      levels=c("Northeast", "Midwest", "South", "West")),
-    Division=parse_factor(
-      DIVISION, levels=c("New England", "Middle Atlantic", "East North Central", "West North Central", "South Atlantic", "East South Central", "West South Central", "Mountain North", "Mountain South", "Pacific")),
-    Urbanicity=parse_factor(
-      case_when(
-        UATYP10=="U"~"Urban Area",
-        UATYP10=="C"~"Urban Cluster",
-        UATYP10=="R"~"Rural"
-      ),
-      levels=c("Urban Area", "Urban Cluster", "Rural")
-    ),
-    HousingUnitType=parse_factor(
-      case_when(
-        TYPEHUQ==1~"Mobile home",
-        TYPEHUQ==2~"Single-family detached",
-        TYPEHUQ==3~"Single-family attached",
-        TYPEHUQ==4~"Apartment: 2-4 Units",
-        TYPEHUQ==5~"Apartment: 5 or more units",
-      ), levels=c("Mobile home", "Single-family detached", "Single-family attached", "Apartment: 2-4 Units", "Apartment: 5 or more units")),
-    YearMade=parse_factor(
-      case_when(
-        YEARMADERANGE==1~"Before 1950",
-        YEARMADERANGE==2~"1950-1959",
-        YEARMADERANGE==3~"1960-1969",
-        YEARMADERANGE==4~"1970-1979",
-        YEARMADERANGE==5~"1980-1989",
-        YEARMADERANGE==6~"1990-1999",
-        YEARMADERANGE==7~"2000-2009",
-        YEARMADERANGE==8~"2010-2015",
-        YEARMADERANGE==9~"2016-2020"
-      ),
-      levels=c("Before 1950", "1950-1959", "1960-1969", "1970-1979", "1980-1989", "1990-1999", "2000-2009", "2010-2015", "2016-2020"),
-      ordered = TRUE
-    ),
-    SpaceHeatingUsed=as.logical(HEATHOME),
-    HeatingBehavior=parse_factor(
-      case_when(
-        HEATCNTL==1~"Set one temp and leave it",
-        HEATCNTL==2~"Manually adjust at night/no one home",
-        HEATCNTL==3~"Programmable or smart thermostat automatically adjusts the temperature",
-        HEATCNTL==4~"Turn on or off as needed",
-        HEATCNTL==5~"No control",
-        HEATCNTL==99~"Other",
-        HEATCNTL==-2~NA_character_),
-      levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Programmable or smart thermostat automatically adjusts the temperature", "Turn on or off as needed", "No control", "Other")
-    ),
-    WinterTempDay=if_else(TEMPHOME>0, TEMPHOME, NA_real_),
-    WinterTempAway=if_else(TEMPGONE>0, TEMPGONE, NA_real_),
-    WinterTempNight=if_else(TEMPNITE>0, TEMPNITE, NA_real_),
-    ACUsed=as.logical(AIRCOND),
-    ACBehavior=parse_factor(
-      case_when(
-        COOLCNTL==1~"Set one temp and leave it",
-        COOLCNTL==2~"Manually adjust at night/no one home",
-        COOLCNTL==3~"Programmable or smart thermostat automatically adjusts the temperature",
-        COOLCNTL==4~"Turn on or off as needed",
-        COOLCNTL==5~"No control",
-        COOLCNTL==99~"Other",
-        COOLCNTL==-2~NA_character_),
-      levels=c("Set one temp and leave it", "Manually adjust at night/no one home", "Programmable or smart thermostat automatically adjusts the temperature", "Turn on or off as needed", "No control", "Other")
-    ),
-    SummerTempDay=if_else(TEMPHOMEAC>0, TEMPHOMEAC, NA_real_),
-    SummerTempAway=if_else(TEMPGONEAC>0, TEMPGONEAC, NA_real_),
-    SummerTempNight=if_else(TEMPNITEAC>0, TEMPNITEAC, NA_real_),
-    ClimateRegion_BA=parse_factor(BA_climate),
-    state_name=factor(state_name),
-    state_postal=fct_reorder(state_postal, as.numeric(state_name))
-    )
-```
-
-## Check derived variables for correct coding
-
-``` r
-recs %>% count(Region, REGIONC)
-```
-
-    ## # A tibble: 4 × 3
-    ##   Region    REGIONC       n
-    ##   <fct>     <chr>     <int>
-    ## 1 Northeast NORTHEAST  3657
-    ## 2 Midwest   MIDWEST    3832
-    ## 3 South     SOUTH      6426
-    ## 4 West      WEST       4581
-
-``` r
-recs %>% count(Division, DIVISION)
-```
-
-    ## # A tibble: 10 × 3
-    ##    Division           DIVISION               n
-    ##    <fct>              <chr>              <int>
-    ##  1 New England        New England         1680
-    ##  2 Middle Atlantic    Middle Atlantic     1977
-    ##  3 East North Central East North Central  2014
-    ##  4 West North Central West North Central  1818
-    ##  5 South Atlantic     South Atlantic      3256
-    ##  6 East South Central East South Central  1343
-    ##  7 West South Central West South Central  1827
-    ##  8 Mountain North     Mountain North      1180
-    ##  9 Mountain South     Mountain South       904
-    ## 10 Pacific            Pacific             2497
-
-``` r
-recs %>% count(Urbanicity, UATYP10)
-```
-
-    ## # A tibble: 3 × 3
-    ##   Urbanicity    UATYP10     n
-    ##   <fct>         <chr>   <int>
-    ## 1 Urban Area    U       12395
-    ## 2 Urban Cluster C        2020
-    ## 3 Rural         R        4081
-
-``` r
-recs %>% count(HousingUnitType, TYPEHUQ)
-```
-
-    ## # A tibble: 5 × 3
-    ##   HousingUnitType            TYPEHUQ     n
-    ##   <fct>                        <dbl> <int>
-    ## 1 Mobile home                      1   974
-    ## 2 Single-family detached           2 12319
-    ## 3 Single-family attached           3  1751
-    ## 4 Apartment: 2-4 Units             4  1013
-    ## 5 Apartment: 5 or more units       5  2439
-
-``` r
-recs %>% count(YearMade, YEARMADERANGE)
-```
-
-    ## # A tibble: 9 × 3
-    ##   YearMade    YEARMADERANGE     n
-    ##   <ord>               <dbl> <int>
-    ## 1 Before 1950             1  2721
-    ## 2 1950-1959               2  1685
-    ## 3 1960-1969               3  1867
-    ## 4 1970-1979               4  2817
-    ## 5 1980-1989               5  2435
-    ## 6 1990-1999               6  2451
-    ## 7 2000-2009               7  2748
-    ## 8 2010-2015               8   989
-    ## 9 2016-2020               9   783
-
-``` r
-recs %>% count(SpaceHeatingUsed, HEATHOME)
-```
-
-    ## # A tibble: 2 × 3
-    ##   SpaceHeatingUsed HEATHOME     n
-    ##   <lgl>               <dbl> <int>
-    ## 1 FALSE                   0   751
-    ## 2 TRUE                    1 17745
-
-``` r
-recs %>% count(HeatingBehavior, HEATCNTL)
-```
-
-    ## # A tibble: 7 × 3
-    ##   HeatingBehavior                                                        HEATCNTL     n
-    ##   <fct>                                                                     <dbl> <int>
-    ## 1 Set one temp and leave it                                                     1  7806
-    ## 2 Manually adjust at night/no one home                                          2  4654
-    ## 3 Programmable or smart thermostat automatically adjusts the temperature        3  3310
-    ## 4 Turn on or off as needed                                                      4  1491
-    ## 5 No control                                                                    5   438
-    ## 6 Other                                                                        99    46
-    ## 7 <NA>                                                                         -2   751
-
-``` r
-recs %>% count(ACUsed, AIRCOND)
-```
-
-    ## # A tibble: 2 × 3
-    ##   ACUsed AIRCOND     n
-    ##   <lgl>    <dbl> <int>
-    ## 1 FALSE        0  2325
-    ## 2 TRUE         1 16171
-
-``` r
-recs %>% count(ACBehavior, COOLCNTL)
-```
-
-    ## # A tibble: 7 × 3
-    ##   ACBehavior                                                             COOLCNTL     n
-    ##   <fct>                                                                     <dbl> <int>
-    ## 1 Set one temp and leave it                                                     1  6738
-    ## 2 Manually adjust at night/no one home                                          2  3637
-    ## 3 Programmable or smart thermostat automatically adjusts the temperature        3  2638
-    ## 4 Turn on or off as needed                                                      4  2746
-    ## 5 No control                                                                    5   409
-    ## 6 Other                                                                        99     3
-    ## 7 <NA>                                                                         -2  2325
-
-``` r
-recs %>% count(ClimateRegion_BA, BA_climate)
-```
-
-    ## # A tibble: 8 × 3
-    ##   ClimateRegion_BA BA_climate      n
-    ##   <fct>            <chr>       <int>
-    ## 1 Mixed-Dry        Mixed-Dry     142
-    ## 2 Mixed-Humid      Mixed-Humid  5579
-    ## 3 Hot-Humid        Hot-Humid    2545
-    ## 4 Hot-Dry          Hot-Dry      1577
-    ## 5 Very-Cold        Very-Cold     572
-    ## 6 Cold             Cold         7116
-    ## 7 Marine           Marine        911
-    ## 8 Subarctic        Subarctic      54
-
-``` r
-recs %>% count(state_postal, state_name, STATE_FIPS) %>% print(n=51)
-```
-
-    ## # A tibble: 51 × 4
-    ##    state_postal state_name           STATE_FIPS     n
-    ##    <fct>        <fct>                <chr>      <int>
-    ##  1 AL           Alabama              01           242
-    ##  2 AK           Alaska               02           311
-    ##  3 AZ           Arizona              04           495
-    ##  4 AR           Arkansas             05           268
-    ##  5 CA           California           06          1152
-    ##  6 CO           Colorado             08           360
-    ##  7 CT           Connecticut          09           294
-    ##  8 DE           Delaware             10           143
-    ##  9 DC           District of Columbia 11           221
-    ## 10 FL           Florida              12           655
-    ## 11 GA           Georgia              13           417
-    ## 12 HI           Hawaii               15           282
-    ## 13 ID           Idaho                16           270
-    ## 14 IL           Illinois             17           530
-    ## 15 IN           Indiana              18           400
-    ## 16 IA           Iowa                 19           286
-    ## 17 KS           Kansas               20           208
-    ## 18 KY           Kentucky             21           428
-    ## 19 LA           Louisiana            22           311
-    ## 20 ME           Maine                23           223
-    ## 21 MD           Maryland             24           359
-    ## 22 MA           Massachusetts        25           552
-    ## 23 MI           Michigan             26           388
-    ## 24 MN           Minnesota            27           325
-    ## 25 MS           Mississippi          28           168
-    ## 26 MO           Missouri             29           296
-    ## 27 MT           Montana              30           172
-    ## 28 NE           Nebraska             31           189
-    ## 29 NV           Nevada               32           231
-    ## 30 NH           New Hampshire        33           175
-    ## 31 NJ           New Jersey           34           456
-    ## 32 NM           New Mexico           35           178
-    ## 33 NY           New York             36           904
-    ## 34 NC           North Carolina       37           479
-    ## 35 ND           North Dakota         38           331
-    ## 36 OH           Ohio                 39           339
-    ## 37 OK           Oklahoma             40           232
-    ## 38 OR           Oregon               41           313
-    ## 39 PA           Pennsylvania         42           617
-    ## 40 RI           Rhode Island         44           191
-    ## 41 SC           South Carolina       45           334
-    ## 42 SD           South Dakota         46           183
-    ## 43 TN           Tennessee            47           505
-    ## 44 TX           Texas                48          1016
-    ## 45 UT           Utah                 49           188
-    ## 46 VT           Vermont              50           245
-    ## 47 VA           Virginia             51           451
-    ## 48 WA           Washington           53           439
-    ## 49 WV           West Virginia        54           197
-    ## 50 WI           Wisconsin            55           357
-    ## 51 WY           Wyoming              56           190
-
-## Save data
-
-``` r
-recs_out <- recs %>%
-  select(DOEID, starts_with("NWEIGHT"),
-         REGIONC, Region, Division, starts_with("state"), Urbanicity, 
-         HousingUnitType, YearMade, SpaceHeatingUsed, HeatingBehavior, 
-         WinterTempDay, WinterTempAway, WinterTempNight, ACUsed, 
-         ACBehavior, SummerTempDay, SummerTempAway, SummerTempNight, 
-         TOTCSQFT, TOTHSQFT, TOTSQFT_EN, 
-         CDD30YR, CDD65, ClimateRegion_BA, 
-         HDD30YR, HDD65, BTUEL, 
-         DOLLAREL, BTUNG, DOLLARNG, BTULP, DOLLARLP, BTUFO, DOLLARFO, 
-         TOTALBTU, TOTALDOL, BTUWOOD)
-
-
-
-
-source(here::here("helper-fun", "helper-function.R"))
-
-recs_2015 <- read_osf("recs_2015.rds")
-
-setdiff(names(recs_out), names(recs_2015)) #variables in 2020 and not 2015
-```
-
-    ##  [1] "NWEIGHT1"     "NWEIGHT2"     "NWEIGHT3"     "NWEIGHT4"     "NWEIGHT5"     "NWEIGHT6"     "NWEIGHT7"     "NWEIGHT8"     "NWEIGHT9"     "NWEIGHT10"   
-    ## [11] "NWEIGHT11"    "NWEIGHT12"    "NWEIGHT13"    "NWEIGHT14"    "NWEIGHT15"    "NWEIGHT16"    "NWEIGHT17"    "NWEIGHT18"    "NWEIGHT19"    "NWEIGHT20"   
-    ## [21] "NWEIGHT21"    "NWEIGHT22"    "NWEIGHT23"    "NWEIGHT24"    "NWEIGHT25"    "NWEIGHT26"    "NWEIGHT27"    "NWEIGHT28"    "NWEIGHT29"    "NWEIGHT30"   
-    ## [31] "NWEIGHT31"    "NWEIGHT32"    "NWEIGHT33"    "NWEIGHT34"    "NWEIGHT35"    "NWEIGHT36"    "NWEIGHT37"    "NWEIGHT38"    "NWEIGHT39"    "NWEIGHT40"   
-    ## [41] "NWEIGHT41"    "NWEIGHT42"    "NWEIGHT43"    "NWEIGHT44"    "NWEIGHT45"    "NWEIGHT46"    "NWEIGHT47"    "NWEIGHT48"    "NWEIGHT49"    "NWEIGHT50"   
-    ## [51] "NWEIGHT51"    "NWEIGHT52"    "NWEIGHT53"    "NWEIGHT54"    "NWEIGHT55"    "NWEIGHT56"    "NWEIGHT57"    "NWEIGHT58"    "NWEIGHT59"    "NWEIGHT60"   
-    ## [61] "STATE_FIPS"   "state_postal" "state_name"
-
-``` r
-setdiff(names(recs_2015), names(recs_out)) #variables in 2015 and not 2020
-```
-
-    ##   [1] "MSAStatus"          "TOTUCSQFT"          "TOTUSQFT"           "BRRWT1"             "BRRWT2"             "BRRWT3"             "BRRWT4"            
-    ##   [8] "BRRWT5"             "BRRWT6"             "BRRWT7"             "BRRWT8"             "BRRWT9"             "BRRWT10"            "BRRWT11"           
-    ##  [15] "BRRWT12"            "BRRWT13"            "BRRWT14"            "BRRWT15"            "BRRWT16"            "BRRWT17"            "BRRWT18"           
-    ##  [22] "BRRWT19"            "BRRWT20"            "BRRWT21"            "BRRWT22"            "BRRWT23"            "BRRWT24"            "BRRWT25"           
-    ##  [29] "BRRWT26"            "BRRWT27"            "BRRWT28"            "BRRWT29"            "BRRWT30"            "BRRWT31"            "BRRWT32"           
-    ##  [36] "BRRWT33"            "BRRWT34"            "BRRWT35"            "BRRWT36"            "BRRWT37"            "BRRWT38"            "BRRWT39"           
-    ##  [43] "BRRWT40"            "BRRWT41"            "BRRWT42"            "BRRWT43"            "BRRWT44"            "BRRWT45"            "BRRWT46"           
-    ##  [50] "BRRWT47"            "BRRWT48"            "BRRWT49"            "BRRWT50"            "BRRWT51"            "BRRWT52"            "BRRWT53"           
-    ##  [57] "BRRWT54"            "BRRWT55"            "BRRWT56"            "BRRWT57"            "BRRWT58"            "BRRWT59"            "BRRWT60"           
-    ##  [64] "BRRWT61"            "BRRWT62"            "BRRWT63"            "BRRWT64"            "BRRWT65"            "BRRWT66"            "BRRWT67"           
-    ##  [71] "BRRWT68"            "BRRWT69"            "BRRWT70"            "BRRWT71"            "BRRWT72"            "BRRWT73"            "BRRWT74"           
-    ##  [78] "BRRWT75"            "BRRWT76"            "BRRWT77"            "BRRWT78"            "BRRWT79"            "BRRWT80"            "BRRWT81"           
-    ##  [85] "BRRWT82"            "BRRWT83"            "BRRWT84"            "BRRWT85"            "BRRWT86"            "BRRWT87"            "BRRWT88"           
-    ##  [92] "BRRWT89"            "BRRWT90"            "BRRWT91"            "BRRWT92"            "BRRWT93"            "BRRWT94"            "BRRWT95"           
-    ##  [99] "BRRWT96"            "CDD80"              "ClimateRegion_IECC" "HDD50"              "GNDHDD65"           "BTUPELLET"
-
-``` r
-for (var in colnames(recs_out)) {
-  attr(recs_out[[deparse(as.name(var))]], "format.sas") <- NULL
-}
-
-cb_in <- readxl::read_xlsx(here::here("DataCleaningScripts", "RECS 2020 Codebook Questions.xlsx"), skip=1)
-
-cb_ord <- cb_in %>% 
-  mutate(
-    Order=row_number(),
-    Section=if_else(Section=="End-use Model", "CONSUMPTION AND EXPENDITURE", Section),
-    Section=fct_reorder(Section, Order, min)) 
-
-cb_slim <- cb_ord %>% select(Variable=BookDerived, `Description and Labels`, Question, Section, Order) %>%
-  filter(!is.na(Variable)) %>%
-  bind_rows(select(cb_ord, Variable, `Description and Labels`, Question, Section, Order)) %>%
-  arrange(Section, Order)
-
-names(recs_out)[!(names(recs_out) %in% pull(cb_slim, Variable))]
-```
-
-    ## character(0)
-
-``` r
-cb_vars <- cb_slim %>%
-  filter(Variable %in% c(names(recs_out)))
-
-nrow(cb_vars)
-```
-
-    ## [1] 100
-
-``` r
-ncol(recs_out)
-```
-
-    ## [1] 100
-
-``` r
-recs_ord <- recs_out %>% select(all_of(pull(cb_vars, Variable)))
-
-for (var in pull(cb_vars, Variable)) {
-  vi <- cb_vars %>% filter(Variable==var)
-  attr(recs_ord[[deparse(as.name(var))]], "label") <- pull(vi, `Description and Labels`)
-  attr(recs_ord[[deparse(as.name(var))]], "Section") <- pull(vi, Section) %>% as.character()
-  if (!is.na(pull(vi, Question))) attr(recs_ord[[deparse(as.name(var))]], "Question") <- pull(vi, Question)
-}
-```
-
-``` r
-summary(recs_ord)
-```
-
-    ##      DOEID           ClimateRegion_BA         Urbanicity          Region       REGIONC                        Division     STATE_FIPS       
-    ##  Min.   :100001   Cold       :7116    Urban Area   :12395   Northeast:3657   Length:18496       South Atlantic    :3256   Length:18496      
-    ##  1st Qu.:104625   Mixed-Humid:5579    Urban Cluster: 2020   Midwest  :3832   Class :character   Pacific           :2497   Class :character  
-    ##  Median :109249   Hot-Humid  :2545    Rural        : 4081   South    :6426   Mode  :character   East North Central:2014   Mode  :character  
-    ##  Mean   :109249   Hot-Dry    :1577                          West     :4581                      Middle Atlantic   :1977                     
-    ##  3rd Qu.:113872   Marine     : 911                                                              West South Central:1827                     
-    ##  Max.   :118496   Very-Cold  : 572                                                              West North Central:1818                     
-    ##                   (Other)    : 196                                                              (Other)           :5107                     
-    ##   state_postal           state_name        HDD65           CDD65         HDD30YR         CDD30YR                       HousingUnitType         YearMade   
-    ##  CA     : 1152   California   : 1152   Min.   :    0   Min.   :   0   Min.   :    0   Min.   :   0   Mobile home               :  974   1970-1979  :2817  
-    ##  TX     : 1016   Texas        : 1016   1st Qu.: 2434   1st Qu.: 814   1st Qu.: 2898   1st Qu.: 601   Single-family detached    :12319   2000-2009  :2748  
-    ##  NY     :  904   New York     :  904   Median : 4396   Median :1179   Median : 4825   Median :1020   Single-family attached    : 1751   Before 1950:2721  
-    ##  FL     :  655   Florida      :  655   Mean   : 4272   Mean   :1526   Mean   : 4679   Mean   :1310   Apartment: 2-4 Units      : 1013   1990-1999  :2451  
-    ##  PA     :  617   Pennsylvania :  617   3rd Qu.: 5810   3rd Qu.:1805   3rd Qu.: 6290   3rd Qu.:1703   Apartment: 5 or more units: 2439   1980-1989  :2435  
-    ##  MA     :  552   Massachusetts:  552   Max.   :17383   Max.   :5534   Max.   :16071   Max.   :4905                                      1960-1969  :1867  
-    ##  (Other):13600   (Other)      :13600                                                                                                    (Other)    :3457  
-    ##    TOTSQFT_EN       TOTHSQFT        TOTCSQFT     SpaceHeatingUsed   ACUsed       
-    ##  Min.   :  200   Min.   :    0   Min.   :    0   Mode :logical    Mode :logical  
-    ##  1st Qu.: 1100   1st Qu.: 1000   1st Qu.:  460   FALSE:751        FALSE:2325     
-    ##  Median : 1700   Median : 1520   Median : 1200   TRUE :17745      TRUE :16171    
-    ##  Mean   : 1960   Mean   : 1744   Mean   : 1394                                   
-    ##  3rd Qu.: 2510   3rd Qu.: 2300   3rd Qu.: 2000                                   
-    ##  Max.   :15000   Max.   :15000   Max.   :14600                                   
-    ##                                                                                  
-    ##                                                                HeatingBehavior WinterTempDay   WinterTempAway  WinterTempNight
-    ##  Set one temp and leave it                                             :7806   Min.   :50.00   Min.   :50.00   Min.   :50.00  
-    ##  Manually adjust at night/no one home                                  :4654   1st Qu.:68.00   1st Qu.:65.00   1st Qu.:65.00  
-    ##  Programmable or smart thermostat automatically adjusts the temperature:3310   Median :70.00   Median :68.00   Median :68.00  
-    ##  Turn on or off as needed                                              :1491   Mean   :69.77   Mean   :67.45   Mean   :68.01  
-    ##  No control                                                            : 438   3rd Qu.:72.00   3rd Qu.:70.00   3rd Qu.:70.00  
-    ##  Other                                                                 :  46   Max.   :90.00   Max.   :90.00   Max.   :90.00  
-    ##  NA                                                                    : 751   NA's   :751     NA's   :751     NA's   :751    
-    ##                                                                   ACBehavior   SummerTempDay   SummerTempAway  SummerTempNight    NWEIGHT       
-    ##  Set one temp and leave it                                             :6738   Min.   :50.00   Min.   :50.00   Min.   :50.00   Min.   :  437.9  
-    ##  Manually adjust at night/no one home                                  :3637   1st Qu.:70.00   1st Qu.:70.00   1st Qu.:68.00   1st Qu.: 4018.7  
-    ##  Programmable or smart thermostat automatically adjusts the temperature:2638   Median :72.00   Median :74.00   Median :72.00   Median : 6119.4  
-    ##  Turn on or off as needed                                              :2746   Mean   :72.01   Mean   :73.45   Mean   :71.22   Mean   : 6678.7  
-    ##  No control                                                            : 409   3rd Qu.:75.00   3rd Qu.:78.00   3rd Qu.:74.00   3rd Qu.: 8890.0  
-    ##  Other                                                                 :   3   Max.   :90.00   Max.   :90.00   Max.   :90.00   Max.   :29279.1  
-    ##  NA                                                                    :2325   NA's   :2325    NA's   :2325    NA's   :2325                     
-    ##     NWEIGHT1        NWEIGHT2        NWEIGHT3        NWEIGHT4        NWEIGHT5        NWEIGHT6        NWEIGHT7        NWEIGHT8        NWEIGHT9    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3950   1st Qu.: 3951   1st Qu.: 3954   1st Qu.: 3953   1st Qu.: 3957   1st Qu.: 3966   1st Qu.: 3944   1st Qu.: 3956   1st Qu.: 3947  
-    ##  Median : 6136   Median : 6151   Median : 6151   Median : 6153   Median : 6134   Median : 6147   Median : 6135   Median : 6151   Median : 6139  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8976   3rd Qu.: 8979   3rd Qu.: 8994   3rd Qu.: 8998   3rd Qu.: 8987   3rd Qu.: 8984   3rd Qu.: 8998   3rd Qu.: 8988   3rd Qu.: 8974  
-    ##  Max.   :30015   Max.   :29422   Max.   :29431   Max.   :29494   Max.   :30039   Max.   :29419   Max.   :29586   Max.   :29499   Max.   :29845  
-    ##                                                                                                                                                 
-    ##    NWEIGHT10       NWEIGHT11       NWEIGHT12       NWEIGHT13       NWEIGHT14       NWEIGHT15       NWEIGHT16       NWEIGHT17       NWEIGHT18    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3961   1st Qu.: 3950   1st Qu.: 3947   1st Qu.: 3967   1st Qu.: 3962   1st Qu.: 3958   1st Qu.: 3958   1st Qu.: 3958   1st Qu.: 3937  
-    ##  Median : 6163   Median : 6140   Median : 6160   Median : 6142   Median : 6154   Median : 6145   Median : 6133   Median : 6126   Median : 6155  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8994   3rd Qu.: 8991   3rd Qu.: 8988   3rd Qu.: 8977   3rd Qu.: 8981   3rd Qu.: 8997   3rd Qu.: 8979   3rd Qu.: 8977   3rd Qu.: 8993  
-    ##  Max.   :29635   Max.   :29681   Max.   :29849   Max.   :29843   Max.   :30184   Max.   :29970   Max.   :29825   Max.   :30606   Max.   :29689  
-    ##                                                                                                                                                 
-    ##    NWEIGHT19       NWEIGHT20       NWEIGHT21       NWEIGHT22       NWEIGHT23       NWEIGHT24       NWEIGHT25       NWEIGHT26       NWEIGHT27    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3947   1st Qu.: 3943   1st Qu.: 3960   1st Qu.: 3964   1st Qu.: 3943   1st Qu.: 3946   1st Qu.: 3952   1st Qu.: 3966   1st Qu.: 3942  
-    ##  Median : 6153   Median : 6139   Median : 6135   Median : 6149   Median : 6148   Median : 6136   Median : 6150   Median : 6136   Median : 6125  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8979   3rd Qu.: 8992   3rd Qu.: 8956   3rd Qu.: 8988   3rd Qu.: 8980   3rd Qu.: 8978   3rd Qu.: 8972   3rd Qu.: 8980   3rd Qu.: 8996  
-    ##  Max.   :29336   Max.   :30274   Max.   :29766   Max.   :29791   Max.   :30126   Max.   :29946   Max.   :30445   Max.   :29893   Max.   :30030  
-    ##                                                                                                                                                 
-    ##    NWEIGHT28       NWEIGHT29       NWEIGHT30       NWEIGHT31       NWEIGHT32       NWEIGHT33       NWEIGHT34       NWEIGHT35       NWEIGHT36    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3956   1st Qu.: 3970   1st Qu.: 3956   1st Qu.: 3944   1st Qu.: 3954   1st Qu.: 3964   1st Qu.: 3950   1st Qu.: 3967   1st Qu.: 3948  
-    ##  Median : 6149   Median : 6146   Median : 6149   Median : 6144   Median : 6159   Median : 6148   Median : 6139   Median : 6141   Median : 6149  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8989   3rd Qu.: 8979   3rd Qu.: 8991   3rd Qu.: 8994   3rd Qu.: 8982   3rd Qu.: 8993   3rd Qu.: 8985   3rd Qu.: 8990   3rd Qu.: 8979  
-    ##  Max.   :29599   Max.   :30136   Max.   :29895   Max.   :29604   Max.   :29310   Max.   :29408   Max.   :29564   Max.   :30437   Max.   :27896  
-    ##                                                                                                                                                 
-    ##    NWEIGHT37       NWEIGHT38       NWEIGHT39       NWEIGHT40       NWEIGHT41       NWEIGHT42       NWEIGHT43       NWEIGHT44       NWEIGHT45    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3955   1st Qu.: 3954   1st Qu.: 3940   1st Qu.: 3959   1st Qu.: 3975   1st Qu.: 3949   1st Qu.: 3947   1st Qu.: 3956   1st Qu.: 3952  
-    ##  Median : 6133   Median : 6139   Median : 6147   Median : 6144   Median : 6153   Median : 6137   Median : 6157   Median : 6148   Median : 6149  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8975   3rd Qu.: 8974   3rd Qu.: 8991   3rd Qu.: 8980   3rd Qu.: 8982   3rd Qu.: 8988   3rd Qu.: 9005   3rd Qu.: 8986   3rd Qu.: 8992  
-    ##  Max.   :30596   Max.   :30130   Max.   :29262   Max.   :30344   Max.   :29594   Max.   :29938   Max.   :29878   Max.   :29896   Max.   :29729  
-    ##                                                                                                                                                 
-    ##    NWEIGHT46       NWEIGHT47       NWEIGHT48       NWEIGHT49       NWEIGHT50       NWEIGHT51       NWEIGHT52       NWEIGHT53       NWEIGHT54    
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0  
-    ##  1st Qu.: 3966   1st Qu.: 3938   1st Qu.: 3953   1st Qu.: 3947   1st Qu.: 3948   1st Qu.: 3958   1st Qu.: 3938   1st Qu.: 3959   1st Qu.: 3954  
-    ##  Median : 6152   Median : 6150   Median : 6139   Median : 6146   Median : 6159   Median : 6150   Median : 6154   Median : 6156   Median : 6151  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679  
-    ##  3rd Qu.: 8959   3rd Qu.: 8991   3rd Qu.: 8991   3rd Qu.: 8990   3rd Qu.: 8995   3rd Qu.: 8992   3rd Qu.: 9012   3rd Qu.: 8979   3rd Qu.: 8973  
-    ##  Max.   :29103   Max.   :30070   Max.   :29343   Max.   :29590   Max.   :30027   Max.   :29247   Max.   :29445   Max.   :30131   Max.   :29439  
-    ##                                                                                                                                                 
-    ##    NWEIGHT55       NWEIGHT56       NWEIGHT57       NWEIGHT58       NWEIGHT59       NWEIGHT60         BTUEL             DOLLAREL           BTUNG        
-    ##  Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :    0   Min.   :   143.3   Min.   : -889.5   Min.   :      0  
-    ##  1st Qu.: 3945   1st Qu.: 3957   1st Qu.: 3942   1st Qu.: 3962   1st Qu.: 3965   1st Qu.: 3953   1st Qu.: 20205.8   1st Qu.:  836.5   1st Qu.:      0  
-    ##  Median : 6143   Median : 6153   Median : 6138   Median : 6137   Median : 6144   Median : 6140   Median : 31890.0   Median : 1257.9   Median :  22012  
-    ##  Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 6679   Mean   : 37016.2   Mean   : 1424.8   Mean   :  36961  
-    ##  3rd Qu.: 8977   3rd Qu.: 8995   3rd Qu.: 9004   3rd Qu.: 8986   3rd Qu.: 8977   3rd Qu.: 8983   3rd Qu.: 48298.0   3rd Qu.: 1819.0   3rd Qu.:  62714  
-    ##  Max.   :29216   Max.   :29203   Max.   :29819   Max.   :29818   Max.   :29606   Max.   :29818   Max.   :628155.5   Max.   :15680.2   Max.   :1134709  
-    ##                                                                                                                                                        
-    ##     DOLLARNG          BTULP           DOLLARLP           BTUFO           DOLLARFO          BTUWOOD          TOTALBTU          TOTALDOL      
-    ##  Min.   :   0.0   Min.   :     0   Min.   :   0.00   Min.   :     0   Min.   :   0.00   Min.   :     0   Min.   :   1182   Min.   : -150.5  
-    ##  1st Qu.:   0.0   1st Qu.:     0   1st Qu.:   0.00   1st Qu.:     0   1st Qu.:   0.00   1st Qu.:     0   1st Qu.:  45565   1st Qu.: 1258.3  
-    ##  Median : 313.9   Median :     0   Median :   0.00   Median :     0   Median :   0.00   Median :     0   Median :  74180   Median : 1793.2  
-    ##  Mean   : 396.0   Mean   :  3917   Mean   :  80.89   Mean   :  5109   Mean   :  88.43   Mean   :  3596   Mean   :  83002   Mean   : 1990.2  
-    ##  3rd Qu.: 644.9   3rd Qu.:     0   3rd Qu.:   0.00   3rd Qu.:     0   3rd Qu.:   0.00   3rd Qu.:     0   3rd Qu.: 108535   3rd Qu.: 2472.0  
-    ##  Max.   :8155.0   Max.   :364215   Max.   :6621.44   Max.   :426269   Max.   :7003.69   Max.   :500000   Max.   :1367548   Max.   :20043.4  
-    ## 
-
-``` r
-str(recs_ord)
-```
-
-    ## tibble [18,496 × 100] (S3: tbl_df/tbl/data.frame)
-    ##  $ DOEID           : num [1:18496] 1e+05 1e+05 1e+05 1e+05 1e+05 ...
-    ##   ..- attr(*, "label")= chr "Unique identifier for each respondent"
-    ##   ..- attr(*, "Section")= chr "ADMIN"
-    ##  $ ClimateRegion_BA: Factor w/ 8 levels "Mixed-Dry","Mixed-Humid",..: 1 2 1 2 2 3 2 2 2 4 ...
-    ##   ..- attr(*, "label")= chr "Building America Climate Zone"
-    ##   ..- attr(*, "Section")= chr "ADMIN"
-    ##  $ Urbanicity      : Factor w/ 3 levels "Urban Area","Urban Cluster",..: 1 1 1 1 1 1 1 2 1 1 ...
-    ##   ..- attr(*, "label")= chr "2010 Census Urban Type Code"
-    ##   ..- attr(*, "Section")= chr "ADMIN"
-    ##  $ Region          : Factor w/ 4 levels "Northeast","Midwest",..: 4 3 4 3 1 3 3 3 3 4 ...
-    ##   ..- attr(*, "label")= chr "Census Region"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ REGIONC         : chr [1:18496] "WEST" "SOUTH" "WEST" "SOUTH" ...
-    ##   ..- attr(*, "label")= chr "Census Region"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ Division        : Factor w/ 10 levels "New England",..: 9 7 9 5 2 7 7 6 5 9 ...
-    ##   ..- attr(*, "label")= chr "Census Division, Mountain Division is divided into North and South for RECS purposes"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ STATE_FIPS      : chr [1:18496] "35" "05" "35" "45" ...
-    ##   ..- attr(*, "label")= chr "State Federal Information Processing System Code"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ state_postal    : Factor w/ 51 levels "AL","AK","AZ",..: 32 4 32 41 31 44 37 25 9 3 ...
-    ##   ..- attr(*, "label")= chr "State Postal Code"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ state_name      : Factor w/ 51 levels "Alabama","Alaska",..: 32 4 32 41 31 44 37 25 9 3 ...
-    ##   ..- attr(*, "label")= chr "State Name"
-    ##   ..- attr(*, "Section")= chr "GEOGRAPHY"
-    ##  $ HDD65           : num [1:18496] 3844 3766 3819 2614 4219 ...
-    ##   ..- attr(*, "label")= chr "Heating degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations"
-    ##   ..- attr(*, "Section")= chr "WEATHER"
-    ##  $ CDD65           : num [1:18496] 1679 1458 1696 1718 1363 ...
-    ##   ..- attr(*, "label")= chr "Cooling degree days in 2020, base temperature 65F; Derived from the weighted temperatures of nearby weather stations"
-    ##   ..- attr(*, "Section")= chr "WEATHER"
-    ##  $ HDD30YR         : num [1:18496] 4451 4429 4500 3229 4896 ...
-    ##   ..- attr(*, "label")= chr "Heating degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inocu"| __truncated__
-    ##   ..- attr(*, "Section")= chr "WEATHER"
-    ##  $ CDD30YR         : num [1:18496] 1027 1305 1010 1653 1059 ...
-    ##   ..- attr(*, "label")= chr "Cooling degree days, 30-year average 1981-2010, base temperature 65F; Taken from nearest weather station, inocu"| __truncated__
-    ##   ..- attr(*, "Section")= chr "WEATHER"
-    ##  $ HousingUnitType : Factor w/ 5 levels "Mobile home",..: 2 5 5 2 5 2 2 5 5 5 ...
-    ##   ..- attr(*, "label")= chr "Type of housing unit"
-    ##   ..- attr(*, "Section")= chr "YOUR HOME"
-    ##   ..- attr(*, "Question")= chr "Which best describes your home?"
-    ##  $ YearMade        : Ord.factor w/ 9 levels "Before 1950"<..: 4 5 3 5 3 6 2 7 7 5 ...
-    ##   ..- attr(*, "label")= chr "Range when housing unit was built"
-    ##   ..- attr(*, "Section")= chr "YOUR HOME"
-    ##   ..- attr(*, "Question")= chr "Derived from: In what year was your home built? AND Although you do not know the exact year your home was built"| __truncated__
-    ##  $ TOTSQFT_EN      : num [1:18496] 2100 590 900 2100 800 4520 2100 900 750 760 ...
-    ##   ..- attr(*, "label")= chr "Total energy-consuming area (square footage) of the housing unit. Includes all main living areas; all basements"| __truncated__
-    ##   ..- attr(*, "Section")= chr "YOUR HOME"
-    ##  $ TOTHSQFT        : num [1:18496] 2100 590 900 2100 800 3010 1200 900 750 760 ...
-    ##   ..- attr(*, "label")= chr "Square footage of the housing unit that is heated by space heating equipment. A derived variable rounded to the nearest 10"
-    ##   ..- attr(*, "Section")= chr "YOUR HOME"
-    ##  $ TOTCSQFT        : num [1:18496] 2100 590 900 2100 800 3010 1200 0 500 760 ...
-    ##   ..- attr(*, "label")= chr "Square footage of the housing unit that is cooled by air-conditioning equipment or evaporative cooler, a derive"| __truncated__
-    ##   ..- attr(*, "Section")= chr "YOUR HOME"
-    ##  $ SpaceHeatingUsed: logi [1:18496] TRUE TRUE TRUE TRUE TRUE TRUE ...
-    ##   ..- attr(*, "label")= chr "Space heating equipment used"
-    ##   ..- attr(*, "Section")= chr "SPACE HEATING"
-    ##   ..- attr(*, "Question")= chr "Is your home heated during the winter?"
-    ##  $ ACUsed          : logi [1:18496] TRUE TRUE TRUE TRUE TRUE TRUE ...
-    ##   ..- attr(*, "label")= chr "Air conditioning equipment used"
-    ##   ..- attr(*, "Section")= chr "AIR CONDITIONING"
-    ##   ..- attr(*, "Question")= chr "Is any air conditioning equipment used in your home?"
-    ##  $ HeatingBehavior : Factor w/ 7 levels "Set one temp and leave it",..: 1 4 1 1 1 3 2 2 1 2 ...
-    ##   ..- attr(*, "label")= chr "Winter temperature control method"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "Which of the following best describes how your household controls the indoor temperature during the winter?"
-    ##  $ WinterTempDay   : num [1:18496] 70 70 69 68 68 76 74 70 68 70 ...
-    ##   ..- attr(*, "label")= chr "Winter thermostat setting or temperature in home when someone is home during the day"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the winter, what is your home’s typical indoor temperature when someone is home during the day?"
-    ##  $ WinterTempAway  : num [1:18496] 70 65 68 68 68 76 65 70 60 70 ...
-    ##   ..- attr(*, "label")= chr "Winter thermostat setting or temperature in home when no one is home during the day"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the winter, what is your home’s typical indoor temperature when no one is inside your home during the day?"
-    ##  $ WinterTempNight : num [1:18496] 68 65 67 68 68 68 74 68 62 68 ...
-    ##   ..- attr(*, "label")= chr "Winter thermostat setting or temperature in home at night"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the winter, what is your home’s typical indoor temperature inside your home at night?"
-    ##  $ ACBehavior      : Factor w/ 7 levels "Set one temp and leave it",..: 1 4 1 1 2 3 2 7 2 2 ...
-    ##   ..- attr(*, "label")= chr "Summer temperature control method"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "Which of the following best describes how your household controls the indoor temperature during the summer?"
-    ##  $ SummerTempDay   : num [1:18496] 71 68 70 72 72 69 68 NA 72 74 ...
-    ##   ..- attr(*, "label")= chr "Summer thermostat setting or temperature in home when someone is home during the day"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the summer, what is your home’s typical indoor temperature when someone is home during the day?"
-    ##  $ SummerTempAway  : num [1:18496] 71 68 68 72 72 74 70 NA 76 74 ...
-    ##   ..- attr(*, "label")= chr "Summer thermostat setting or temperature in home when no one is home during the day"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the summer, what is your home’s typical indoor temperature when no one is inside your home during the day?"
-    ##  $ SummerTempNight : num [1:18496] 71 68 68 72 72 68 70 NA 68 72 ...
-    ##   ..- attr(*, "label")= chr "Summer thermostat setting or temperature in home at night"
-    ##   ..- attr(*, "Section")= chr "THERMOSTAT"
-    ##   ..- attr(*, "Question")= chr "During the summer, what is your home’s typical indoor temperature inside your home at night?"
-    ##  $ NWEIGHT         : num [1:18496] 3284 9007 5669 5294 9935 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT1        : num [1:18496] 3273 9020 5793 5361 10048 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 1"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT2        : num [1:18496] 3349 9081 5914 5362 10262 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 2"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT3        : num [1:18496] 3345 9020 5763 5371 10037 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 3"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT4        : num [1:18496] 3437 9213 5870 5393 9961 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 4"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT5        : num [1:18496] 3416 9117 5721 5328 10108 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 5"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT6        : num [1:18496] 3355 9179 5663 5354 10298 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 6"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT7        : num [1:18496] 3372 9096 5700 5325 10065 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 7"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT8        : num [1:18496] 3364 8920 5704 5376 10097 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 8"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT9        : num [1:18496] 3362 9189 5668 5391 10321 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 9"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT10       : num [1:18496] 3302 9060 5793 5501 9944 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 10"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT11       : num [1:18496] 3211 9127 5806 5427 10267 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 11"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT12       : num [1:18496] 3500 9264 5650 5384 10127 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 12"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT13       : num [1:18496] 3314 9222 5648 5302 10241 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 13"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT14       : num [1:18496] 3359 9199 5829 5362 9872 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 14"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT15       : num [1:18496] 3424 9143 5642 5383 10275 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 15"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT16       : num [1:18496] 3384 9042 5718 5381 9921 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 16"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT17       : num [1:18496] 3312 9417 5969 5418 10312 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 17"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT18       : num [1:18496] 3324 9163 5828 5356 10004 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 18"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT19       : num [1:18496] 3367 9192 5814 5343 10437 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 19"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT20       : num [1:18496] 3327 9092 5697 5360 10101 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 20"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT21       : num [1:18496] 3340 0 5687 5336 9982 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 21"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT22       : num [1:18496] 3292 9098 5739 5390 10000 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 22"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT23       : num [1:18496] 3278 9320 5945 5397 10180 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 23"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT24       : num [1:18496] 3340 9081 5820 5448 9826 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 24"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT25       : num [1:18496] 3386 9406 5823 5382 10149 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 25"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT26       : num [1:18496] 3301 9256 5650 5387 0 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 26"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT27       : num [1:18496] 3312 9318 5862 5351 10141 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 27"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT28       : num [1:18496] 3348 9154 5707 5371 9948 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 28"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT29       : num [1:18496] 3356 9372 5619 5362 10065 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 29"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT30       : num [1:18496] 3322 9137 5796 5381 10083 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 30"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT31       : num [1:18496] 3256 9233 5995 5320 10133 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 31"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT32       : num [1:18496] 3318 9115 0 5339 9978 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 32"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT33       : num [1:18496] 3402 9177 5638 0 10213 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 33"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT34       : num [1:18496] 3364 9191 5619 5380 9964 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 34"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT35       : num [1:18496] 3304 9100 5652 5363 10071 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 35"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT36       : num [1:18496] 3333 9072 5834 5477 9988 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 36"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT37       : num [1:18496] 3390 9263 5712 5386 10120 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 37"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT38       : num [1:18496] 3382 9078 5765 5326 10024 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 38"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT39       : num [1:18496] 3329 9011 5887 5421 10024 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 39"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT40       : num [1:18496] 3293 9166 5650 5370 10185 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 40"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT41       : num [1:18496] 3295 9091 5958 5339 10069 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 41"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT42       : num [1:18496] 3414 9194 5593 5329 9959 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 42"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT43       : num [1:18496] 3264 9215 6035 5409 10352 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 43"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT44       : num [1:18496] 3342 9048 5732 5416 10092 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 44"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT45       : num [1:18496] 3275 9259 5877 5453 10228 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 45"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT46       : num [1:18496] 3364 9171 5654 5449 10069 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 46"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT47       : num [1:18496] 3336 9260 5763 5376 9996 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 47"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT48       : num [1:18496] 3329 9105 5929 5408 10198 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 48"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT49       : num [1:18496] 3348 9117 5772 5400 10094 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 49"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT50       : num [1:18496] 3357 9261 5785 5359 10196 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 50"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT51       : num [1:18496] 3335 8955 5636 5448 10017 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 51"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT52       : num [1:18496] 3240 9000 5944 5344 9954 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 52"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT53       : num [1:18496] 3430 9290 5684 5438 10051 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 53"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT54       : num [1:18496] 3294 9199 5736 5378 10019 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 54"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT55       : num [1:18496] 3398 8959 5675 5357 10310 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 55"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT56       : num [1:18496] 3293 9233 5661 5421 10143 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 56"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT57       : num [1:18496] 0 9140 5917 5365 10177 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 57"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT58       : num [1:18496] 3370 9307 5571 5402 10043 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 58"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT59       : num [1:18496] 3358 9062 5887 5403 10248 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 59"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ NWEIGHT60       : num [1:18496] 3404 8958 5838 5351 10110 ...
-    ##   ..- attr(*, "label")= chr "Final Analysis Weight for replicate 60"
-    ##   ..- attr(*, "Section")= chr "WEIGHTS"
-    ##  $ BTUEL           : num [1:18496] 42723 17889 8147 31647 20027 ...
-    ##   ..- attr(*, "label")= chr "Total electricity use, in thousand Btu, 2020, including self-generation of solar power"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ DOLLAREL        : num [1:18496] 1955 713 335 1425 1087 ...
-    ##   ..- attr(*, "label")= chr "Total electricity cost, in dollars, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ BTUNG           : num [1:18496] 101924 10145 22603 55119 39100 ...
-    ##   ..- attr(*, "label")= chr "Total natural gas use, in thousand Btu, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ DOLLARNG        : num [1:18496] 702 262 188 637 376 ...
-    ##   ..- attr(*, "label")= chr "Total natural gas cost, in dollars, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ BTULP           : num [1:18496] 0 0 0 0 0 0 0 0 0 0 ...
-    ##   ..- attr(*, "label")= chr "Total propane use, in thousand Btu, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ DOLLARLP        : num [1:18496] 0 0 0 0 0 0 0 0 0 0 ...
-    ##   ..- attr(*, "label")= chr "Total propane cost, in dollars, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ BTUFO           : num [1:18496] 0 0 0 0 0 0 0 0 0 0 ...
-    ##   ..- attr(*, "label")= chr "Total fuel oil/kerosene use, in thousand Btu, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ DOLLARFO        : num [1:18496] 0 0 0 0 0 0 0 0 0 0 ...
-    ##   ..- attr(*, "label")= chr "Total fuel oil/kerosene cost, in dollars, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ BTUWOOD         : num [1:18496] 0 0 0 0 0 3000 0 0 0 0 ...
-    ##   ..- attr(*, "label")= chr "Total wood use, in thousand Btu, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##  $ TOTALBTU        : num [1:18496] 144648 28035 30750 86765 59127 ...
-    ##   ..- attr(*, "label")= chr "Total usage including electricity, natural gas, propane, and fuel oil, in thousand Btu, 2020"
-    ##   ..- attr(*, "Section")= chr "CONSUMPTION AND EXPENDITURE"
-    ##   [list output truncated]
-
-``` r
-recs_der_tmp_loc <- here::here("osf_dl", "recs_2020.rds")
-write_rds(recs_ord, recs_der_tmp_loc)
-target_dir <- osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") 
-osf_upload(target_dir, path=recs_der_tmp_loc, conflicts="overwrite")
-```
-
-    ## # A tibble: 1 × 3
-    ##   name          id                       meta            
-    ##   <chr>         <chr>                    <list>          
-    ## 1 recs_2020.rds 647d2d6bbf3d0f09b6d87a32 <named list [3]>
-
-``` r
-unlink(recs_der_tmp_loc)
-```
diff --git a/helper-fun/helper-function.R b/helper-fun/helper-function.R
deleted file mode 100644
index 1bf2f2d6..00000000
--- a/helper-fun/helper-function.R
+++ /dev/null
@@ -1,30 +0,0 @@
-read_osf <- function(filename){
-  #' Downloads file from OSF project
-  #' Reads in file
-  #' Deletes file from computer
-  
-  osf_dl_del_later <- !dir.exists("osf_dl")
-  
-  if (osf_dl_del_later) {
-    osf_dl_del_later <- TRUE
-    dir.create("osf_dl")
-  }
-  
-  dat_det <-
-    osf_retrieve_node("https://osf.io/gzbkn/?view_only=8ca80573293b4e12b7f934a0f742b957") %>%
-    osf_ls_files() %>%
-    dplyr::filter(name == filename) %>%
-    osf_download(conflicts = "overwrite", path = "osf_dl")
-  
-  out <- dat_det %>%
-    dplyr::pull(local_path) %>%
-    readr::read_rds()
-  
-  if (osf_dl_del_later) {
-    unlink("osf_dl", recursive = TRUE)
-  } else{
-    unlink(dplyr::pull(dat_det, local_path))
-  }
-  
-  return(out)
-}
\ No newline at end of file
diff --git a/renv.lock b/renv.lock
index dc399ea1..4feb4a43 100644
--- a/renv.lock
+++ b/renv.lock
@@ -1844,6 +1844,21 @@
       ],
       "Hash": "c77ebba142d814788bab0092bf102f6d"
     },
+    "srvyr.data": {
+      "Package": "srvyr.data",
+      "Version": "0.1.0",
+      "Source": "GitHub",
+      "RemoteType": "github",
+      "RemoteHost": "api.github.com",
+      "RemoteUsername": "tidy-survey-r",
+      "RemoteRepo": "srvyr.data",
+      "RemoteRef": "main",
+      "RemoteSha": "1f84dfdd630dde7fecc1a26b44543cd45674d08d",
+      "Requirements": [
+        "R"
+      ],
+      "Hash": "5a90f75ff1373bd3a1906d6712576c14"
+    },
     "stringi": {
       "Package": "stringi",
       "Version": "1.8.3",
@@ -1984,33 +1999,6 @@
       ],
       "Hash": "a84e2cc86d07289b3b6f5069df7a004c"
     },
-    "tidycensus": {
-      "Package": "tidycensus",
-      "Version": "1.4.1",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Requirements": [
-        "R",
-        "crayon",
-        "dplyr",
-        "httr",
-        "jsonlite",
-        "purrr",
-        "rappdirs",
-        "readr",
-        "rlang",
-        "rvest",
-        "sf",
-        "stringr",
-        "tidyr",
-        "tidyselect",
-        "tigris",
-        "units",
-        "utils",
-        "xml2"
-      ],
-      "Hash": "dabd8f284f9b186872cce03640ef829a"
-    },
     "tidylog": {
       "Package": "tidylog",
       "Version": "1.0.2",
@@ -2103,25 +2091,6 @@
       ],
       "Hash": "c328568cd14ea89a83bd4ca7f54ae07e"
     },
-    "tigris": {
-      "Package": "tigris",
-      "Version": "2.0.3",
-      "Source": "Repository",
-      "Repository": "CRAN",
-      "Requirements": [
-        "R",
-        "dplyr",
-        "httr",
-        "magrittr",
-        "methods",
-        "rappdirs",
-        "sf",
-        "stringr",
-        "utils",
-        "uuid"
-      ],
-      "Hash": "6dd14cb88733b84d2b9af9fb8f64dbc5"
-    },
     "timechange": {
       "Package": "timechange",
       "Version": "0.2.0",