One voice (#128)
* Making data plural

* Standardize A/C format

* Standardize cross-tab format

* change final section header in chapter 9 to not be "summary" to match all other chapters.

* Removing "you" language.

* Adjusting tense "we will.." to just "we..."

* Remove markdown comments

* Changing from target population to population of interest.

* Updates to ch1 from one voice review

* Edits to ch02 from one-voice

* Ch03 edits from one-voice review

* Ch04 updates from one-voice

* Fix broken reference link in ch04.

* Ch05 edits from one-voice

* Ch06 edits from one-voice

* Ch07 edits from one-voice

* Ch08 edits from one-voice

* Ch09 edits from one-voice

* Ch10 edits from one-voice

* Ch11 edits from one-voice

* Ch12 edits from one-voice

* Ch13 edits from one-voice

* Ch14 edits from one-voice

* Appendix A edits from one-voice

* Adding blank line, to add a comment.

* Fixing reference type for Scott2007 to have author show up in bibliography.

* Adding spaces at ends of lines to add comment.

* Fixing typo in formula in ch7.

* Adding space to end of line to add a comment.

* Adding space at end of line to add comment.

* Fix ref to C10

* SZ full book review (#129)

* Change interaction example (#130)

* IV one voice review

---------

Co-authored-by: Stephanie Zimmer
Co-authored-by: Isabella Velasquez
3 people authored Apr 24, 2024
1 parent f7dcc4c commit 13ceea6
Showing 20 changed files with 970 additions and 755 deletions.
Large diffs are not rendered by default for the following files:

69 changes: 38 additions & 31 deletions 01-introduction.Rmd
96 changes: 48 additions & 48 deletions 02-overview-surveys.Rmd
40 changes: 19 additions & 21 deletions 03-survey-data-documentation.Rmd
97 changes: 49 additions & 48 deletions 04-set-up.Rmd
186 changes: 109 additions & 77 deletions 05-descriptive-analysis.Rmd
144 changes: 75 additions & 69 deletions 06-statistical-testing.Rmd
171 changes: 90 additions & 81 deletions 07-modeling.Rmd
83 changes: 42 additions & 41 deletions 08-communicating-results.Rmd
59 changes: 31 additions & 28 deletions 09-reproducible-data.Rmd
201 changes: 96 additions & 105 deletions 10-sample-designs-replicate-weights.Rmd
83 changes: 43 additions & 40 deletions 11-missing-data.Rmd
50 changes: 27 additions & 23 deletions 12-successful-survey-data-analysis.Rmd
199 changes: 144 additions & 55 deletions 13-ncvs-vignette.Rmd
36 changes: 20 additions & 16 deletions 14-ambarom-vignette.Rmd
110 changes: 63 additions & 47 deletions 89-Appendix-DataImport.Rmd

24 changes: 11 additions & 13 deletions 93-AppendixD.Rmd
@@ -6,7 +6,7 @@
knitr::opts_chunk$set(tidy = 'styler')
```

The chapter exercises use the survey design objects and packages provided in the Prerequisites box in the beginning of the chapter. Please ensure they are loaded in your environment before running the exercise solutions. Code chunks to load these are also included below.
The chapter exercises use the survey design objects and packages provided in the Prerequisites box in the beginning of the chapter. Please ensure they are loaded in the environment before running the exercise solutions. Code chunks to load these are also included below.


```r
@@ -243,7 +243,7 @@ pers_des <- pers_vsum_slim %>%
nest = TRUE
)
```

The chapter exercises use the survey design objects and packages provided in the Prerequisites box in the beginning of the chapter. Please ensure they are loaded in the environment before running the exercise solutions.

## 5 - Descriptive analysis {-}

@@ -420,7 +420,7 @@ quant_baenergyexp %>%

## 6 - Statistical testing {-}

1. Using the RECS data, do more than 50% of U.S. households use AC (`ACUsed`)?
1. Using the RECS data, do more than 50% of U.S. households use A/C (`ACUsed`)?

```{r}
#| label: stattest-ex-solution1
@@ -472,9 +472,7 @@ ttest_solution3

On average, those who voted for Joseph Biden in 2020 were `r ttest_solution3$estimate %>% round(1)` years younger than voters for other candidates and this is significantly different (p `r ttest_solution3$p.value %>% pretty_p_value()`).



4. If you wanted to determine if the political party affiliation differed for males and females, what test would you use?
4. If we wanted to determine if the political party affiliation differed for males and females, what test would we use?

a. Goodness of fit test (`svygofchisq()`)
b. Test of independence (`svychisq()`)
@@ -546,7 +544,7 @@ tidy(exp_unit_out)
Answer: The reference level should be `r expense_by_hut %>% slice(1) %>% pull(HousingUnitType) %>% as.character()`. All p-values are very small indicating there is a significant relationship between housing unit type and total energy expenditure.


2. Does temperature play a role in electricity expenditure (`DOLLAREL`)? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0. While a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F^[<https://www.eia.gov/energyexplained/units-and-calculators/degree-days.php>]. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.
2. Does temperature play a role in electricity expenditure? Cooling degree days are a measure of how hot a place is. CDD65 for a given day indicates the number of degrees Fahrenheit warmer than 65°F (18.3°C) it is in a location. On a day that averages 65°F and below, CDD65=0. While a day that averages 85°F (29.4°C) would have CDD65=20 because it is 20 degrees Fahrenheit warmer [@eia-cdd]. For each day in the year, this is summed to give an indicator of how hot the place is throughout the year. Similarly, HDD65 indicates the days colder than 65°F. Can energy expenditure be predicted using these temperature indicators along with square footage? Is there a significant relationship? Include main effects and two-way interactions.
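
The degree-day definition above can be checked with a small calculation. The following is a minimal sketch and not part of the exercise solution: the daily temperatures are made-up values for illustration, and `daily_avg_f`, `cdd65`, and `hdd65` are hypothetical names, not RECS variables.

```r
# Hypothetical daily average temperatures in degrees Fahrenheit
# (illustrative values only, not survey data)
daily_avg_f <- c(60, 65, 85, 90)

# Cooling degree days: degrees above 65°F, with days at or below 65°F set to 0
cdd65 <- pmax(daily_avg_f - 65, 0)

# Heating degree days: degrees below 65°F, with days at or above 65°F set to 0
hdd65 <- pmax(65 - daily_avg_f, 0)

# Summing across days gives the indicator for the whole period
sum(cdd65)
sum(hdd65)
```

The 85°F day contributes 20 cooling degree days, matching the example in the exercise text, while the 60°F day contributes only to the heating total.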

```{r}
#| label: model-ex-solution2
@@ -601,7 +599,7 @@ temps_sqft_exp_fit %>%
theme_minimal()
```

4. Early voting expanded in 2020^[<https://www.npr.org/2020/10/26/927803214/62-million-and-counting-americans-are-breaking-early-voting-records>]. Build a logistic model predicting early voting in 2020 (`EarlyVote2020`) using age (`Age`), education (`Education`), and party identification (`PartyID`). Include two-way interactions.
4. Early voting expanded in 2020 [@npr-voting-trend]. Build a logistic model predicting early voting in 2020 (`EarlyVote2020`) using age (`Age`), education (`Education`), and party identification (`PartyID`.) Include two-way interactions.

Answer:
```{r}
@@ -644,7 +642,8 @@ Answer: We predict that the 28 year old with a graduate degree who identifies as

## 10 - Specifying sample designs and replicate weights in {srvyr} {-}

1. The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS). The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description [@nhis-svy-des]. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation). You have imported the data and the variable containing the data is: `nhis_adult_data`. How would you specify the design using {srvyr} using either `as_survey_design()` or `as_survey_rep()`?

1. The National Health Interview Survey (NHIS) is an annual household survey conducted by the National Center for Health Statistics (NCHS.) The NHIS includes a wide variety of health topics for adults including health status and conditions, functioning and disability, health care access and health service utilization, health-related behaviors, health promotion, mental health, barriers to care, and community engagement. Like many national in-person surveys, the sampling design is a stratified clustered design with details included in the Survey Description [@nhis-svy-des]. The Survey Description provides information on setting up syntax in SUDAAN, Stata, SPSS, SAS, and R ({survey} package implementation.) We have imported the data and the variable containing the data as: `nhis_adult_data`. How would we specify the design using either `as_survey_design()` or `as_survey_rep()`?

Answer:

@@ -660,7 +659,7 @@ nhis_adult_des <- nhis_adult_data %>%
)
```

2. The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R [@gss-codebook]. You have imported the data and the variable containing the data is: `gss_data`. How would you specify the design in R using either `as_survey_design()` or `as_survey_rep()`?
2. The General Social Survey is a survey that has been administered since 1972 on social, behavioral, and attitudinal topics. The 2016-2020 GSS Panel codebook provides examples of setting up syntax in SAS and Stata but not R [@gss-codebook]. We have imported the data and the variable containing the data as: `gss_data`. How would we specify the design in R using either `as_survey_design()` or `as_survey_rep()`?

Answer:

@@ -675,7 +674,7 @@ gss_des <- gss_data %>%

## 13 - National Crime Victimization Survey Vignette {-}

1. What proportion of completed motor vehicle thefts are **not** reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529).
1. What proportion of completed motor vehicle thefts are **not** reported to the police? Hint: Use the codebook to look at the definition of Type of Crime (V4529.)

```{r}
#| label: ncvs-vign-ex-solution1
@@ -750,8 +749,7 @@ Answer: The difference between male and female victimization rate is estimated a

## 14 - AmericasBarometer Vignette {-}

1. Calculate the percentage of households with broadband internet in and those with any internet at home, including from a phone or tablet in Latin America and the Caribbean. Hint: if you come across countries with 0% internet usage, you may want to filter by something first.

1. Calculate the percentage of households with broadband internet and those with any internet at home, including from a phone or tablet in Latin America and the Caribbean. Hint: if there are countries with 0% internet usage, try filtering by something first.
Answer:

```{r}
8 changes: 6 additions & 2 deletions 99-references.Rmd
@@ -43,6 +43,10 @@ our_write_bib <- function (x = .packages(), file = "", tweak = TRUE, width = NUL
"\\1", cite$title)
cite$title = gsub(pkg, paste0("{", pkg, "}"), cite$title)
cite$title = gsub("\\b(R)\\b", "{R}", cite$title)
cite$title = gsub("\\b(ggplot2)\\b", "{ggplot2}", cite$title)
cite$title = gsub("\\b(dplyr)\\b", "{dplyr}", cite$title)
cite$title = gsub("\\b(tidyverse)\\b", "{tidyverse}", cite$title)
cite$title = gsub("\\b(sf)\\b", "{sf}", cite$title)
cite$title = gsub(" & ", " \\\\& ", cite$title)
}
entry = toBibtex(cite)
@@ -58,8 +62,8 @@ our_write_bib <- function (x = .packages(), file = "", tweak = TRUE, width = NUL
bib = lapply(bib, function(b) {
b["author"] = sub("Duncan Temple Lang", "Duncan {Temple Lang}",
b["author"])
b["title"] = gsub("(^|\\W)'([^']+)'(\\W|$)", "\\1\\2\\3",
b["title"])
# b["title"] = gsub("(^|\\W)'([^']+)'(\\W|$)", "\\1\\2\\3",
# b["title"])
if (!is.na(b["note"]))
b["note"] = gsub("(^.*?https?://.*?),\\s+https?://.*?(},\\s*)$",
"\\1\\2", b["note"])
56 changes: 54 additions & 2 deletions book.bib
@@ -296,7 +296,7 @@ @misc{recs-2020-meth
}
@misc{anes-2020-tech,
title = {{Methodology Report for the ANES 2020 Time Series Study}},
author = {{DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah}},
author = {DeBell, Matthew and Amsbary, Michelle and Brader, Ted and Brock, Shelley and Good, Cindy and Kamens, Justin and Maisel, Natalya and Pinto, Sarah},
year = 2022,
howpublished = {\url{https://electionstudies.org/wp-content/uploads/2022/08/anes_timeseries_2020_methodology_report.pdf}}
}
@@ -494,4 +494,56 @@ @misc{gss-codebook
editor = {NORC, Chicago},
year = 2021,
howpublished = {\url{https://gss.norc.org/Documents/codebook/2016-2020%20GSS%20Panel%20Codebook%20-%20R1a.pdf}}
}
}

@Book{ggplot2wickham,
author = {Hadley Wickham},
title = {{ggplot2}: Elegant Graphics for Data Analysis},
publisher = {Springer-Verlag New York},
year = {2016},
isbn = {978-3-319-24277-4},
url = {https://ggplot2.tidyverse.org},
}

@Article{gtsummarysjo,
author = {Daniel D. Sjoberg and Karissa Whiting and Michael Curry and Jessica A. Lavery and Joseph Larmarange},
title = {Reproducible Summary Tables with the {gtsummary} Package},
journal = {{The R Journal}},
year = {2021},
url = {https://doi.org/10.32614/RJ-2021-053},
doi = {10.32614/RJ-2021-053},
volume = {13},
issue = {1},
pages = {570-580},
}

@Article{targetslandau,
title = {The {targets} {R} package: a dynamic {Make}-like function-oriented pipeline toolkit for reproducibility and high-performance computing},
author = {William Michael Landau},
journal = {Journal of Open Source Software},
year = {2021},
volume = {6},
number = {57},
pages = {2959},
url = {https://doi.org/10.21105/joss.02959},
}

@Article{jsonliteooms,
title = {The {jsonlite} Package: A Practical and Consistent Mapping Between JSON Data and {R} Objects},
author = {Jeroen Ooms},
journal = {arXiv:1403.2805 [stat.CO]},
year = {2014},
url = {https://arxiv.org/abs/1403.2805},
}

@Article{visdattierney,
title = {{visdat}: Visualising Whole Data Frames},
author = {Nicholas Tierney},
doi = {10.21105/joss.00355},
url = {http://dx.doi.org/10.21105/joss.00355},
year = {2017},
journal = {Journal of Open Source Software},
volume = {2},
number = {16},
pages = {355}
}
3 changes: 0 additions & 3 deletions index.Rmd
@@ -16,12 +16,9 @@ github-repo: tidy-survey-r/tidy-survey-book
graphics: yes
#cover-image: images/cover.jpg
header-includes:
- \usepackage{draftwatermark}
- \usepackage[titles]{tocloft}
---

\SetWatermarkText{DRAFT}


```{r setup}
#| include: false
10 changes: 5 additions & 5 deletions renv.lock
@@ -2025,18 +2025,18 @@
},
"srvyrexploR": {
"Package": "srvyrexploR",
"Version": "0.0.0.9000",
"Version": "1.0.0",
"Source": "GitHub",
"RemoteType": "github",
"RemoteHost": "api.github.com",
"RemoteRepo": "srvyrexploR",
"RemoteUsername": "tidy-survey-r",
"RemoteRepo": "srvyrexploR",
"RemoteRef": "HEAD",
"RemoteSha": "914fc0fd0b7812d7d7260e15da882561602b21d2",
"RemoteSha": "e03f36c51c34f7d0f1a036246a15d3ed67806b4f",
"RemoteHost": "api.github.com",
"Requirements": [
"R"
],
"Hash": "3586abacf9e95b432824b9e9e60037d0"
"Hash": "30a1302b8eabd8d1a72228c799794665"
},
"stringi": {
"Package": "stringi",