From 7fc63498a655ae29b9993c403605b680f58b8fa1 Mon Sep 17 00:00:00 2001 From: Stephanie Zimmer Date: Sun, 28 Jul 2024 15:56:48 -0400 Subject: [PATCH 1/6] Pub edits c14 --- 14-ambarom-vignette.Rmd | 58 ++++++++++++++++++++--------------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index 5cda7d7..88bd75a 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -1,4 +1,4 @@ -# AmericasBarometer Vignette {#c14-ambarom-vignette} +# AmericasBarometer vignette {#c14-ambarom-vignette} \index{AmericasBarometer|(} \index{LAPOP|see {AmericasBarometer}} @@ -30,7 +30,7 @@ library(gt) library(ggpattern) ``` -This vignette uses a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the [LAPOP website.](http://datasets.americasbarometer.org/database/index.php) We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To import all files into R while ignoring the Stata labels, we recommend running the following code using the `read_stata()` function from the {haven} package [@R-haven]: +This vignette uses a subset of data from the 2021 AmericasBarometer survey. Download the raw files, available on the [LAPOP website](http://datasets.americasbarometer.org/database/index.php). We work with version 1.2 of the data, and there are separate files for each of the 22 countries. To import all files into R while ignoring the Stata labels, we recommend running the following code using the `read_stata()` function from the {haven} package [@R-haven]: ```r stata_files <- list.files(here("RawData", "LAPOP_2021"), "*.dta") @@ -58,13 +58,13 @@ The code above reads all the `.dta` files and combines them into one tibble. The AmericasBarometer surveys, conducted by the LAPOP Lab [@lapop], are public opinion surveys of the Americas focused on democracy. The study was launched in 2004/2005 with 11 countries. Though the participating countries change over time, AmericasBarometer maintains a consistent methodology across many of them. In 2021, the study included 22 countries ranging from Canada in the north to Chile and Argentina in the south [@lapop-about]. -Historically, surveys were administered through in-person household interviews, but the COVID-19 pandemic changed the study significantly. Now, random-digit dialing (RDD) of mobile phones is used in all countries except the United States and Canada [@lapop-tech]. In Canada, LAPOP collaborated with the Environics Institute to collect data from a panel of Canadians using a web survey [@lapop-can]. In the United States, YouGov conducted the survey on behalf of LAPOP by conducting a web survey among its panelists [@lapop-usa]. +Historically, surveys were administered through in-person household interviews, but the COVID-19 pandemic changed the study significantly. Now, random-digit dialing (RDD) of mobile phones is used in all countries except the United States and Canada [@lapop-tech]. In Canada, LAPOP collaborated with the Environics Institute to collect data from a panel of Canadians using a web survey [@lapop-can]. In the United States, YouGov conducted a web survey on behalf of LAPOP among its panelists [@lapop-usa]. The survey includes a core set of questions for all countries, but not every question is asked in each country. Additionally, some questions are only posed to half of the respondents in a country, with different randomized sections [@lapop-svy]. ## Data structure -Each country and year has its own file available in Stata format (`.dta`.) In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the prerequisites box. We recommend reviewing the core questionnaire to understand the common variables across the countries [@lapop-svy]. +Each country and year has its own file available in Stata format (`.dta`). In this vignette, we download and combine all the data from the 22 participating countries in 2021. We subset the data to a smaller set of columns, as noted in the Prerequisites box. We recommend reviewing the core questionnaire to understand the common variables across the countries [@lapop-svy]. ## Preparing files @@ -171,9 +171,9 @@ One interesting thing to note is that these weight variables can provide estimat When calculating estimates from the data, we use the survey design object `ambarom_des` and then apply the \index{Functions in srvyr!survey\_mean} `survey_mean()` function. The next sections walk through a few examples. -### Example: Worried about COVID-19 +### Example: Worry about COVID-19 -This survey was administered between March and August of 2021, with the specific timing varying by country.^[See Table 2 in @lapop-tech for dates by country] Given the state of the pandemic at that time, several questions about COVID-19 were included. The first question about COVID-19 asked: +This survey was administered between March and August 2021, with the specific timing varying by country^[See table 2 in @lapop-tech for dates by country]. Given the state of the pandemic at that time, several questions about COVID-19 were included. The first question about COVID-19 asked was: > How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months? > @@ -217,10 +217,10 @@ To view the results for all countries, we can use the {gt} package to create Tab #| label: ambarom-worry-gt covid_worry_country_ests_gt <- covid_worry_country_ests %>% gt(rowname_col = "Country") %>% - cols_label(p = "Percent", - p_se = "SE") %>% + cols_label(p = "%", + p_se = "S.E.") %>% fmt_number(decimals = 1) %>% - tab_source_note("AmericasBarometer Surveys, 2021") + tab_source_note(md("*Source*: AmericasBarometer Surveys, 2021")) ``` ```{r} @@ -329,20 +329,20 @@ covid_educ_ests_gt <- covid_educ_ests %>% gt(rowname_col = "Country") %>% cols_label( p_onlynormal = "%", - p_onlynormal_se = "SE", + p_onlynormal_se = "S.E.", p_mediumchange = "%", - p_mediumchange_se = "SE", + p_mediumchange_se = "S.E.", p_noschool = "%", - p_noschool_se = "SE" + p_noschool_se = "S.E." ) %>% - tab_spanner(label = "Normal school only", + tab_spanner(label = "Normal School Only", columns = c("p_onlynormal", "p_onlynormal_se")) %>% - tab_spanner(label = "Medium change", + tab_spanner(label = "Medium Change", columns = c("p_mediumchange", "p_mediumchange_se")) %>% - tab_spanner(label = "Cut ties with school", + tab_spanner(label = "Cut Ties with School", columns = c("p_noschool", "p_noschool_se")) %>% fmt_number(decimals = 1) %>% - tab_source_note("AmericasBarometer Surveys, 2021") + tab_source_note(md("*Source*: AmericasBarometer Surveys, 2021")) ``` ```{r} @@ -351,7 +351,7 @@ covid_educ_ests_gt <- covid_educ_ests %>% covid_educ_ests_gt ``` -(ref:ambarom-covid-ed-der-tab) Impact on education in households with children under the age of 13 who had children that would generally attend school +(ref:ambarom-covid-ed-der-tab) Impact on education in households with children under the age of 13 who generally attend school ```{r} #| label: ambarom-covid-ed-der-tab @@ -366,7 +366,7 @@ In the countries that were asked this question, many households experienced a ch ## Mapping survey data {#ambarom-maps} -While the table effectively presents the data, a map could also be insightful. To create a map of the countries, we can use the package {rnaturalearth} and subset North and South America with the `ne_countries()` function [@R-rnaturalearth]. The function returns a simple features (sf) object with many columns [@sf2023], but most importantly, `soverignt` (sovereignty), `geounit` (country or territory), and `geometry` (the shape.) For an example of the difference between sovereignty and country/territory, the United States, Puerto Rico, and the U.S. Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure \@ref(fig:ambarom-americas-map) using `geom_sf()` from the {ggplot2} package, which plots sf objects [@ggplot2wickham]. +While the table effectively presents the data, a map could also be insightful. To create a map of the countries, we can use the package {rnaturalearth} and subset North and South America with the `ne_countries()` function [@R-rnaturalearth]. The function returns a simple features (sf) object with many columns [@sf2023], but most importantly, `soverignt` (sovereignty), `geounit` (country or territory), and `geometry` (shape). For an example of the difference between sovereignty and country/territory, the United States, Puerto Rico, and the U.S. Virgin Islands are all separate units with the same sovereignty. A map without data is plotted in Figure \@ref(fig:ambarom-americas-map) using `geom_sf()` from the {ggplot2} package, which plots sf objects [@ggplot2wickham]. ```{r} #| label: ambarom-americas-map @@ -385,7 +385,7 @@ country_shape %>% geom_sf() ``` -The map in Figure \@ref(fig:ambarom-americas-map) appears very wide due to the Aleutian islands in Alaska extending into the Eastern Hemisphere. We can crop the shapefile to include only the Western Hemisphere using `st_crop()` from the {sf} package, which removes some of the trailing islands of Alaska. +The map in Figure \@ref(fig:ambarom-americas-map) appears very wide due to the Aleutian Islands in Alaska extending into the Eastern Hemisphere. We can crop the shapefile to include only the Western Hemisphere using `st_crop()` from the {sf} package, which removes some of the trailing islands of Alaska. ```{r} #| label: ambarom-update-map @@ -397,7 +397,7 @@ country_shape_crop <- country_shape %>% ymax = 90)) ``` -Now that we have the necessary shape files, our next step is to match our survey data to the map. Countries can be named differently (e.g., "U.S", "U.S.A", "United States".) To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data. To do this, we can use the `anti_join()` function from the {dplyr} package to identify the countries in the survey data that aren't in the map data. Table \@ref(tab:ambarom-map-merge-check-1-tab) shows the countries in the survey data but not the map data, and Table \@ref(tab:ambarom-map-merge-check-2-tab) shows the countries in the map data but not the survey data. As shown below, the United States is referred to as "United States" in the survey data but "United States of America" in the map data. +Now that we have the necessary shape files, our next step is to match our survey data to the map. Countries can be named differently (e.g., "U.S.", "U.S.A.", "United States"). To make sure we can visualize our survey data on the map, we need to match the country names in both the survey data and the map data. To do this, we can use the `anti_join()` function from the {dplyr} package to identify the countries in the survey data that are not in the map data. Table \@ref(tab:ambarom-map-merge-check-1-tab) shows the countries in the survey data but not the map data, and Table \@ref(tab:ambarom-map-merge-check-2-tab) shows the countries in the map data but not the survey data. As shown below, the United States is referred to as "United States" in the survey data but "United States of America" in the map data. ```{r} #| label: ambarom-map-merge-check-1-gt @@ -460,7 +460,7 @@ country_shape_upd <- country_shape_crop %>% "United States", geounit)) ``` -Now that the country names match, we can merge the survey and map data and then plot the resulting dataset. We begin with the map file and merge it with the survey estimates generated in Section \@ref(ambarom-estimates) (`covid_worry_country_ests` and `covid_educ_ests`.) We use the {dplyr} function of `full_join()`, which joins the rows in the map data and the survey estimates based on the columns `geounit` and `Country`. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an `NA` for the missing value [@sf2023]. +Now that the country names match, we can merge the survey and map data and then plot the resulting dataset. We begin with the map file and merge it with the survey estimates generated in Section \@ref(ambarom-estimates) (`covid_worry_country_ests` and `covid_educ_ests`). We use the {dplyr} function of `full_join()`, which joins the rows in the map data and the survey estimates based on the columns `geounit` and `Country`. A full join keeps all the rows from both datasets, matching rows when possible. For any rows without matches, the function fills in an `NA` for the missing value [@sf2023]. ```{r} #| label: ambarom-join-maps-ests @@ -471,12 +471,12 @@ covid_sf <- country_shape_upd %>% by = c("geounit" = "Country")) ``` -After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID-19 (Figure \@ref(fig:ambarom-make-maps-covid)) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure \@ref(fig:ambarom-make-maps-covid-ed).) We also add a crosshatch pattern to the countries without any data using the `geom_sf_pattern()` function from the {ggpattern} package [@R-ggpattern]. +After the merge, we create two figures that display the population estimates for the percentage of people worried about COVID-19 (Figure \@ref(fig:ambarom-make-maps-covid)) and the percentage of households with at least one child participating in virtual or hybrid learning (Figure \@ref(fig:ambarom-make-maps-covid-ed)). We also add a crosshatch pattern to the countries without any data using the `geom_sf_pattern()` function from the {ggpattern} package [@R-ggpattern]. ```{r} #| label: ambarom-make-maps-covid -#| fig.cap: "Percent of households worried someone in their household will get COVID-19 in the next 3 months by country" -#| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percent of households worried someone in their household will get COVID-19 in the next 3 months. The bottom of the range is 30% and the top of the range is 80%. Brazil and Chile look like the countries with the highest percentage of worry, with North America showing a lower percentage of worry. Countries without data, such as Venezuela, are displayed with a hash pattern." +#| fig.cap: "Percentage of households by country worried someone in their household will get COVID-19 in the next 3 months by country" +#| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percentage of households worried someone in their household will get COVID-19 in the next 3 months. The bottom of the range is 30% and the top of the range is 80%. Brazil and Chile look like the countries with the highest percentage of worry, with North America showing a lower percentage of worry. Countries without data, such as Venezuela, are displayed with a hash pattern." ggplot() + @@ -503,8 +503,8 @@ ggplot() + ```{r} #| label: ambarom-make-maps-covid-ed -#| fig.cap: "Percent of households who had at least one child participate in virtual or hybrid learning" -#| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percent of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." +#| fig.cap: "Percentage of households who had at least one child participate in virtual or hybrid learning" +#| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percentage of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." ggplot() + geom_sf( @@ -535,8 +535,8 @@ In Figure \@ref(fig:ambarom-make-maps-covid-ed), we observe missing data (repres ```{r} #| label: ambarom-make-maps-covid-ed-c-s -#| fig.cap: "Percent of households who had at least one child participate in virtual or hybrid learning, Central and South America" -#| fig.alt: "A choropleth map of Central and South America where the color scale filling in each country corresponds to the percent of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." +#| fig.cap: "Percentage of households who had at least one child participate in virtual or hybrid learning, in Central and South America" +#| fig.alt: "A choropleth map of Central and South America where the color scale filling in each country corresponds to the percentage of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." covid_c_s <- covid_sf %>% @@ -566,7 +566,7 @@ ggplot() + theme_minimal() ``` -In Figure \@ref(fig:ambarom-make-maps-covid-ed-c-s), we can see that most countries with available data have similar percentages (reflected in their similar shades.) However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning. +In Figure \@ref(fig:ambarom-make-maps-covid-ed-c-s), we can see that most countries with available data have similar percentages (reflected in their similar shades). However, Haiti stands out with a lighter shade, indicating a considerably lower percentage of households with at least one child participating in virtual or hybrid learning. ## Exercises From 01c12800d714d4643965640aadb32eb0de1d78ca Mon Sep 17 00:00:00 2001 From: Isabella Velasquez Date: Fri, 16 Aug 2024 15:16:06 -0700 Subject: [PATCH 2/6] Go through Stephanie/Editor edits --- 14-ambarom-vignette.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index 88bd75a..84515bd 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -475,7 +475,7 @@ After the merge, we create two figures that display the population estimates for ```{r} #| label: ambarom-make-maps-covid -#| fig.cap: "Percentage of households by country worried someone in their household will get COVID-19 in the next 3 months by country" +#| fig.cap: "Percentage of households by country worried someone in their household will get COVID-19 in the next 3 months" #| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percentage of households worried someone in their household will get COVID-19 in the next 3 months. The bottom of the range is 30% and the top of the range is 80%. Brazil and Chile look like the countries with the highest percentage of worry, with North America showing a lower percentage of worry. Countries without data, such as Venezuela, are displayed with a hash pattern." From f48458c96e7dae8da96fdc79ca0e3c77b8065b92 Mon Sep 17 00:00:00 2001 From: Isabella Velasquez Date: Sun, 18 Aug 2024 08:11:59 -0700 Subject: [PATCH 3/6] Add quotation --- 14-ambarom-vignette.Rmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index 84515bd..5e9c71a 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -175,12 +175,14 @@ When calculating estimates from the data, we use the survey design object `ambar This survey was administered between March and August 2021, with the specific timing varying by country^[See table 2 in @lapop-tech for dates by country]. Given the state of the pandemic at that time, several questions about COVID-19 were included. The first question about COVID-19 asked was: -> How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months? +> Question text: "How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?" > > - Very worried > - Somewhat worried > - A little worried > - Not worried at all +> +> Source: [@anes-svy] If we are interested in those who are very worried or somewhat worried, we can create a new variable (`CovidWorry_bin`) that groups levels of the original question using the `fct_collapse()` function from the {forcats} package [@R-forcats]. We then use the `survey_count()` function to understand how responses are distributed across each category of the original variable (`CovidWorry`) and the new variable (`CovidWorry_bin`.) \index{Functions in srvyr!survey\_count|(} From 538ed2bd66459e42893a0260e10d6851fc3622bf Mon Sep 17 00:00:00 2001 From: Isabella Velasquez Date: Sun, 18 Aug 2024 15:25:03 -0700 Subject: [PATCH 4/6] Edit quotation --- 14-ambarom-vignette.Rmd | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index 5e9c71a..efb2138 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -177,12 +177,12 @@ This survey was administered between March and August 2021, with the specific ti > Question text: "How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?" > -> - Very worried -> - Somewhat worried -> - A little worried -> - Not worried at all +> | - Very worried +> | - Somewhat worried +> | - A little worried +> | - Not worried at all > -> Source: [@anes-svy] +> Source: [@lapop-svy] If we are interested in those who are very worried or somewhat worried, we can create a new variable (`CovidWorry_bin`) that groups levels of the original question using the `fct_collapse()` function from the {forcats} package [@R-forcats]. We then use the `survey_count()` function to understand how responses are distributed across each category of the original variable (`CovidWorry`) and the new variable (`CovidWorry_bin`.) \index{Functions in srvyr!survey\_count|(} @@ -246,13 +246,15 @@ covid_worry_country_ests_gt %>% Respondents were also asked a question about how the pandemic affected education. This question was asked to households with children under the age of 13, and respondents could select more than one option, as follows: -> Did any of these children have their school education affected due to the pandemic? +> Question text: "Did any of these children have their school education affected due to the pandemic?" > > | - No, because they are not yet school age or because they do not attend school for another reason > | - No, their classes continued normally > | - Yes, they went to virtual or remote classes > | - Yes, they switched to a combination of virtual and in-person classes > | - Yes, they cut all ties with the school +> +> Source: [@lapop-svy] Working with multiple-choice questions can be both challenging and interesting. Let's walk through how to analyze this question. If we are interested in the impact on education, we should focus on the data of those whose children are attending school. This means we need to exclude those who selected the first response option: "No, because they are not yet school age or because they do not attend school for another reason." To do this, we use the `Educ_NotInSchool` variable in the dataset, which has values of `0` and `1`. A value of `1` indicates that the respondent chose the first response option (none of the children are in school), and a value of `0` means that at least one of their children is in school. By filtering the data to those with a value of `0` (they have at least one child in school), we can consider only respondents with at least one child attending school. @@ -505,7 +507,7 @@ ggplot() + ```{r} #| label: ambarom-make-maps-covid-ed -#| fig.cap: "Percentage of households who had at least one child participate in virtual or hybrid learning" +#| fig.cap: "Percentage of households by country who had at least one child participate in virtual or hybrid learning" #| fig.alt: "A choropleth map of the Western Hemisphere where the color scale filling in each country corresponds to the percentage of households who had at least one child participate in virtual or hybrid learning. The bottom of the range is 20% and the top of the range is 100%. Most of North America is missing data and are filled in with a hash pattern. The countries with data show a high percentage of households who had at least one child participate in virtual or hybrid learning." ggplot() + From 975563edef79d60bed367fa6f3e8144e485355d0 Mon Sep 17 00:00:00 2001 From: Isabella Velasquez Date: Mon, 19 Aug 2024 15:48:55 -0500 Subject: [PATCH 5/6] Edit period position --- 14-ambarom-vignette.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index efb2138..807c0a3 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -184,7 +184,7 @@ This survey was administered between March and August 2021, with the specific ti > > Source: [@lapop-svy] -If we are interested in those who are very worried or somewhat worried, we can create a new variable (`CovidWorry_bin`) that groups levels of the original question using the `fct_collapse()` function from the {forcats} package [@R-forcats]. We then use the `survey_count()` function to understand how responses are distributed across each category of the original variable (`CovidWorry`) and the new variable (`CovidWorry_bin`.) \index{Functions in srvyr!survey\_count|(} +If we are interested in those who are very worried or somewhat worried, we can create a new variable (`CovidWorry_bin`) that groups levels of the original question using the `fct_collapse()` function from the {forcats} package [@R-forcats]. We then use the `survey_count()` function to understand how responses are distributed across each category of the original variable (`CovidWorry`) and the new variable (`CovidWorry_bin`). \index{Functions in srvyr!survey\_count|(} ```{r} #| label: ambarom-worry-est1 From 035891851da6cfec83a988a17dc11b1c1b5e8b81 Mon Sep 17 00:00:00 2001 From: Isabella Velasquez Date: Mon, 19 Aug 2024 19:56:13 -0500 Subject: [PATCH 6/6] Edit question text --- 14-ambarom-vignette.Rmd | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/14-ambarom-vignette.Rmd b/14-ambarom-vignette.Rmd index 807c0a3..eb2415e 100644 --- a/14-ambarom-vignette.Rmd +++ b/14-ambarom-vignette.Rmd @@ -173,16 +173,14 @@ When calculating estimates from the data, we use the survey design object `ambar ### Example: Worry about COVID-19 -This survey was administered between March and August 2021, with the specific timing varying by country^[See table 2 in @lapop-tech for dates by country]. Given the state of the pandemic at that time, several questions about COVID-19 were included. The first question about COVID-19 asked was: +This survey was administered between March and August 2021, with the specific timing varying by country^[See table 2 in @lapop-tech for dates by country]. Given the state of the pandemic at that time, several questions about COVID-19 were included. According to the core questionnaire [@lapop-svy], the first question asked about COVID-19 was: -> Question text: "How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?" +> "How worried are you about the possibility that you or someone in your household will get sick from coronavirus in the next 3 months?" > > | - Very worried > | - Somewhat worried > | - A little worried > | - Not worried at all -> -> Source: [@lapop-svy] If we are interested in those who are very worried or somewhat worried, we can create a new variable (`CovidWorry_bin`) that groups levels of the original question using the `fct_collapse()` function from the {forcats} package [@R-forcats]. We then use the `survey_count()` function to understand how responses are distributed across each category of the original variable (`CovidWorry`) and the new variable (`CovidWorry_bin`). \index{Functions in srvyr!survey\_count|(} @@ -244,17 +242,15 @@ covid_worry_country_ests_gt %>% ### Example: Education affected by COVID-19 -Respondents were also asked a question about how the pandemic affected education. This question was asked to households with children under the age of 13, and respondents could select more than one option, as follows: +In the core questionnaire [@lapop-svy], respondents were also asked a question about how the pandemic affected education. This question was asked to households with children under the age of 13, and respondents could select more than one option, as follows: -> Question text: "Did any of these children have their school education affected due to the pandemic?" +> "Did any of these children have their school education affected due to the pandemic?" > > | - No, because they are not yet school age or because they do not attend school for another reason > | - No, their classes continued normally > | - Yes, they went to virtual or remote classes > | - Yes, they switched to a combination of virtual and in-person classes > | - Yes, they cut all ties with the school -> -> Source: [@lapop-svy] Working with multiple-choice questions can be both challenging and interesting. Let's walk through how to analyze this question. If we are interested in the impact on education, we should focus on the data of those whose children are attending school. This means we need to exclude those who selected the first response option: "No, because they are not yet school age or because they do not attend school for another reason." To do this, we use the `Educ_NotInSchool` variable in the dataset, which has values of `0` and `1`. A value of `1` indicates that the respondent chose the first response option (none of the children are in school), and a value of `0` means that at least one of their children is in school. By filtering the data to those with a value of `0` (they have at least one child in school), we can consider only respondents with at least one child attending school.