---
title: "Demographics"
output: html_document
---
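This script only displays the demographics of the participants in the data.
We load all libraries up front. plyr is loaded before the tidyverse (which attaches dplyr) so that dplyr's functions, such as `summarise()`, mask the plyr functions of the same name rather than the other way around.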
```{r}
suppressWarnings(library(plyr))
suppressWarnings(library(ggplot2))
suppressWarnings(library(ggthemr))
library(tidyverse)
ggthemr('fresh', text_size = 20)
```
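The load order matters because plyr and dplyr both export a `summarise()` function, and whichever package is attached last wins. A minimal sketch, assuming both packages are installed, of how to check which one an unqualified call resolves to and how to sidestep the question with explicit namespaces (the `toy` data frame is made up for illustration):

```r
library(plyr)
library(dplyr)

# Which package does the unqualified name resolve to?
# With plyr attached first and dplyr second, this is dplyr's version.
summarise_owner <- environmentName(environment(summarise))
summarise_owner

# Explicit namespaces make the call independent of load order.
toy <- data.frame(g = c("a", "a", "b"))  # hypothetical data
toy_counts <- dplyr::summarise(dplyr::group_by(toy, g), n = dplyr::n())
toy_counts
```

The block is shown for reference only and is not evaluated when the document is knit.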
# Reading and Formatting the Data
Load the data, dropping the first column of the CSV file.
```{r}
demo.sdat <- read_csv("z_demographics_data.csv")[,-1]
head(demo.sdat)
```
Group the participants by BMI using the standard categories: underweight (below 18.5), normal (18.5 to below 25), overweight (25 to below 30), and obese (30 and above).
```{r}
# right = FALSE closes each interval on the left, e.g. [18.5, 25),
# so a BMI of exactly 25 counts as overweight and exactly 30 as obese.
demo.sdat$bmi.group <- cut(demo.sdat$bmi,
                           breaks = c(0, 18.5, 25, 30, Inf),
                           labels = c("Underweight", "Normal", "Overweight", "Obese"),
                           right = FALSE)
```
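Values that sit exactly on a break (18.5, 25, 30) land in different groups depending on whether `cut()` closes its intervals on the right (the default) or on the left. A small sketch with made-up BMI values, not taken from the data set, comparing the two settings:

```r
# Hypothetical BMI values, chosen to include the boundary cases
bmi <- c(17, 18.5, 24.9, 25, 30, 41)
breaks <- c(0, 18.5, 25, 30, Inf)
group_labels <- c("Underweight", "Normal", "Overweight", "Obese")

# Default: intervals are (a, b], so 25 falls in (18.5, 25]
right_closed <- cut(bmi, breaks = breaks, labels = group_labels)

# right = FALSE: intervals are [a, b), so 25 falls in [25, 30)
left_closed <- cut(bmi, breaks = breaks, labels = group_labels, right = FALSE)

data.frame(bmi, right_closed, left_closed)
```

Only the three boundary values change group between the two columns; the others are classified identically.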
# Demographics
In each of the plots below, we first summarize the data. For instance, for gender, we count the participants who identified as male and those who identified as female.
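The summarizing step is a grouped count. As a sketch on made-up data (the `toy` tibble is hypothetical), the `group_by()` plus `summarise(n = n())` pattern used in this script gives the same counts as dplyr's `count()` shorthand:

```r
library(dplyr)

# Hypothetical responses, for illustration only
toy <- tibble(gender = c("Female", "Male", "Female", "Male", "Female"))

# Pattern used in this script: group, then count rows per group
by_hand <- toy %>%
  group_by(gender) %>%
  summarise(n = n())

# Equivalent shorthand
shortcut <- toy %>% count(gender)

by_hand
shortcut
```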
## Gender
The split between genders is pretty even.
```{r, fig.width=4, fig.height=4}
# Count participants per gender
sdat2 <- demo.sdat %>%
  group_by(gender) %>%
  summarise(n = n())
sdat2

# geom_col() plots the precomputed counts as bar heights
ggplot(sdat2, aes(x = gender, y = n)) +
  geom_col() +
  xlab("") + ylab("Number of Participants") +
  scale_y_continuous(expand = c(0, 0)) +
  theme(axis.line.x = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.major.y = element_line(size = 0.5, linetype = 'solid'),
        axis.line.y = element_blank(),
        panel.grid.major.x = element_blank())
```
## Age
Participants gave their ages in age bins (e.g., 18-25). We get the number of participants in each age bin.
```{r, fig.width=8, fig.height=4}
# Count participants per age bin
sdat2 <- demo.sdat %>%
  group_by(age) %>%
  summarise(n = n())
sdat2

# geom_col() plots the precomputed counts as bar heights
ggplot(sdat2, aes(x = age, y = n)) +
  geom_col() +
  xlab("Age") + ylab("Number of Participants") +
  scale_y_continuous(expand = c(0, 0)) +
  theme(axis.line.x = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.major.y = element_line(size = 0.5, linetype = 'solid'),
        axis.line.y = element_blank(),
        panel.grid.major.x = element_blank())
```
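When the age bins are stored as plain text, ggplot2 places them on the x axis in alphabetical order, which only matches age order if the labels happen to sort that way. A sketch of fixing the order with a factor, assuming hypothetical bin labels (the actual labels in the data may differ):

```r
# Hypothetical age-bin responses; the real labels may differ
age <- c("26-35", "18-25", "Under 18", "65+", "18-25")

# Alphabetical sorting would put "Under 18" last, so spell out
# the intended order explicitly as factor levels.
age_levels <- c("Under 18", "18-25", "26-35", "65+")
age_f <- factor(age, levels = age_levels)

# Counts now follow the level order rather than alphabetical order
age_counts <- table(age_f)
age_counts
```

A factor column built this way keeps its level order through `group_by()` and on the plot axis.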
## BMI
We first plot the distribution of the raw BMI values as a histogram, and then count the participants in each BMI group (underweight, normal, overweight, obese).
```{r, fig.width=8, fig.height=4}
# Histogram of raw BMI values, split into 20 bins
ggplot(demo.sdat, aes(x = bmi)) +
  geom_histogram(bins = 20) +
  xlab("BMI") + ylab("Number of Participants") +
  scale_y_continuous(expand = c(0, 0)) +
  theme(axis.line.x = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.major.y = element_line(size = 0.5, linetype = 'solid'),
        axis.line.y = element_blank(),
        panel.grid.major.x = element_blank())
```
```{r, fig.width=8, fig.height=4}
# Count participants per BMI group
sdat2 <- demo.sdat %>%
  group_by(bmi.group) %>%
  summarise(n = n())
sdat2

# The counts are already computed, so use geom_col() rather than a histogram
ggplot(sdat2, aes(x = bmi.group, y = n)) +
  geom_col() +
  xlab("BMI Group") + ylab("Number of Participants") +
  scale_y_continuous(expand = c(0, 0)) +
  theme(axis.line.x = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.major.y = element_line(size = 0.5, linetype = 'solid'),
        axis.line.y = element_blank(),
        panel.grid.major.x = element_blank())
```
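Every plot in this script repeats the same `theme()` adjustments. One possible way to cut the repetition, sketched here with a hypothetical object name `demo_theme` and made-up data, is to store the theme once and add it to each plot:

```r
library(ggplot2)

# Hypothetical name; bundles the theme settings repeated in each chunk.
# Newer ggplot2 releases (3.4.0+) prefer `linewidth` over `size` here.
demo_theme <- theme(axis.line.x = element_blank(),
                    axis.ticks = element_blank(),
                    panel.grid.major.y = element_line(size = 0.5, linetype = "solid"),
                    axis.line.y = element_blank(),
                    panel.grid.major.x = element_blank())

# Made-up counts, for illustration only
toy <- data.frame(group = c("A", "B"), n = c(3, 5))

p <- ggplot(toy, aes(x = group, y = n)) +
  geom_col() +
  ylab("Number of Participants") +
  scale_y_continuous(expand = c(0, 0)) +
  demo_theme
p
```

Changing a setting in `demo_theme` would then update all plots that use it.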