diff --git a/docs/lecs_html/06_hypothesis_testing.slides.html b/docs/lecs_html/06_hypothesis_testing.slides.html index 5ee8a25..84a4344 100644 --- a/docs/lecs_html/06_hypothesis_testing.slides.html +++ b/docs/lecs_html/06_hypothesis_testing.slides.html @@ -1,15900 +1,15764 @@ - - -
- - - - - - -library(tidyverse)
-library(infer)
-set.seed(2)
-
-data <- read_csv("https://moderndive.com/data/ageAtMar.csv")
-
-graduation <- rep_sample_n(data, size = 500, replace = FALSE) %>% ungroup() %>% select(-replicate)
-
Rows: 5534 Columns: 1 -── Column specification ──────────────────────────────────────────────────────── -Delimiter: "," -dbl (1): age - -ℹ Use `spec()` to retrieve the full column specification for this data. -ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. --
Is the deck of cards fair?
- - -Let's assume that the deck of cards is fair. What assumption are we making about the value of the population proportion of red cards in the deck ($p$)?
-Assume $p$ = 0.5
- -Suppose we find $\hat p$ = 1!
- -Suppose Apple claims that the new Macbook Pro can work for more than 20 hours without a recharge. You suspect that the battery life is less than they claim. You randomly select multiple Macbooks and measure how long they last without a recharge. You find the average time is 19 hours. Is there sufficient evidence to refute Apple's claim?
- - -In statistics, a hypothesis is a statement about a parameter. We carry out a hypothesis test to test the claim.
-Statement about the value of a population parameter whose general form is:
-$H_0$ : population parameter = specified value
- -E.g. In the Macbook example, we want to test
-$H_0: \mu = 20$ hours
-where $\mu$ is the true mean of the battery life of new Macbook Pros.
- -Statement that opposes the null hypothesis
-$H_A$ : population parameter $\ne$ specified value
-$H_A$ : population parameter $>$ specified value
-$H_A$ : population parameter $<$ specified value
- -E.g. In the Macbook example, $H_A: \mu <20$ hours (we are testing whether there is evidence to refute Apple's claim)
- -Suppose 500 randomly sampled Canadian adults recently completed a survey indicating their age of graduation from school. The ages are stored in the data set graduation. Suppose that, from past surveys, the average age of graduation is known to be 23 years. We want to know whether the average has increased in recent years.
library(tidyverse)
-library(infer)
-
-head(graduation)
-
age |
---|
<dbl> |
24 |
16 |
22 |
23 |
18 |
22 |
Do the data suggest that the average age of graduation for Canadian adults has increased in recent years?
- -A. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu \ne 23$
-B. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu < 23$
-C. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu > 23$
-where $\mu$ represents the mean age of graduation for all Canadian adults in recent years.
- -observed_sample_mean <- graduation %>% summarize(mean_age = mean(age)) %>% pull()
-observed_sample_mean
-
graduation %>% ggplot(aes(x = age)) +
- geom_histogram(binwidth = 3, color = "white") +
- theme(text = element_text(size=20))
-
set.seed(2018)
-boot_distn <- graduation %>%
- specify(response = age) %>%
- generate(type = "bootstrap", reps = 1000) %>%
- calculate(stat = "mean") %>%
- visualize() +
- geom_vline(xintercept = observed_sample_mean, color = "red", alpha=.3, lwd=2) +
- xlab("Mean Age") +
- theme(text = element_text(size=20))
-boot_distn
-
set.seed(2018)
-null_distn <- graduation %>%
- specify(response = age) %>%
- hypothesize(null = "point", mu = 23) %>%
- generate(reps = 1000) %>%
- calculate(stat = "mean")
-
-head(null_distn)
-
Setting `type = "bootstrap"` in `generate()`. - --
replicate | stat |
---|---|
<int> | <dbl> |
1 | 22.910 |
2 | 23.172 |
3 | 22.750 |
4 | 22.814 |
5 | 23.176 |
6 | 22.954 |
null_distn %>% visualize() +
- geom_vline(xintercept = observed_sample_mean, color = "red", lwd=2, alpha=.3) +
- geom_vline(xintercept = 23, color = "blue", lwd=2, alpha=.3) +
- xlab("Mean age") +
- theme(text = element_text(size=20))
-
$p$-value summarizes the evidence
-It describes how unusual the data would be if $H_0$ were true.
-The $p$-value is the probability, assuming $H_0$ is true, of observing a result at least as extreme (in the direction of the alternative hypothesis) as the one we actually observed.
-# plot the null model with shaded p-value
-null_model_plot <- null_distn %>%
- visualize() +
- shade_p_value(obs_stat = observed_sample_mean, direction = "greater") +
- geom_vline(xintercept = 23, color = "blue", lwd=2) +
- xlab("Mean age") +
- theme(text = element_text(size=20))
-null_model_plot
-
head(null_distn)
-
replicate | stat |
---|---|
<int> | <dbl> |
1 | 22.910 |
2 | 23.172 |
3 | 22.750 |
4 | 22.814 |
5 | 23.176 |
6 | 22.954 |
pvalue <- null_distn %>%
- get_pvalue(obs_stat = observed_sample_mean, direction = "greater")
-pvalue
-
-# manual check: share of null statistics at least as large as the observed mean
-mean(null_distn$stat >= observed_sample_mean)
-
p_value |
---|
<dbl> |
0.01 |
The significance level $\alpha$ is a predetermined threshold: we reject $H_0$ if the $p$-value is less than or equal to $\alpha$
-In practice, the common significance levels are $\alpha=0.01$, $0.05$ or $0.10$
-We estimate the $p$-value to be 0.01.
-Suppose $\alpha = 0.05$, then we reject the null hypothesis. -There is evidence that the true average age of graduation of Canadian adults in recent years is greater than the previously documented value of 23 years.
- -For example, for a right-tailed test with $\alpha = 0.05$:
-Suppose we have a null model below
- - -A. Type I error
-B. Type II error
-C. Neither
- -worksheet_06
If you get stuck:
-Come back at 3:06 PM and Parsa will lead you through tutorial_06
library(tidyverse)
+library(infer)
+set.seed(2)
+
+data <- read_csv("https://moderndive.com/data/ageAtMar.csv")
+data
+graduation <- rep_sample_n(data, size = 500, replace = FALSE) %>% ungroup() %>% select(-replicate)
+graduation
+
Is the deck of cards fair?
+ + +Let's assume that the deck of cards is fair. What assumption are we making about the value of the population proportion of red cards in the deck ($p$)?
+Assume $p$ = 0.5
+ +Suppose we find $\hat p$ = 1!
+ +Suppose Apple claims that the new Macbook Pro can work for more than 20 hours without a recharge. You suspect that the battery life is less than they claim. You randomly select multiple Macbooks and measure how long they last without a recharge. You find the average time is 19 hours. Is there sufficient evidence to refute Apple's claim?
+ + +In statistics, a hypothesis is a statement about a parameter. We carry out a hypothesis test to test the claim.
+Statement about the value of a population parameter whose general form is:
+$H_0$ : population parameter = specified value
+E.g. In the Macbook example, we want to test
+$H_0: \mu = 20$ hours
where $\mu$ is the true mean of the battery life of new Macbook Pros.
+ +Statement that opposes the null hypothesis
+$H_A$ : population parameter $\ne$ specified value
$H_A$ : population parameter $>$ specified value
$H_A$ : population parameter $<$ specified value
+E.g. In the Macbook example, $H_A: \mu <20$ hours (we are testing whether there is evidence to refute Apple's claim)
+ +Suppose 500 randomly sampled Canadian adults recently completed a survey indicating their age of graduation from school. The ages are stored in the data set graduation. Suppose that, from past surveys, the average age of graduation is known to be 23 years. We want to know whether the average has increased in recent years.
library(tidyverse)
+library(infer)
+
+head(graduation)
+
Do the data suggest that the average age of graduation for Canadian adults has increased in recent years?
+ +A. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu \ne 23$
+B. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu < 23$
+C. $H_0: \mu = 23 \; \;$ vs. $\; \; H_A: \mu > 23$
+where $\mu$ represents the mean age of graduation for all Canadian adults in recent years.
+ +observed_sample_mean <- graduation %>% summarize(mean_age = mean(age)) %>% pull()
+observed_sample_mean
+
graduation %>% ggplot(aes(x = age)) +
+ geom_histogram(binwidth = 3, color = "white") +
+ theme(text = element_text(size=20))
+
set.seed(2018)
+boot_distn <- graduation %>%
+ specify(response = age) %>%
+ generate(type = "bootstrap", reps = 1000) %>%
+ calculate(stat = "mean") %>%
+ visualize() +
+ geom_vline(xintercept = observed_sample_mean, color = "red", alpha=.3, lwd=2) +
+ xlab("Mean Age") +
+ theme(text = element_text(size=20))
+boot_distn
+
set.seed(2018)
+null_distn <- graduation %>%
+ specify(response = age) %>%
+ hypothesize(null = "point", mu = 23) %>%
+ generate(reps = 1000) %>%
+ calculate(stat = "mean")
+
+head(null_distn)
+
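The point-null pipeline above can be approximated by hand. The following is a minimal base-R sketch, assuming the null distribution is built by shifting the sample so that its mean equals the hypothesized value and then bootstrapping. The `ages` vector is simulated as a hypothetical stand-in for `graduation$age`, and the names `mu_0`, `shifted`, and `null_means` are illustrative; this is not presented as infer's exact implementation.

```r
set.seed(2018)

# Hypothetical stand-in for graduation$age (simulated, not the survey data)
ages <- round(rnorm(500, mean = 23.4, sd = 4.5))

mu_0 <- 23  # value of the mean under H_0

# Shift the sample so that its mean is exactly mu_0
shifted <- ages - mean(ages) + mu_0

# Bootstrap the shifted sample: resample with replacement, record each mean
null_means <- replicate(1000, mean(sample(shifted, replace = TRUE)))

# Right-tailed p-value: share of null means at least as large as the observed mean
obs_mean <- mean(ages)
p_value <- mean(null_means >= obs_mean)
p_value
```

Because the shifted sample has mean `mu_0`, the simulated means are centred near 23, which is what the null model requires.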
##### plot the null model
+null_distn %>% visualize() +
+ geom_vline(xintercept = observed_sample_mean, color = "red", lwd=2, alpha=.3) +
+ geom_vline(xintercept = 23, color = "blue", lwd=2, alpha=.3) +
+ xlab("Mean age") +
+ theme(text = element_text(size=20))
+
$p$-value summarizes the evidence
+It describes how unusual the data would be if $H_0$ were true.
+The $p$-value is the probability, assuming $H_0$ is true, of observing a result at least as extreme (in the direction of the alternative hypothesis) as the one we actually observed.
+# plot the null model with shaded p-value
+null_model_plot <- null_distn %>%
+ visualize() +
+ shade_p_value(obs_stat = observed_sample_mean, direction = "greater") +
+ geom_vline(xintercept = 23, color = "blue", lwd=2) +
+ xlab("Mean age") +
+ theme(text = element_text(size=20))
+null_model_plot
+
head(null_distn)
+
# P-value
+
+pvalue <- null_distn %>%
+ get_pvalue(obs_stat = observed_sample_mean, direction = "greater")
+pvalue
+
+# manual check: share of null statistics at least as large as the observed mean
+mean(null_distn$stat >= observed_sample_mean)
+
The significance level $\alpha$ is a predetermined threshold: we reject $H_0$ if the $p$-value is less than or equal to $\alpha$
+In practice, the common significance levels are $\alpha=0.01$, $0.05$ or $0.10$
+
We should choose the significance level ahead of time.
+We estimate the $p$-value to be 0.01.
+Suppose $\alpha = 0.05$, then we reject the null hypothesis. +There is evidence that the true average age of graduation of Canadian adults in recent years is greater than the previously documented value of 23 years.
+ +For example, for a right-tailed test with $\alpha = 0.05$:
+Suppose we have a null model below +
+ +A. Type I error
+B. Type II error
+C. Neither
+ +Power is the probability of correctly rejecting the null hypothesis $H_0$, when $H_0$ is false
+power $= P(\text{Reject } H_0 \text{ when } H_0 \text{ is false}) = 1 - \beta$, where $\beta$ is the probability of a Type II error (failing to reject $H_0$ when it is false)
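Power can be estimated by simulation: generate many samples from a population in which $H_0$ is false and record how often the test rejects. The sketch below is only an illustration under stated assumptions. The true mean (`mu_true`), standard deviation (`sigma`), and number of simulations (`n_sims`) are hypothetical, and a one-sample t-test is swapped in for the simulation-based test used in these slides to keep the code short.

```r
set.seed(2018)

alpha   <- 0.05   # significance level
n       <- 500    # sample size
mu_0    <- 23     # mean under H_0
mu_true <- 23.5   # assumed true mean (H_0 is false); hypothetical value
sigma   <- 5      # assumed population standard deviation; hypothetical value
n_sims  <- 2000   # number of simulated samples

# For each simulated sample, test H_0: mu = 23 vs H_A: mu > 23
# and record whether H_0 is rejected at level alpha
rejections <- replicate(n_sims, {
  x <- rnorm(n, mean = mu_true, sd = sigma)
  t.test(x, mu = mu_0, alternative = "greater")$p.value <= alpha
})

# Estimated power: proportion of samples in which H_0 was (correctly) rejected
power_estimate <- mean(rejections)
power_estimate
```

One minus this proportion estimates $\beta$ for the assumed true mean.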
+ +$H_0: \mu_1-\mu_2 = \Delta_0 $ vs $ H_A: \mu_1-\mu_2 < \Delta_0 \Leftarrow $ left-tailed test
+$H_0: \mu_1-\mu_2 = \Delta_0 $ vs $ H_A: \mu_1-\mu_2 > \Delta_0 \Leftarrow $ right-tailed test
+$H_0: \mu_1-\mu_2 = \Delta_0 $ vs $ H_A: \mu_1-\mu_2 \neq \Delta_0 \Leftarrow $ two-tailed test
+where $\Delta_0$ is the hypothesized value of the difference between the two population means, $\mu_1-\mu_2$.
+ +$H_0: \mu_1 = \mu_2 \;$ vs $\; H_A: \mu_1 < \mu_2$
+Then we can write the hypotheses as
+$H_0: \mu_1-\mu_2 = 0 \;$ vs $\; H_A: \mu_1-\mu_2 < 0$
+In this case, $\Delta_0 = 0$.
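A left-tailed two-sample test with $\Delta_0 = 0$ could be carried out with infer roughly as follows. This is a hedged sketch: the data frame `two_groups`, its columns `group` and `y`, and the simulated values are all hypothetical, and a permutation null (`null = "independence"`) is one common choice here rather than the only one.

```r
library(tidyverse)
library(infer)

set.seed(2018)

# Hypothetical data: two simulated groups (not real measurements)
two_groups <- tibble(
  group = rep(c("g1", "g2"), each = 100),
  y     = c(rnorm(100, mean = 3.0, sd = 0.4),
            rnorm(100, mean = 3.1, sd = 0.4))
)

# Observed statistic: mean(g1) - mean(g2), an estimate of mu_1 - mu_2
obs_diff <- two_groups %>%
  specify(y ~ group) %>%
  calculate(stat = "diff in means", order = c("g1", "g2"))

# Null model for H_0: mu_1 - mu_2 = 0, built by permuting the group labels
null_diff <- two_groups %>%
  specify(y ~ group) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("g1", "g2"))

# Left-tailed p-value, matching H_A: mu_1 - mu_2 < 0
p_value <- null_diff %>%
  get_p_value(obs_stat = obs_diff, direction = "less")
p_value
```

The `order` argument fixes the direction of the subtraction, so it should match the way $\mu_1-\mu_2$ is written in the hypotheses.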
+ +worksheet_06
If you get stuck:
+infer
package¶image source: Modern Dive by Kim & McConville
+ +A hypothesis test is like an argument by contradiction. +
+ + +