forked from fri-datascience/course_pou
-
Notifications
You must be signed in to change notification settings - Fork 0
/
18-distributions_intuition.Rmd
667 lines (514 loc) · 26.7 KB
/
18-distributions_intuition.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
# Distributions intutition
This chapter is intended to help you familiarize yourself with the different
probability distributions you will encounter in this course.
You will need to use [Appendix B](#distributions) extensively as a reference for the basic properties of distributions, so keep it close!
```{r, echo = FALSE}
togs <- T
tog_ex <- T
```
<style>
.fold-btn {
float: right;
margin: 5px 5px 0 0;
}
.fold {
border: 1px solid black;
min-height: 40px;
}
</style>
<script type="text/javascript">
$(document).ready(function() {
$folds = $(".fold");
$folds.wrapInner("<div class=\"fold-blck\">"); // wrap a div container around content
$folds.prepend("<button class=\"fold-btn\">Unfold Solution</button>"); // add a button
$(".fold-blck").toggle(); // fold all blocks
$(".fold-btn").on("click", function() { // add onClick event
$(this).text($(this).text() === "Fold Solution" ? "Unfold Solution" : "Fold Solution"); // if the text equals "Fold", change it to "Unfold"or else to "Fold"
$(this).next(".fold-blck").toggle("linear"); // "swing" is the default easing function. This can be further customized in its speed or the overall animation itself.
})
});
</script>
## Discrete distributions
```{exercise, name = "Bernoulli intuition 1"}
The simplest distribution you will encounter is the Bernoulli distribution.
It is a discrete probability distribution used to represent the outcome of a yes/no
question. It has one parameter $0 \leq p \leq 1$, which is the probability of success. The
probability of failure is $q = (1-p)$.
A classic way to think about a Bernoulli trial (a yes/no experiment) is a coin
flip. Real coins are fair, meaning the probability of either heads (1)
or tails (0) are the same, so $p=0.5$ as shown below in *figure a*. Alternatively
we may want to represent a process that doesn't have equal probabilities of outcomes
like "Will a throw of a fair die result in a 6?". In this case $p=\frac{1}{6}$,
shown in *figure b*.
Using your knowledge of the Bernoulli distribution use the throw of a fair die
to think of events, such that:
a) $p = 0.5$
b) $p = \frac{5}{6}$
c) $q = \frac{2}{3}$
```
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(ggplot2)
# Create data
df <- data.frame(
outcome = rep(c(0, 1), 2),
probability = c(1-0.5, 0.5, 1-1/6, 1/6),
p_value = factor(rep(c("a", "b"), each=2))
)
# Plot
ggplot(df, aes(x=factor(outcome), y=probability)) +
geom_segment(aes(xend=factor(outcome), yend=0), color='black') +
geom_point(color="red", size=3) +
coord_cartesian(ylim = c(0, 1)) +
labs(y = "Probability", x = "Outcome") +
theme(legend.position="none") +
facet_wrap(~p_value, ncol=2, scales = "free_x", labeller = label_parsed)
```
<div class="fold">
```{solution, echo = togs}
a. An event that is equally likely to happen or not happen i.e. $p = 0.5$ would be
throwing an even number. More formally we can name this event $A$ and write:
$A = \{2,4,6\}$, its probability being $P(A) = 0.5$
b. An example of an event with $p = \frac{5}{6}$ would be throwing a number
greater than 1. Defined as $B = \{2,3,4,5,6\}$.
c. We need an event that fails $\frac{2}{3}$ of the time. Alternatively we can
reverse the problem and find an event that succeeds $\frac{1}{3}$ of the time,
since: $q = 1 - p \implies p = 1 - q = \frac{1}{3}$. The event that our outcome
is divisible by 3: $C = \{3, 6\}$ satisfies this condition.
```
</div>
```{exercise, name = "Binomial intuition 1"}
The binomial distribution is a generalization of the Bernoulli distribution.
Instead of considering a single Bernoulli trial, we now consider a sum of a sequence of $n$ trials,
which are independent and have the same parameter $p$. So the binomial distribution
has two parameters $n$ - the number of trials and $p$ - the probability of success
for each trial.
If we return to our coin flip representation, we now flip a coin several times.
The binomial distribution will give us the probabilities of all possible outcomes.
Below we show the distribution for a series of 10 coin flips with a fair coin
(left) and a biased coin (right). The numbers on the x axis represent the
number of times the coin landed heads.
Using your knowledge of the binomial distribution:
a. Take the [pmf of the binomial distribution](#distributions) and plug in $n=1$,
check that it is in fact equivalent to a Bernoulli distribution.
b. In our examples we show the graph of a binomial distribution over 10 trials with
$p=0.8$. If we take a look at the graph, it appears as though the probabilities of getting 0,1, 2 or 3
heads in 10 flips are zero. Is it actually zero? Check by plugging in the values
into the pmf.
```
<div class="fold">
```{solution, echo = togs}
a. The pmf of a binomial distribution is $\binom{n}{k} p^k (1 - p)^{n - k}$, now
we insert $n=1$ to get:
$$\binom{1}{k} p^k (1 - p)^{1 - k}$$
Not quite equivalent to
a Bernoulli, however note that the support of the binomial distribution is
defined as $k \in \{0,1,\dots,n\}$, so in our case $k = \{0,1\}$, then:
$$\binom{1}{0} = \binom{1}{1} = 1$$
we get: $p^k (1 - p)^{1 - k}$ ,the Bernoulli distribution.
b. As we already know $p=0.8, n=10$, so:
$$\binom{10}{0} 0.8^0 (1 - 0.8)^{10 - 0} = 1.024 \cdot 10^{-7}$$
$$\binom{10}{1} 0.8^1 (1 - 0.8)^{10 - 1} = 4.096 \cdot 10^{-6}$$
$$\binom{10}{2} 0.8^2 (1 - 0.8)^{10 - 2} = 7.3728 \cdot 10^{-5}$$
$$\binom{10}{3} 0.8^3 (1 - 0.8)^{10 - 3} = 7.86432\cdot 10^{-4}$$
So the probabilities are not zero, just very small.
```
</div>
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(ggplot2)
# Data for binomial distribution
df1 <- data.frame(
outcome = 0:10,
probability = dbinom(0:10, size=10, prob=0.5),
p_value = "p = 0.5"
)
df2 <- data.frame(
outcome = 0:10,
probability = dbinom(0:10, size=10, prob=0.8),
p_value = "p = 0.8"
)
df <- rbind(df1, df2)
ggplot(df, aes(x=factor(outcome), y=probability)) +
geom_segment(aes(xend=factor(outcome), yend=0), color='black') +
geom_point(color="red", size=2) +
labs(y = "Probability", x = "Outcome") +
theme(legend.position="none") +
facet_wrap(~p_value, ncol=2, scales = "free_x")
```
```{exercise, name = "Poisson intuition 1"}
Below are shown 3 different graphs of the Poisson distribution. Your task
is to replicate them on your own in R by varying the $\lambda$ parameter.
Hint: You can use dpois() to get the probabilities.
```
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(ggplot2)
library(gridExtra)
x = 0:15
# Create Poisson data
data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1))
data2 <- data.frame(x = x, y = dpois(x, lambda = 1))
data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5))
# Create individual ggplot objects
plot1 <- ggplot(data1, aes(x, y)) + geom_col() +
xlab("x") + ylab("Probability") + ylim(0,1)
plot2 <- ggplot(data2, aes(x, y)) + geom_col() +
xlab("x") + ylab(NULL) + ylim(0,1)
plot3 <- ggplot(data3, aes(x, y)) + geom_col() +
xlab("x") + ylab(NULL) + ylim(0,1)
# Combine the plots
grid.arrange(plot1, plot2, plot3, ncol = 3)
```
<div class="fold">
```{r, echo = tog_ex, eval=FALSE}
library(ggplot2)
library(gridExtra)
x = 0:15
# Create Poisson data
data1 <- data.frame(x = x, y = dpois(x, lambda = 0.1))
data2 <- data.frame(x = x, y = dpois(x, lambda = 1))
data3 <- data.frame(x = x, y = dpois(x, lambda = 7.5))
# Create individual ggplot objects
plot1 <- ggplot(data1, aes(x, y)) + geom_col() +
xlab("x") + ylab("Probability") + ylim(0,1)
plot2 <- ggplot(data2, aes(x, y)) + geom_col() +
xlab("x") + ylab(NULL) + ylim(0,1)
plot3 <- ggplot(data3, aes(x, y)) + geom_col() +
xlab("x") + ylab(NULL) + ylim(0,1)
# Combine the plots
grid.arrange(plot1, plot2, plot3, ncol = 3)
```
</div>
```{exercise, name = "Poisson intuition 2"}
The Poisson distribution is a discrete probability distribution that models the
probability of a given number of events occuring within processes where events
occur at a constant mean rate and independently of each other - a **Poisson process**.
It has a single parameter $\lambda$, which represents the constant mean rate.
A classic example of a scenario that can be modeled using the Poisson distribution
is the number of calls received at a call center in a day (or in fact any other
time interval).
Suppose you work in a call center and have some understanding of probability
distributions. You overhear your supervisor mentioning that the call center
receives an average of 2.5 calls per day. Using your knowledge of the Poisson
distribution, calculate:
a. The probability you will get no calls today.
b. The probability you will get more than 5 calls today.
```
<div class="fold">
```{solution, echo = togs}
First recall the Poisson pmf: $$p(k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
as stated previously our parameter $\lambda = 2.5$
a. To get the probability of no calls we simply plug in $k = 0$, so: $$p(0) = \frac{2.5^0 e^{-2.5}}{0!} = e^{-2.5} \approx 0.082$$
b. The support of the Poisson distribution is non-negative integers. So if we wanted
to calculate the probability of getting more than 5 calls we would need to add up
the probabilities of getting 6 calls and 7 calls and so on up to infinity.
Let us instead remember that the sum of all probabilties will be 1, we will
reverse the problem and instead ask "What is the probability we get 5 calls or less?".
We can subtract the probability of the opposite outcome (the complement) from 1
to get the probability of our original question.
$$P(k > 5) = 1 - P(k \leq 5)$$
$$P(k \leq 5) = \sum_{i=0}^{5} p(i) = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) =$$
$$= \frac{2.5^0 e^{-2.5}}{0!} + \frac{2.5^1 e^{-2.5}}{1!} + \dots =$$
$$=0.957979$$
So the probability of geting more than 5 calls will be $1 - 0.957979 = 0.042021$
```
</div>
```{exercise, name = "Geometric intuition 1"}
The geometric distribution is a discrete distribution that models the **number of
failures** before the first success in a sequence of independent Bernoulli trials.
It has a single parameter $p$, representing the probability of success and its
support is all non-negative integers $\{0,1,2,\dots\}$.
NOTE: There is an alternative way to think about this distribution, one that models
the **number of trials** before the first success. The
difference is subtle yet significant and you are likely to encounter both forms.
The key to telling them apart is to check their support, since the number of trials
has to be at least $1$, for this case we have $\{1,2,\dots\}$.
In the graph below we show the pmf of a geometric distribution with $p=0.5$. This
can be thought of as the number of successive failures (tails) in the flip of a fair coin.
You can see that there's a 50% chance you will have zero failures i.e. you will
flip a heads on your very first attempt. But there is some smaller chance that you
will flip a sequence of tails in a row, with longer sequences having ever lower
probability.
a) Create an equivalent graph that represents the probability of rolling a 6 with a fair 6-sided die.
b) Use the formula for the [mean](#distributions) of the geometric distribution and determine the average number of **failures** before you roll a 6.
c) Look up the alternative form of the geometric distribtuion and again use the formula for the mean to determine the average number of **trials** up to and including rolling a 6.
```
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(ggplot2)
# Parameters
p <- 0.5
x_vals <- 0:9 # Starting from 0
probs <- dgeom(x_vals, p)
# Data
data <- data.frame(x_vals, probs)
# Plot
ggplot(data, aes(x=x_vals, y=probs)) +
geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) +
geom_point(color="red", size=2) +
labs(x = "Number of trials", y = "Probability") +
theme_minimal() +
scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels
```
<div class="fold">
```{solution, echo = togs}
Parameter p (the probability of success) for rolling a 6 is $p=\frac{1}{6}$.
a)
```
```{r, echo=TRUE, warning=FALSE, message=FALSE}
library(ggplot2)
# Parameters
p <- 1/6
x_vals <- 0:9 # Starting from 0
probs <- dgeom(x_vals, p)
# Data
data <- data.frame(x_vals, probs)
# Plot
ggplot(data, aes(x=x_vals, y=probs)) +
geom_segment(aes(xend=x_vals, yend=0), color="black", size=1) +
geom_point(color="red", size=2) +
labs(x = "Number of trials", y = "Probability") +
theme_minimal() +
scale_x_continuous(breaks = x_vals) # This line ensures integer x-axis labels
```
```{solution, echo = togs}
b) The expected value of a random variable (the mean) is denoted as $E[X]$.
$$E[X] = \frac{1-p}{p}= \frac{1- \frac{1}{6}}{\frac{1}{6}} = \frac{5}{6}\cdot 6 = 5$$
On average we will fail 5 times before we roll our first 6.
c) The alternative form of this distribution (with support on all positive integers) has a slightly different formula for the mean. This change reflects the difference in the way we posed our question:
$$E[X] = \frac{1}{p} = \frac{1}{\frac{1}{6}} = 6$$
On average we will have to throw the die 6 times before we roll a 6.
```
</div>
## Continuous distributions
```{exercise, name = "Uniform intuition 1"}
The need for a randomness is a common problem. A practical solution are so-called random number generators (RNGs). The simplest RNG one would think of is choosing a set of numbers and having the generator return a number at random, where the probability of returning any number from this set is the same. If this set is an interval of real numbers, then we've basically described the continuous uniform distribution.
It has two parameters $a$ and $b$, which define the beginning and end of its support respectively.
a) Let's think about the mean intuitively. Think of the area under the graph as a geometric shape. The expected value or mean of a distribution is the x-axis value of its center of mass. Given parameters $a$ and $b$ what is your intuitive guess of the mean for the uniform distribution?
b) A special case of the uniform distribution is the **standard uniform distribution** with $a=0$ and $b=1$. Write the pdf $f(x)$ of this particular distribution.
```
```{r, fig.width=5, fig.height=3, echo=FALSE, warning=FALSE, message=FALSE}
# Load required libraries
library(ggplot2)
# Create data for a uniform distribution
df <- data.frame(x = c("a", "b"), y = c(1, 1))
# Plot
p <- ggplot(df, aes(x, y)) +
geom_line(aes(group = 1), color = "blue") +
geom_point(size = 2) +
geom_segment(aes(x = "a", xend = "a", y = 0, yend = 1), linetype = "dotted") +
geom_segment(aes(x = "b", xend = "b", y = 0, yend = 1), linetype = "dotted") +
labs(x = "Value", y = "Density", title = "Uniform Distribution") +
theme_minimal() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank()) +
annotate("text", x = "a", y = 1, label = "1/(b-a)", hjust = 1.2, vjust = 0.5) +
annotate("text", x = "a", y = 0, label = "0", hjust = 1.5, vjust = 0.5)
print(p)
```
<div class="fold">
```{solution, echo = togs}
a. The center of mass is the center of the square from $a$ to $b$ and from 0 to $\frac{1}{b-a}$. Its value on the x-axis is the midpoint between $a$ and $b$, so $\frac{a+b}{2}$
b. Inserting the parameter values we get:$$f(x) =
\begin{cases}
1 & \text{if } 0 \leq x \leq 1 \\
0 & \text{otherwise}
\end{cases}
$$
Notice how the pdf is just a constant $1$ across all values of $x \in [0,1]$. Here it is important to distinguish between probability and **probability density**. The density may be 1, but the probability is not and while discrete distributions never exceed 1 on the y-axis, continuous distributions can go as high as you like.
```
</div>
```{exercise, name = "Normal intuition 1"}
The normal distribution, also known as the Gaussian distribution, is a continuous distribution that encompasses the entire real number line. It has two parameters: the mean, denoted by $\mu$, and the variance, represented by $\sigma^2$. Its shape resembles the iconic bell curve. The position of its peak is determined by the parameter $\mu$, while the variance determines the spread or width of the curve. A smaller variance results in a sharper, narrower peak, while a larger variance leads to a broader, more spread-out curve.
Below, we graph the distribution of IQ scores for two different populations.
We aim to identify individuals with an IQ at or above 140 for an experiment. We can identify them reliably; however, we only have time to examine one of the two groups. Which group should we investigate to have the best chance of finding such individuals?
NOTE: The graph below displays the parameter $\sigma$, which is the square root of the variance, more commonly referred to as the **standard deviation**. Keep this in mind when solving the problems.
a) Insert the values of either population into the pdf of a normal distribution and determine which one has a higher density at $x=140$.
b) Generate the graph yourself and zoom into the relevant area to graphically verify your answer.
To determine probability density, we can use the pdf. However, if we wish to know the proportion of the population that falls within certain parameters, we would need to integrate the pdf. Fortunately, the integrals of common distributions are well-established. This integral gives us the cumulative distribution function $F(x)$ (CDF).
c) BONUS: Look up the CDF of the normal distribution and input the appropriate values to determine the percentage of each population that comprises individuals with an IQ of 140 or higher.
```
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(ggplot2)
library(tidyr)
# Create data
x <- seq(70, 135, by = 0.01) # Adjusting the x range to account for the larger standard deviations
df <- data.frame(x = x)
# Define the IQ distributions
df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10)
df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8)
# Convert from wide to long format for ggplot2
df_long <- gather(df, distribution, density, -x)
# Ensure the levels of the 'distribution' factor match our desired order
df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8"))
# Plot
ggplot(df_long, aes(x = x, y = density, color = distribution)) +
geom_line() +
labs(x = "IQ Score", y = "Density") +
scale_color_manual(
name = "IQ Distribution",
values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"),
labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)")
) +
theme_minimal()
```
<div class="fold">
```{solution, echo = togs}
a. Group 1: $\mu = 100, \sigma=10 \rightarrow \sigma^2 = 100$
$$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} =
\frac{1}{\sqrt{2 \pi 100}} e^{-\frac{(140 - 100)^2}{2 \cdot 100}} \approx 1.34e-05$$
Group 2: $\mu = 105, \sigma=8 \rightarrow \sigma^2 = 64$
$$\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}} =
\frac{1}{\sqrt{2 \pi 64}} e^{-\frac{(140 - 105)^2}{2 \cdot 64}} \approx 3.48e-06$$
So despite the fact that group 1 has a lower average IQ, we are more likely to find 140 IQ individuals in this group.
```
b.
```{r, echo=togs}
library(ggplot2)
library(tidyr)
# Create data
x <- seq(135, 145, by = 0.01) # Adjusting the x range to account for the larger standard deviations
df <- data.frame(x = x)
# Define the IQ distributions
df$IQ_mu100_sd10 <- dnorm(df$x, mean = 100, sd = 10)
df$IQ_mu105_sd8 <- dnorm(df$x, mean = 105, sd = 8)
# Convert from wide to long format for ggplot2
df_long <- gather(df, distribution, density, -x)
# Ensure the levels of the 'distribution' factor match our desired order
df_long$distribution <- factor(df_long$distribution, levels = c("IQ_mu100_sd10", "IQ_mu105_sd8"))
# Plot
ggplot(df_long, aes(x = x, y = density, color = distribution)) +
geom_line() +
labs(x = "IQ Score", y = "Density") +
scale_color_manual(
name = "IQ Distribution",
values = c(IQ_mu100_sd10 = "red", IQ_mu105_sd8 = "blue"),
labels = c("Group 1 (µ=100, σ=10)", "Group 2 (µ=105, σ=8)")
) +
theme_minimal()
```
```{solution, echo = togs}
c. The CDF of the normal distribution is $\Phi(x) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{x - \mu}{\sigma \sqrt{2}} \right) \right]$. The CDF is defined as the integral of the distribution density up to x. So to get the total percentage of individuals with IQ at 140 or higher we will need to subtract the value from 1.
Group 1: $$1 - \Phi(140) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{140 - 100}{10 \sqrt{2}} \right) \right] \approx 3.17e-05 $$
Group 2 : $$1 - \Phi(140) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{140 - 105}{8 \sqrt{2}} \right) \right] \approx 6.07e-06 $$
So roughly 0.003% and 0.0006% of individuals in groups 1 and 2 respectively have an IQ at or above 140.
```
</div>
```{exercise, name = "Beta intuition 1"}
The beta distribution is a continuous distribution defined on the unit interval $[0,1]$. It has two strictly positive paramters $\alpha$ and $\beta$, which determine its shape. Its support makes it especially suitable to model distribtuions of percentages and proportions.
Below you've been provided with some code that you can copy into Rstudio. Once you run the code, an interactive Shiny app will appear and you will be able to manipulate the graph of the beta distribution.
Play around with the parameters to get:
a) A symmetric bell curve
b) A bowl-shaped curve
c) The standard uniform distribution is actually a special case of the beta distribution. Find the exact parameters $\alpha$ and $\beta$. Once you do, prove the equality by inserting the values into our pdf.
*Hint*: The beta function is evaluated as $\text{B}(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$,
the gamma function for **positive integers** $n$ is evaluated as $\Gamma(n)= (n-1)!$
```
```
# Install and load necessary packages
install.packages(c("shiny", "ggplot2"))
library(shiny)
library(ggplot2)
# The Shiny App
ui <- fluidPage(
titlePanel("Beta Distribution Viewer"),
sidebarLayout(
sidebarPanel(
sliderInput("alpha", "Alpha:", min = 0.1, max = 10, value = 2, step = 0.1),
sliderInput("beta", "Beta:", min = 0.1, max = 10, value = 2, step = 0.1)
),
mainPanel(
plotOutput("betaPlot")
)
)
)
server <- function(input, output) {
output$betaPlot <- renderPlot({
x <- seq(0, 1, by = 0.01)
y <- dbeta(x, shape1 = input$alpha, shape2 = input$beta)
ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
geom_line() +
labs(x = "Value", y = "Density") +
theme_minimal()
})
}
shinyApp(ui = ui, server = server)
```
<div class="fold">
```{solution, echo = togs}
a) Possible solution $\alpha = \beta= 5$
b) Possible solution $\alpha = \beta= 0.5$
c) The correct parameters are $\alpha = 1, \beta=1$, to prove the equality we insert them into the beta pdf:
$$\frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{\text{B}(\alpha, \beta)} =
\frac{x^{1 - 1} (1 - x)^{1 - 1}}{\text{B}(1, 1)} =
\frac{1}{\frac{\Gamma(1)\Gamma(1)}{\Gamma(1+1)}}=
\frac{1}{\frac{(1-1)!(1-1)!}{(2-1)!}} = 1$$
```
</div>
```{exercise, name = "Exponential intuition 1"}
The exponential distribution represents the distributon of **time between events** in a Poisson process. It is the continuous analogue of the geometric distribution.
It has a single parameter $\lambda$, which is strictly positive and represents the *constant rate* of the corresponding Poisson process. The support is all positive reals, since time between events is non-negative, but not bound upwards.
Let's revisit the call center from our Poisson problem. We get 2.5 calls per day on average, this is our rate parameter $\lambda$. A work day is 8 hours.
a) What is the mean time between phone calls?
The cdf $F(x)$ tells us what percentage of calls occur within x amount of time of each other.
b) You want to take an hour long lunch break but are worried about missing calls. Calculate the probability of missing at least one call if you're gone for an hour. Hint: The cdf is $F(x) = \int_{-\infty}^{x} f(x) dx$
```
<div class="fold">
```{solution, echo = togs}
a. Taking $\lambda = \frac{2.5 \text{ calls}}{8 \text{ hours}} = \frac{1 \text{ call}}{3.2 \text{ hours}}$
$$E[X] = \frac{1}{\lambda} = \frac{3.2 \text{ hours}}{\text{call}}$$
b. First we derive the CDF, we can integrate from 0 instead of $-\infty$, since we have no support in the negatives:
\begin{align}
F(x) &= \int_{0}^{x} \lambda e^{-\lambda t} dt \\
&= \lambda \int_{0}^{x} e^{-\lambda t} dt \\
&= \lambda (\frac{1}{-\lambda}e^{-\lambda t} |_{0}^{x}) \\
&= \lambda(\frac{1}{\lambda} - \frac{1}{\lambda} e^{-\lambda x}) \\
&= 1 - e^{-\lambda x}.
\end{align}
Then we just evaluate it for a time of 1 hour:
$$F(1 \text{ hour}) = 1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}} \cdot 1 \text{ hour}}=
1 - e^{-\frac{1 \text{ call}}{3.2 \text{ hours}}} \approx 0.268$$
So we have about a 27% chance of missing at least one call if we're gone for an hour.
```
</div>
```{exercise, name = "Gamma intuition 1"}
The gamma distribution is a continuous distribution with by two parameters, $\alpha$ and $\beta$, both greater than 0. These parameters afford the distribution a broad range of shapes, leading to it being commonly referred to as a *family of distributions*. Given its support over the positive real numbers, it is well suited for modeling a diverse range of positive-valued phenomena.
a) The exponential distribution is actually just a particular form of the gamma distribution. What are the values of $\alpha$ and $\beta$?
b) Copy the code from our beta distribution Shiny app and modify it to simulate the gamma distribution. Then get it to show the exponential.
```
<div class="fold">
```{solution, echo = togs}
a. Let's start by taking a look at the pdfs of the two distributions side by side:
$$\frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1}e^{-\beta x} = \lambda e^{-\lambda x}$$
The $x^{\alpha - 1}$ term is not found anywhere in the pdf of the exponential so we need to eliminate it by setting $\alpha = 1$. This also makes the fraction evaluate to $\frac{\beta^1}{\Gamma(1)} = \beta$, which leaves us with $$\beta \cdot e^{-\beta x}$$
Now we can see that $\beta = \lambda$ and $\alpha = 1$.
b.
```
```{r, echo = tog_ex, eval=FALSE}
# Install and load necessary packages
install.packages(c("shiny", "ggplot2"))
library(shiny)
library(ggplot2)
# The Shiny App
ui <- fluidPage(
titlePanel("Gamma Distribution Viewer"),
sidebarLayout(
sidebarPanel(
sliderInput("shape", "Shape (α):", min = 0.1, max = 10, value = 2, step = 0.1),
sliderInput("scale", "Scale (β):", min = 0.1, max = 10, value = 2, step = 0.1)
),
mainPanel(
plotOutput("gammaPlot")
)
)
)
server <- function(input, output) {
output$gammaPlot <- renderPlot({
x <- seq(0, 25, by = 0.1)
y <- dgamma(x, shape = input$shape, scale = input$scale)
ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
geom_line() +
labs(x = "Value", y = "Density") +
theme_minimal()
})
}
shinyApp(ui = ui, server = server)
```
</div>