# Conditional probability {#condprob}
This chapter deals with conditional probability.
The students are expected to acquire the following knowledge:
**Theoretical**
- Identify whether variables are independent.
- Calculate conditional probabilities.
- Understand conditional dependence and independence.
- Apply Bayes' theorem to solve difficult probabilistic questions.
**R**
- Simulating conditional probabilities.
- _cumsum_.
- _apply_.
<style>
.fold-btn {
float: right;
margin: 5px 5px 0 0;
}
.fold {
border: 1px solid black;
min-height: 40px;
}
</style>
<script type="text/javascript">
$(document).ready(function() {
$folds = $(".fold");
$folds.wrapInner("<div class=\"fold-blck\">"); // wrap a div container around content
$folds.prepend("<button class=\"fold-btn\">Unfold</button>"); // add a button
$(".fold-blck").toggle(); // fold all blocks
$(".fold-btn").on("click", function() { // add onClick event
$(this).text($(this).text() === "Fold" ? "Unfold" : "Fold"); // toggle the button text between "Fold" and "Unfold"
$(this).next(".fold-blck").toggle("linear"); // toggle the folded content with a linear easing animation
})
});
</script>
```{r, echo = FALSE, warning = FALSE}
togs <- TRUE
library(ggplot2)
# togs <- FALSE
```
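Two base R functions do much of the bookkeeping in the simulation solutions
below: _cumsum_, which computes running totals, and _apply_, which applies a
function over the rows or columns of a matrix. A minimal reminder sketch:
```{r, echo = togs, eval = togs}
x <- c(1, 0, 1, 1)
cumsum(x)                   # running totals: 1 1 2 3
M <- matrix(1:6, nrow = 2)  # a 2 x 3 matrix, filled column-wise
apply(M, 1, sum)            # row sums: 9 12
```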
## Calculating conditional probabilities
```{exercise}
A military officer is in charge of identifying enemy aircraft and shooting them
down. He is able to positively identify an enemy airplane 95% of the time
and positively identify a friendly airplane 90% of the time. Furthermore, 99%
of the airplanes are friendly. When the officer identifies an airplane as an
enemy airplane, what is the probability that it is in fact friendly, so that
he would be shooting at a friendly airplane?
```
<div class="fold">
```{solution, echo = togs}
Let $E = 0$ denote that the observed plane is friendly and $E=1$ that it
is an enemy. Let $I = 0$ denote that the officer identified it as friendly and
$I = 1$ as enemy. Then
\begin{align}
P(E = 0 | I = 1) &= \frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1)} \\
&= \frac{P(I = 1 | E = 0)P(E = 0)}{P(I = 1 | E = 0)P(E = 0) +
P(I = 1 | E = 1)P(E = 1)} \\
&= \frac{0.1 \times 0.99}{0.1 \times 0.99 +
0.95 \times 0.01} \\
&\approx 0.91.
\end{align}
```
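Although the exercise does not ask for it, we can add a quick numeric check of
this result in R:
```{r, echo = togs, eval = togs}
# Direct evaluation of Bayes' theorem for the officer example
p_friendly <- 0.99              # P(E = 0)
p_id_enemy_if_friendly <- 0.10  # P(I = 1 | E = 0)
p_id_enemy_if_enemy    <- 0.95  # P(I = 1 | E = 1)
p_id_enemy_if_friendly * p_friendly /
  (p_id_enemy_if_friendly * p_friendly + p_id_enemy_if_enemy * (1 - p_friendly))
```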
</div>
```{exercise}
<span style="color:blue">R: Consider tossing a fair die. Let $A = \{2,4,6\}$
and $B = \{1,2,3,4\}$. Then $P(A) = \frac{1}{2}$, $P(B) = \frac{2}{3}$ and
$P(AB) = \frac{1}{3}$. Since $P(AB) = P(A)P(B)$, the events $A$ and $B$ are
independent. Simulate draws from the sample space and verify that the
proportions are the same. Then find two events $C$ and $D$ that are not
independent and repeat the simulation.</span>
```
<div class="fold">
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 10000
tosses <- sample(1:6, nsamps, replace = TRUE)
PA <- sum(tosses %in% c(2,4,6)) / nsamps    # empirical P(A)
PB <- sum(tosses %in% c(1,2,3,4)) / nsamps  # empirical P(B)
PA * PB                                     # P(A)P(B)
sum(tosses %in% c(2,4)) / nsamps            # empirical P(AB); AB = {2,4}
# Let C = {1,2} and D = {2,3}; then P(C)P(D) = 1/9 but P(CD) = P({2}) = 1/6
PC <- sum(tosses %in% c(1,2)) / nsamps      # empirical P(C)
PD <- sum(tosses %in% c(2,3)) / nsamps      # empirical P(D)
PC * PD                                     # P(C)P(D)
sum(tosses %in% c(2)) / nsamps              # empirical P(CD)
```
</div>
```{exercise}
A machine reports the true value of a thrown 12-sided die 5 out of 6 times.
a. If the machine reports a 1 has been tossed, what is the probability that
it is actually a 1?
b. Now let the machine only report whether a 1 has been
tossed or not. Does the probability change?
c. <span style="color:blue">R: Use simulation to check your answers
to a) and b). </span>
```
<div class="fold">
```{solution, echo = togs}
a. Let $T = 1$ denote that the toss is 1 and $M = 1$ that the machine reports a 1.
\begin{align}
P(T = 1 | M = 1) &= \frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\
&= \frac{P(M = 1 | T = 1)P(T = 1)}{\sum_{k=1}^{12}
P(M = 1 | T = k)P(T = k)} \\
&= \frac{\frac{5}{6}\frac{1}{12}}{\frac{5}{6}\frac{1}{12} + 11 \frac{1}{6} \frac{1}{11} \frac{1}{12}} \\
&= \frac{5}{6}.
\end{align}
Here we use the fact that when the toss is not 1, the machine errs with
probability $\frac{1}{6}$ and, given that it errs, reports each of the other
11 values uniformly at random, so $P(M = 1 | T = k) = \frac{1}{6} \frac{1}{11}$
for $k \neq 1$.
b. Yes.
\begin{align}
P(T = 1 | M = 1) &= \frac{P(M = 1 | T = 1)P(T = 1)}{P(M = 1)} \\
&= \frac{P(M = 1 | T = 1)P(T = 1)}{\sum_{k=1}^{12}
P(M = 1 | T = k)P(T = k)} \\
&= \frac{\frac{5}{6}\frac{1}{12}}{\frac{5}{6}\frac{1}{12} + 11 \frac{1}{6} \frac{1}{12}} \\
&= \frac{5}{16}.
\end{align}
Now an erroneous report on a toss $k \neq 1$ always claims that a 1 was
tossed, so $P(M = 1 | T = k) = \frac{1}{6}$ for $k \neq 1$, which makes the
denominator larger and the probability smaller.
```
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 10000
report_a <- vector(mode = "numeric", length = nsamps)
report_b <- vector(mode = "logical", length = nsamps)
truths   <- vector(mode = "logical", length = nsamps)
for (i in 1:nsamps) {
  toss  <- sample(1:12, size = 1)
  truth <- sample(c(TRUE, FALSE), size = 1, prob = c(5/6, 1/6))  # is the machine truthful this time?
  truths[i] <- truth
  if (truth) {
    report_a[i] <- toss
    report_b[i] <- toss == 1
  } else {
    remaining   <- (1:12)[1:12 != toss]
    report_a[i] <- sample(remaining, size = 1)  # report a wrong value uniformly
    report_b[i] <- toss != 1                    # lie about whether a 1 was tossed
  }
}
truth_a1 <- truths[report_a == 1]
sum(truth_a1) / length(truth_a1)  # estimate for a), approx. 5/6
truth_b1 <- truths[report_b]
sum(truth_b1) / length(truth_b1)  # estimate for b), approx. 5/16
```
</div>
```{exercise}
A coin is tossed independently $n$ times. The probability of heads at each
toss is $p$. At each time $k$ $(k = 2,3,...,n)$ we get a reward at time $k+1$
if the $k$-th toss was a head and the previous toss was a tail. Let $A_k$ be
the event that a reward is obtained at time $k$.
a. Are events $A_k$ and $A_{k+1}$ independent?
b. Are events $A_k$ and $A_{k+2}$ independent?
c. <span style="color:blue">R: simulate 10 tosses 10000 times, where
$p = 0.7$. Check your
answers to a) and b) by counting the frequencies of the events $A_5$,
$A_6$, and $A_7$.</span>
```
<div class="fold">
```{solution, echo = togs}
a. For $A_k$ to happen, tosses $k-2$ and $k-1$ need to be tails and heads,
respectively. For $A_{k+1}$ to happen, tosses $k-1$ and $k$ need to be tails
and heads, respectively. As toss $k-1$ needs to be heads for one event and
tails for the other, the two events cannot happen simultaneously, so the
probability of their intersection is 0. But the probability of each of them
separately is $p(1-p) > 0$. Therefore, they are not independent.
b. For $A_k$ to happen, tosses $k-2$ and $k-1$ need to be tails and heads,
respectively. For $A_{k+2}$ to happen, tosses $k$ and $k+1$ need to be tails
and heads, respectively. These involve disjoint tosses, so the probability of
the intersection is $p^2(1-p)^2$, while the probability of each separately is
again $p(1-p)$. Therefore, they are independent.
```
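For reference, here are the theoretical values the simulation below should
approach (a quick computation we add here, using $p = 0.7$):
```{r, echo = togs, eval = togs}
p <- 0.7
p * (1 - p)      # P(A_k) = P(tail)P(head)
(p * (1 - p))^2  # P(A_k, A_{k+2}) under independence
```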
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 10000
p <- 0.7
rewardA_5 <- vector(mode = "logical", length = nsamps)
rewardA_6 <- vector(mode = "logical", length = nsamps)
rewardA_7 <- vector(mode = "logical", length = nsamps)
rewardA_56 <- vector(mode = "logical", length = nsamps)
rewardA_57 <- vector(mode = "logical", length = nsamps)
for (i in 1:nsamps) {
  # 0 codes heads (probability 0.7), 1 codes tails
  samps <- sample(c(0,1), size = 10, replace = TRUE, prob = c(0.7, 0.3))
  rewardA_5[i]  <- (samps[4] == 0 & samps[3] == 1)
  rewardA_6[i]  <- (samps[5] == 0 & samps[4] == 1)
  rewardA_7[i]  <- (samps[6] == 0 & samps[5] == 1)
  rewardA_56[i] <- (rewardA_5[i] & rewardA_6[i])
  rewardA_57[i] <- (rewardA_5[i] & rewardA_7[i])
}
sum(rewardA_5) / nsamps   # P(A_5), approx. 0.21
sum(rewardA_6) / nsamps   # P(A_6), approx. 0.21
sum(rewardA_7) / nsamps   # P(A_7), approx. 0.21
sum(rewardA_56) / nsamps  # P(A_5, A_6), exactly 0
sum(rewardA_57) / nsamps  # P(A_5, A_7), approx. 0.0441
```
</div>
```{exercise}
A drawer contains two coins. One is an unbiased coin, the other is a biased
coin, which will turn up heads with probability $p$ and tails with
probability $1-p$. One coin is selected uniformly at random.
a. The selected coin is tossed $n$ times. The coin turns up heads $k$ times and
tails $n-k$ times. What is the probability that the coin is biased?
b. The selected coin is tossed repeatedly until it turns up heads $k$ times.
Given that it is tossed $n$ times in total, what is the probability that the
coin is biased?
```
<div class="fold">
```{solution, echo = togs}
a. Let $B = 1$ denote that the coin is biased and let $H = k$ denote that
we've seen $k$ heads.
\begin{align}
P(B = 1 | H = k) &= \frac{P(H = k | B = 1)P(B = 1)}{P(H = k)} \\
&= \frac{P(H = k | B = 1)P(B = 1)}{P(H = k | B = 1)P(B = 1) + P(H = k | B = 0)P(B = 0)} \\
&= \frac{p^k(1-p)^{n-k} 0.5}{p^k(1-p)^{n-k} 0.5 + 0.5^{n+1}} \\
&= \frac{p^k(1-p)^{n-k}}{p^k(1-p)^{n-k} + 0.5^n}.
\end{align}
Strictly speaking, $P(H = k | B)$ also contains the binomial coefficient
$\binom{n}{k}$, but since it is the same for both coins, it cancels and is
omitted above.
b. The same result as in a). The only difference between the two scenarios is
that in b) the last toss must be heads, which replaces $\binom{n}{k}$ with
$\binom{n-1}{k-1}$. This factor again multiplies the likelihood of the biased
and the unbiased coin alike, so it cancels and does not affect the probability
of the coin being biased.
```
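The exercise asks only for derivations, but we can sketch a quick simulation
check of a); the values $p = 0.8$, $n = 10$ and $k = 7$ below are our own
illustrative choices, not given in the exercise:
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 100000
p <- 0.8; n <- 10; k <- 7  # illustrative values
biased <- sample(c(TRUE, FALSE), nsamps, replace = TRUE)           # coin choice
heads  <- rbinom(nsamps, size = n, prob = ifelse(biased, p, 0.5))  # number of heads
mean(biased[heads == k])  # simulated P(B = 1 | H = k)
p^k * (1 - p)^(n - k) / (p^k * (1 - p)^(n - k) + 0.5^n)  # analytic value
```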
</div>
```{exercise}
Judy goes around the company on Women's Day and hands out flowers. In every
office she leaves a flower if there is at least one woman inside. The
probability that there is a woman in an office is $\frac{3}{5}$.
a. What is the probability that Judy leaves her first flower in the fourth
office?
b. Given that she has given away exactly three flowers in the first four
offices, what is the probability that she gives her fourth flower in the
eighth office?
c. What is the probability that she leaves the second flower in the fifth
office?
d. What is the probability that she leaves the second flower in the fifth
office, given that she did not leave the second flower in the second office?
e. Judy needs a new supply of flowers immediately after the office where she
gives away her last flower. What is the probability that she visits at least
five offices, if she starts with two flowers?
f. <span style="color:blue">R: simulate Judy's walk 10000 times to
check your answers a) - e).</span>
```
<div class="fold">
```{solution, echo = togs}
Let $X_i = k$ denote the event that Judy leaves her $i$-th flower in the $k$-th office.
a. Since the events are independent, we can multiply their probabilities to get
\begin{equation}
P(X_1 = 4) = 0.4^3 \times 0.6 = 0.0384.
\end{equation}
b. Same as in a), since the offices are independent and the process starts fresh after the first four offices.
c. For this to be possible, she had to leave the first flower in one of the
first four offices. Therefore there are four possibilities, and for each
of those the probability is $0.4^3 \times 0.6$. Additionally, the probability
that she leaves a flower in the fifth office is $0.6$. So
\begin{equation}
P(X_2 = 5) = \binom{4}{1} \times 0.4^3 \times 0.6^2 = 0.09216.
\end{equation}
d. We use Bayes' theorem.
\begin{align}
P(X_2 = 5 | X_2 \neq 2) &= \frac{P(X_2 \neq 2 | X_2 = 5)P(X_2 = 5)}{P(X_2 \neq 2)} \\
&= \frac{0.09216}{0.64} \\
&= 0.144.
\end{align}
The denominator in the second equality can be calculated as follows. One of
three things has to happen for the second flower not to be given away in the
second office. First, neither of the first two offices has a woman: $0.4^2$.
Second, the first office has no woman and the second has one: $0.4 \times 0.6$.
Third, the first office has a woman and the second does not: $0.6 \times 0.4$.
Summing these values we get $0.64$.
e. We will look at the complement, i.e. the events that she gives away her
second flower in the second, third or fourth office.
\begin{equation}
P(X_2 \geq 5) = 1 - 0.6^2 - 2 \times 0.4 \times 0.6^2 - 3 \times 0.4^2 \times 0.6^2 = 0.1792.
\end{equation}
The multiplicative factors count the possible offices for the first flower.
```
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 100000
Judyswalks <- matrix(data = NA, nrow = nsamps, ncol = 8)
for (i in 1:nsamps) {
  thiswalk <- sample(c(0,1), size = 8, replace = TRUE, prob = c(0.4, 0.6))
  Judyswalks[i, ] <- thiswalk
}
csJudy <- t(apply(Judyswalks, 1, cumsum))  # cumulative flowers given by each office
# a: P(X_1 = 4), analytically 0.0384
sum(csJudy[ ,4] == 1 & csJudy[ ,3] == 0) / nsamps
# b: same as a) due to the fresh start, analytically 0.0384
csJsubset <- csJudy[csJudy[ ,4] == 3 & csJudy[ ,3] == 2, ]
sum(csJsubset[ ,8] == 4 & csJsubset[ ,7] == 3) / nrow(csJsubset)
# c: P(X_2 = 5), analytically 0.09216
sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / nsamps
# d: P(X_2 = 5 | X_2 != 2), analytically 0.144
sum(csJudy[ ,5] == 2 & csJudy[ ,4] == 1) / sum(csJudy[ ,2] != 2)
# e: P(X_2 >= 5), analytically 0.1792
sum(csJudy[ ,4] < 2) / nsamps
```
</div>
## Conditional independence
```{exercise}
Describe:
a. A real-world example of two events $A$ and $B$ that are dependent but
become conditionally independent if conditioned on a third event $C$.
b. A real-world example of two events $A$ and $B$ that are independent, but
become dependent if conditioned on some third event $C$.
```
<div class="fold">
```{solution, echo = togs}
a. Let $A$ be the event that a person is taller than average and let $B$ be
the event that the person speaks Dutch. These events are dependent, since
the Dutch are known to be taller than average. However, if $C$ is the
nationality of the person, then $A$ and $B$ are independent given $C$.
b. Let $A$ be the event that Mary passes the exam and let $B$ be the event
that John passes the exam. These events are independent. However, if the
event $C$ is that Mary and John studied together, then $A$ and $B$ are
conditionally dependent given $C$.
```
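To make example a) concrete, here is a small simulation sketch with made-up
numbers (heights in centimetres, all probabilities chosen arbitrarily):
marginally the two variables are correlated, but within each nationality
group the correlation vanishes.
```{r, echo = togs, eval = togs}
set.seed(1)
n      <- 10000
dutch  <- runif(n) < 0.2                                    # C: nationality
height <- rnorm(n, mean = ifelse(dutch, 183, 175), sd = 7)  # A depends on C
speaks <- as.numeric(runif(n) < ifelse(dutch, 0.95, 0.05))  # B depends on C
cor(height, speaks)                  # non-zero: marginally dependent
cor(height[dutch], speaks[dutch])    # approx. zero: independent given C = Dutch
cor(height[!dutch], speaks[!dutch])  # approx. zero: independent given C = other
```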
</div>
```{exercise}
We have two coins of identical appearance. We know that one is a fair coin
and the other flips heads 80% of the time. We choose one of the two
coins uniformly at random. We discard the coin that was not chosen. We now
flip the chosen coin independently 10 times, producing a sequence
$Y_1 = y_1$, $Y_2 = y_2$, ..., $Y_{10} = y_{10}$.
a. Intuitively, without doing any computation, are these random variables
independent?
b. Compute the probability $P(Y_1 = 1)$.
c. Compute the probabilities $P(Y_2 = 1 | Y_1 = 1)$ and
$P(Y_{10} = 1 | Y_1 = 1,...,Y_9 = 1)$.
d. Given your answers to b) and c), would you now change your answer to a)?
If so, discuss why your intuition had failed.
```
<div class="fold">
```{solution, echo = togs}
b. $P(Y_1 = 1) = 0.5 \times 0.8 + 0.5 \times 0.5 = 0.65$.
c. Since we know that $Y_1 = 1$, this should change our view of the
probability that the coin is biased. Let $B = 1$ denote the event that the
coin is biased and let $B = 0$ denote that the coin is unbiased. By the law
of total probability, we can write
\begin{align}
P(Y_2 = 1 | Y_1 = 1) &= P(Y_2 = 1, B = 1 | Y_1 = 1) + P(Y_2 = 1, B = 0 | Y_1 = 1) \\
&= \sum_{k=0}^1 P(Y_2 = 1 | B = k, Y_1 = 1)P(B = k | Y_1 = 1) \\
&= 0.8 \frac{P(Y_1 = 1 | B = 1)P(B = 1)}{P(Y_1 = 1)} +
0.5 \frac{P(Y_1 = 1 | B = 0)P(B = 0)}{P(Y_1 = 1)} \\
&= 0.8 \frac{0.8 \times 0.5}{0.65} + 0.5 \frac{0.5 \times 0.5}{0.65} \\
&\approx 0.68.
\end{align}
For the other probability we follow the same procedure. Let $X = 1$ denote
that the first nine tosses are all heads (equivalent to $Y_1 = 1,..., Y_9 = 1$).
\begin{align}
P(Y_{10} = 1 | X = 1) &= P(Y_{10} = 1, B = 1 | X = 1) + P(Y_{10} = 1, B = 0 | X = 1) \\
&= \sum_{k=0}^1 P(Y_{10} = 1 | B = k, X = 1)P(B = k | X = 1) \\
&= 0.8 \frac{P(X = 1 | B = 1)P(B = 1)}{P(X = 1)} +
0.5 \frac{P(X = 1 | B = 0)P(B = 0)}{P(X = 1)} \\
&= 0.8 \frac{0.8^9 \times 0.5}{0.5 \times 0.8^9 + 0.5 \times 0.5^9} + 0.5 \frac{0.5^9 \times 0.5}{0.5 \times 0.8^9 + 0.5 \times 0.5^9} \\
&\approx 0.8.
\end{align}
```
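Although the solution is analytic, a simulation sketch (our own addition) can
confirm the values in b) and c):
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 100000
biased <- sample(c(TRUE, FALSE), nsamps, replace = TRUE)  # choose a coin
flips  <- matrix(rbinom(nsamps * 10, size = 1,
                        prob = rep(ifelse(biased, 0.8, 0.5), each = 10)),
                 nrow = nsamps, byrow = TRUE)             # one row of 10 flips per sample
mean(flips[ ,1])                 # P(Y1 = 1), approx. 0.65
mean(flips[flips[ ,1] == 1, 2])  # P(Y2 = 1 | Y1 = 1), approx. 0.68
all_heads <- rowSums(flips[ ,1:9]) == 9
mean(flips[all_heads, 10])       # P(Y10 = 1 | Y1 = ... = Y9 = 1), approx. 0.8
```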
</div>
## Monty Hall problem
The Monty Hall problem is a famous probability puzzle with a non-intuitive
outcome. Many established mathematicians and statisticians had trouble
solving it, and some even rejected the correct solution until they saw a
proof by simulation. Here we show how it can be solved relatively simply
with Bayes' theorem if we choose the variables in a smart way.
```{exercise, name = "Monty Hall problem"}
A prize is placed at random behind one of three doors. You pick a door. Now
Monty Hall chooses one of the other two doors, opens it and shows you that it is
empty. He then gives you the opportunity to keep your door or switch to the
other unopened door. Should you stay or switch? Use Bayes' theorem to calculate the probability of winning if you switch and if you do not.
<span style="color:blue">R: Check your answers in R.</span>
```
<div class="fold">
```{solution, echo = togs}
W.L.O.G. assume we always pick the first door. The host can only open door 2
or door 3, as he cannot open the door we picked. Let $k \in \{2,3\}$.
Let us first look at what happens if we do not change. Then we have
\begin{align}
P(\text{car in 1} | \text{open $k$}) &= \frac{P(\text{open $k$} | \text{car in 1})P(\text{car in 1})}{P(\text{open $k$})} \\
&= \frac{P(\text{open $k$} | \text{car in 1})P(\text{car in 1})}{\sum_{n=1}^3 P(\text{open $k$} | \text{car in $n$})P(\text{car in $n$})}.
\end{align}
The probability that he opens $k$ when the car is behind door 1 is
$\frac{1}{2}$, as he can choose between doors 2 and 3, both of which have a
goat behind them. Let us look at the normalization constant. When $n = 1$ we
get the value in the numerator. When $n = k$ we get 0, as he will not open a
door with the prize behind it. The remaining option is that we selected door
1 and the car is behind the other unopened door, so he must open door $k$,
the only door left. Since he cannot open door 1 (our pick) nor the door with
the prize, the probability that he opens $k$ is 1, and the prior probability
of the car being behind that door is $\frac{1}{3}$. So we have
\begin{align}
P(\text{car in 1} | \text{open $k$}) &= \frac{\frac{1}{2}\frac{1}{3}}{\frac{1}{2}\frac{1}{3} + \frac{1}{3}} \\
&= \frac{1}{3}.
\end{align}
Now let us look at what happens if we do change. Let $k' \in \{2,3\}$ be the
door that is not opened. If we change, we select this door, so we have
\begin{align}
P(\text{car in $k'$} | \text{open $k$}) &= \frac{P(\text{open $k$} | \text{car in $k'$})P(\text{car in $k'$})}{P(\text{open $k$})} \\
&= \frac{P(\text{open $k$} | \text{car in $k'$})P(\text{car in $k'$})}{\sum_{n=1}^3 P(\text{open $k$} | \text{car in $n$})P(\text{car in $n$})}.
\end{align}
The denominator stays the same; the only thing that differs from before is
$P(\text{open $k$} | \text{car in $k'$})$. We have a situation where we
initially selected door 1 and the car is behind door $k'$. The probability
that the host opens door $k$ is then 1, as he cannot pick any other door. So
we have
\begin{align}
P(\text{car in $k'$} | \text{open $k$}) &= \frac{\frac{1}{3}}{\frac{1}{2}\frac{1}{3} + \frac{1}{3}} \\
&= \frac{2}{3}.
\end{align}
Therefore it makes sense to switch doors.
```
```{r, echo = togs, eval = togs}
set.seed(1)
nsamps <- 1000
ifchange <- vector(mode = "logical", length = nsamps)
ifstay <- vector(mode = "logical", length = nsamps)
for (i in 1:nsamps) {
  where_car    <- sample(c(1:3), 1)
  where_player <- sample(c(1:3), 1)
  # doors the host may open: neither the player's pick nor the car
  open_samp <- (1:3)[where_car != (1:3) & where_player != (1:3)]
  if (length(open_samp) == 1) {
    where_open <- open_samp  # sample(x, 1) on a length-1 x would sample from 1:x
  } else {
    where_open <- sample(open_samp, 1)
  }
  ifstay[i] <- where_car == where_player
  # the door we would switch to: neither opened nor our original pick
  where_ifchange <- (1:3)[where_open != (1:3) & where_player != (1:3)]
  ifchange[i] <- where_ifchange == where_car
}
sum(ifstay) / nsamps    # P(win | stay), approx. 1/3
sum(ifchange) / nsamps  # P(win | switch), approx. 2/3
```
</div>